nextcloud / server

☁️ Nextcloud server, a safe home for all your data
https://nextcloud.com
GNU Affero General Public License v3.0
27.13k stars 4.03k forks source link

Disallow illegal characters `/ ? < > \ : * | "` and so on in file and folder names #27891

Closed RuudschMaHinda closed 2 years ago

RuudschMaHinda commented 3 years ago

Nextcloud allows characters in file names and folder names, which are illegal in some operating systems, or all operating systems.

For some reason users in one of my installs use Asterisks, some use colons, I even found pipes and question marks, let alone quotation marks, and ampersands.

Nextcloud itself should not allow naming files or folders with those characters in them. Windows clients basically refuse to work if a folder name has a colon in them. Renaming said folder stops synching alltogether because windows can't handle it.

I suggest to not allow those special characters when creating files and folders within the webinterface. Files and folders with those characters in the sync folder should be renamed and said character removed before it gets distributed to other clients on different OSs.

To be honest I don't even know how some users manage to get some of these characters in there. It's not been the first time, and I just wrote a third email about this issue to this team. They are very learning resistant.

alx-tuilmenau commented 3 years ago

I find this difficult because there is a problem with forbidding characters when non-Windows clients synchronize and these characters are used in the file names. In this case, the sync client on Linux/other (where filenames can contain any character) must handle this, maybe running in conflicts, if 2 names are only differ in a special character. Wouldn't it be better if the Windows client does this, because Windows is the only system which has problems with this? When downloading / saving a file, the browser in Windows simply throws away all special characters from the file name in the content disposition header that are not allowed.

RuudschMaHinda commented 3 years ago

I don't know where one would start to address this problem. Fact is, in a heterogenous OS environment all the Windows users that are locked in to Windows because of certain vendors, get a messed up nextcloud-folder on their drive.

Try it out: Create a Folder name with a colon in the middle of it. "Project name : YYYY" put the year in for example. Then use the windows client. Then rename it from the web-interface, or even the original OS you created that on. The folder on the windows drive will be messed up, and the client throws errors.

This is making it unusable for the computer illiterate who apparently never visited Basics in Computer Use 101. We have to account for them somehow. No matter how many emails I write, they mess it up every few months and then start complaining that their nextcloud isn't working.

Basically because of this, I can't with all confidence say that the nextcloud client is safe to use on Windows. Maybe it is actually a "bug"? Because it makes the windows client bug out. Might open a bug report on the client team, if this is not a feature the server crew can address. But since those characters can't be used in Windows, I doubt they can do anything about it at all. Which is why I put it as a feature request.

alx-tuilmenau commented 3 years ago

I tried to reproduce with Windows Client 3.2.3: It's ignoring files and folders with a ':' or '*' in its name, it does not show or sync them. There is only a message, "files from ignore list and symblic links are not syncronized" (my ignore list is emtpy, maybe this is set internal somewhere). Renaming a folder in the WebUI from a "normal" name to a name with a colon removes the folder during the next sync on windows completely.

Getting a message on renaming in WebUI (like the message if the new name contains "/") , which can be deactived in the (personal) settings, may be a good solution. Renaming in the WebUI is not the problem that I wanted to address with my first comment, but that there are problems with clients on other platforms if these characters in filenames are no longer supported in the backend.

Now, the backend supports almost all characters in filenames. Of course, Windows can only use a part of it, but the Windows client will also never create names with forbidden characters. Other clients may use their own character set. If you limit the backend, every linux/android/macos/other client has to map its local names to limited character set, if some of the windows special chars in it. Of course, if you want to sync between different OS, you should only use safe characters.

My last note (maybe offtopic), there are a lot of other windows "features", you may also want to block on windows, like device names (nul, con, ...) or special folders with GUID names ( CP.{ED7BA470-8E54-465E-825C-99712043E01C} ). Or file names which differ only in uppercase / lowercase.

alexanderdd commented 3 years ago

related forum discussion: https://help.nextcloud.com/t/windows-illegal-filename-characters-cant-find-the-github-bug-report/117924/7

RuudschMaHinda commented 3 years ago

A new discovery: Folders with a Space as the last character do not show up in the Win-Sync-Client - Same customer.

solracsf commented 3 years ago

@mgallien @FlexW maybe you have some input here?

szaimen commented 3 years ago

I am not sure if this feature request is feasible.

cc @nextcloud/server-triage for more opinion on this.

RuudschMaHinda commented 3 years ago

concerning feasability: Have a pop-up in the web interface that lets the user know "You can not use / ? < > \ : * | " or have a leading or trailing space in folder and file names." and just not allow finishing the naming of the file or folder until the input criteria have been met.

skjnldsv commented 3 years ago

For the front yes, but coming from the dav backend, it's different

RuudschMaHinda commented 3 years ago

I still maintain that this is something that must be addressed. Until then, I created a workaround for myself:

Using Flow External Scripts I created the following rule:

When file is created, renamed, copied and User is not member of Temp-Ban-Group Run script rename 's/[\?\<\>\:\@\*\|\$]/_/g' /path/to/datafolder/%n && php /path/to/nextcloud/occ files:scan %o

This replaces any of the offending characters with an underscore, and then rescans the userfolder. This can take a while depending on the amount of files and folders. It also depends on the hardware. In my case I had to install rename with apt install rename.

This also seems to be working with group folders.

Edit: This does NOT work with folder names - only files! Flow currently can't be triggered by folders being created/uploaded/renamed.

ghost commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity and seems to be missing some essential information. It will be closed if no further activity occurs. Thank you for your contributions.

allexzander commented 3 years ago

It would be also nice to introduce a trimming mechanism such that all new uploads would get sanitized of trailing spaces. Currently, it is bringing problems to Windows users when syncing down files/folders created with trailing spaces via the Linux terminal and some of the command-line tools. @AndreasErgenzinger Could we maybe add this to the board, or, at least add one card for the trailing space trimming feature? @tobiasKaminsky @mgallien @FlexW

mgallien commented 3 years ago

@mgallien @FlexW maybe you have some input here?

we are going to have to do the file name sanitation on the desktop client side to ensure proper integration with End to End Encryption support from the server would be nice to have (like proper propagation of the same setting to all Files clients)

ghost commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity and seems to be missing some essential information. It will be closed if no further activity occurs. Thank you for your contributions.

RuudschMaHinda commented 2 years ago

Welp, the windows client now at least gives error messages about illegal characters. I also gathered some commands and installed "rename". Will write a script with these commands that triggers every night.

The commands I have gathered (also includes some legal characters):

find /path/to/data/ -name '*\?*' | rename -v -n 's/[\?\!\<\>\:\@\*\|]/_/' #replaces ? with _
find /path/to/data/ -name '*\:*' | rename -v -n 's/[\?\!\<\>\:\@\*\|]/_/' #replaces : with _
find /path/to/data/ -name '*\\*' | rename -v -n 's/[\?\!\<\>\:\@\*\|]/_/' #replaces \ with _ -- might be useless?
find /path/to/data/ -name '*\**' | rename -v -n 's/[\?\!\<\>\:\@\*\|]/_/' #replaces * with _
find /path/to/data/ -name '*\>*' | rename -v -n 's/[\?\!\<\>\:\@\*\|]/_/' #replaces > with _
find /path/to/data/ -name '*\<*' | rename -v -n 's/[\?\!\<\>\:\@\*\|]/_/' #replaces < with _
find /path/to/data/ -name '*\"*' | rename -v -n 's/[\?\!\<\>\:\@\*\|]/_/' #replaces " with _
find /path/to/data/ -name '*\|*' | rename -v -n 's/[\?\!\<\>\:\@\*\|]/_/' #replaces | with _
find /path/to/data/ -name '*\!*' | rename -v -n 's/[\?\!\<\>\:\@\*\|]/_/' #replaces ! with _
find /path/to/data/ -name '* ' | rename -v -n 's/ *$//' #deletes trailing spaces
find /path/to/data/ -name ' *' | rename -v -d -n 's/ *//' #deletes leading spaces
find /path/to/data/ -name '*.' -type f,d | rename -v -n 's/.$//' #deletes trailing periods
sudo -u www-data php7.4 /var/www/nextcloud/occ files:scan --all &&  sudo -u www-data php7.4 /var/www/nextcloud/occ files:scan-app-data

Observe the -n after the rename command .. this makes this a test run! Once you're happy with the shown result, remove the -n. Also the rename commands need to be cleaned up still for just the characters searched for? Some commands must be run several times until all - for example - trailing periods have been deleted. i.e. "foldername with open sentence..." needs the trailing periods command to be run three times. Haven't worked out the script yet.

labor4 commented 2 years ago

I wanna add another perspective (Macos Sync Client Side)

my opinion as a sysadmin/senior macos end user supporter (no DEV) the topic scares me. if there was no facility to offload this whole old story, I would rather enforce legalization client side, warn the user, and try to respect only characters that are derived from language/culture. my customers would accept the lecturing as long as their files are safely handled down the pipeline. Macos user are nowadays often forced to work with real samba (non-macos servers), so it is known.

I am sceptic against a full-auto-rename method however, because sometimes folders of files are compendiums of a project that are then unlinked without knowing. Edge case, but they multiply strongly.

One maybe not so bad solution could be, to:

That way a user could see whats going to happen, and sign off the action.

alexanderdd commented 2 years ago

@acsfer @skjnldsv @szaimen can you please reopen? Was closed by stale bot.

alexanderdd commented 1 year ago

Can someone please triage? Or does someone know who to ping so it can be triaged? What info is still needed (label needs info)?

This was closed by stale bot, imo it is still relevant and not solved.

hardviper commented 1 year ago

@acsfer @skjnldsv @szaimen Please reopen, issue is still relevant.

szaimen commented 1 year ago

Please open a new issue for this with up-to-date information. Thanks!

khoschi commented 1 year ago

Well, this is even worse, as you can trap linux shell escaping on the server side.

  1. webinterface foldername -> filesystem foldername
  2. default escape is nothing -> bla
  3. illegal chars result still in ticks -> 'bla*bla'
  4. use ticks in filename results in doublequotes -> "bla'bla"
  5. use ticks AND doublequotes to add backslash -> 'bla" '\'' "bla' (this folder in webinterface is bla doubletick tick doubletick space bla)
  6. So now go for full monty and use tick, doublequotes, space and backslash to have escaping fail -> error message.

Edit: github webinterface will break as well when trying to quote the resulting string, as # will be replaced. Try yourself.

So you have code to fix typography in place, but I suggest to block all special chars from all operating systems by default.

In scripting languages like bash or php every not-so-perfect input escaping while reading folders from nextcloud will end in desaster. And of course the clients will stop working for versions 2-4. And you break every other account by sharing these folders into them.

hardviper commented 1 year ago

@khoschi This issue is closed. I propose to go to a similar one created later, which is open.