Open touta opened 8 years ago
Unfortunately there is no solution without breaking backwards compatibility, but the current behavior is bad enough to do it anyway.
We have to subclass and work around cgi.FieldStorage (see #852), and can introduce better Unicode file name handling while doing so.
I'm trying to update my product's bottle.py from 0.11 to 0.12, and found that the objects in `Request.files` changed from `cgi.FieldStorage` to the new class `FileUpload`. They are not compatible, and they differ in `filename`:

- `cgi.FieldStorage` returns the client-side file name.
- `FileUpload` returns a safe file name for saving, with non-ASCII characters and path separators cut off.

These changes cause serious problems in multi-byte cultures. If a user uploads a file with a non-ASCII name, `FileUpload.filename` gives us only the extension. For example:

    あいうえお.txt
      v cut off non-ASCII
    .txt
      v strip '.'
    txt
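To make the two steps above easy to reproduce, here is a minimal sketch of that kind of sanitizing. `sanitize_filename` is a hypothetical helper written for this issue. It only approximates the behavior described above and is not a copy of bottle's code.

```python
import os
import re
from unicodedata import normalize


def sanitize_filename(raw_name):
    """Hypothetical helper: approximates the 'safe file name' behavior."""
    # Step 1: drop every character that has no ASCII representation.
    name = normalize('NFKD', raw_name).encode('ascii', 'ignore').decode('ascii')
    # Step 2: drop any directory part ('\\' is treated like '/').
    name = os.path.basename(name.replace('\\', '/'))
    # Step 3: remove other unusual characters, then collapse whitespace and dashes.
    name = re.sub(r'[^a-zA-Z0-9_.\s-]', '', name).strip()
    name = re.sub(r'[-\s]+', '-', name)
    # Step 4: strip leading and trailing dots and dashes.
    return name.strip('.-') or 'empty'


print(sanitize_filename('report.txt'))       # report.txt
print(sanitize_filename('あいうえお.txt'))   # txt
```

With this sketch, every file whose base name is purely Japanese collapses to the same result, which is why the safe name is of little use for telling uploads apart.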
We can also use `raw_filename` instead, but then `filename` for non-ASCII names becomes meaningless and less safe for name uniqueness.

For problem 1: Renaming `raw_filename`/`filename` to `filename`/`safe_filename` is the best solution, I think. Most non-expert users use the `save` method, or would use the 'safe_'-prefixed attribute for saving, so the security impact of this change would be minimal.

FYI, in my product, files are managed by sequence number. Filenames are stored in the DB so that users can identify them, so there is no need to make the filename safe.
For problem 2: Applying percent-encoding or some other escaping instead of cutting characters off is a second-best option if the current approach is kept. This may also solve part of problem 1, since an escaped string is enough of a hint to use `raw_filename`.

Anyway, cutting off is a bad idea, and I hope this gets fixed. Naming a file with only Japanese characters is perfectly normal for ordinary people in Japan. It might be the same in Chinese or other non-ASCII-language countries.
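The percent-escaping idea for problem 2 could look roughly like the sketch below. `escape_filename` is a hypothetical name and not part of bottle. It assumes UTF-8 percent-encoding via the standard library's `urllib.parse.quote`, and it does not try to cover every file system restriction (length limits, reserved names, and so on).

```python
from urllib.parse import quote, unquote


def escape_filename(raw_name):
    """Hypothetical helper: escape instead of cutting characters off."""
    # safe='' makes quote() escape '/' as well, so no path separator survives.
    # ASCII letters, digits and '_.-~' are left untouched.
    escaped = quote(raw_name, safe='')
    # quote() never escapes '.', so names made only of dots ('.', '..')
    # are escaped by hand. An empty name gets a placeholder.
    if escaped.strip('.') == '':
        escaped = escaped.replace('.', '%2E') or 'empty'
    return escaped


name = escape_filename('あいうえお.txt')
print(name)  # %E3%81%82%E3%81%84%E3%81%86%E3%81%88%E3%81%8A.txt
# The escaping is reversible, so the client-side name is not lost.
assert unquote(name) == 'あいうえお.txt'
```

Unlike cutting off, distinct client-side names stay distinct after escaping, and the visible `%` sequences hint that the original name is available elsewhere.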
thanks
seems related: https://github.com/bottlepy/bottle/issues/582