supabase / storage

S3 compatible object storage service that stores metadata in Postgres
https://supabase.com/docs/guides/storage
Apache License 2.0
768 stars 108 forks

Special characters in filename causes uploads to fail #133

Open jet10000 opened 2 years ago

jet10000 commented 2 years ago

Bug report

Uploading a file named "望舌诊病.pdf" fails.

Describe the bug

[image]

thebengeu commented 2 years ago

Thanks for the bug report @jet10000! We'll take a look.

rahul3v commented 2 years ago

@thebengeu @jet10000 @alaister For now, Supabase only allows S3-safe characters in both objectName and bucketName, following the AWS guidelines here:

 // only allow s3 safe characters and characters which require special handling for now
 // https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-keys.html

alaister commented 2 years ago

S3 supports UTF-8 characters in filenames. However, at the moment, we are very strict with which filenames we allow. I think this is a valid use case to add support for different languages in filenames.

One option is to update the isValidKey function https://github.com/supabase/storage-api/blob/9480891af024396c58045578d16c91778aae67d2/src/utils/index.ts#L76-L80 to allow everything aside from the characters outlined in https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-keys.html#object-key-guidelines-avoid-characters
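A minimal sketch of what such a deny-list check could look like. The function name `isValidKeySketch` is hypothetical, and the rejected character set is only my reading of the "characters to avoid" section of the linked AWS page, so it should be verified against that page. It is not the actual storage-api implementation.

```typescript
// Hypothetical deny-list variant of isValidKey (not the real storage-api code).
// Assumption: the characters below are the ones the AWS object key guidelines
// recommend avoiding; verify against the linked page before relying on this.
const AVOID_CHARS = /[\\{}^%`\[\]"<>~#|]/;

// ASCII control characters (0x00-0x1F and 0x7F) are treated as non-printable.
const CONTROL_CHARS = /[\u0000-\u001f\u007f]/;

function isValidKeySketch(key: string): boolean {
  // Accept any non-empty key that contains none of the rejected characters.
  return key.length > 0 && !AVOID_CHARS.test(key) && !CONTROL_CHARS.test(key);
}

console.log(isValidKeySketch("望舌诊病.pdf")); // non-ASCIIa name is accepted
console.log(isValidKeySketch("a`b.txt"));      // backtick is rejected
```

Under this sketch, names in any script pass, while a backtick or a literal % would still be rejected because both are on the avoid list.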

PiddePannkauga commented 2 years ago

I would love support for å,ä,ö.

Love from Sweden

Xuanwo commented 1 year ago

Hi, OpenDAL ran into a similar problem in https://github.com/apache/incubator-opendal/pull/2190. We have a test case (which passes on most storage platforms, from s3, gcs, and azblob to hdfs) like the following:

let path = format!("{} !@#$%^&()_+-=;',.txt", uuid::Uuid::new_v4());

Does this case make sense to you? I'm willing to help fix this. I believe that all URL-unsafe characters should be percent-encoded, and the server-side should handle the job of decoding them.
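The percent-encoding idea above can be sketched as follows. Both function names are hypothetical, and whether the Supabase server decodes paths this way is an assumption, not something this thread confirms. The sketch only shows that a round trip through `encodeURIComponent` and `decodeURIComponent` preserves the original name.

```typescript
// Client side (hypothetical helper): percent-encode each path segment,
// keeping "/" as the folder separator.
function encodeObjectPath(path: string): string {
  return path
    .split("/")
    .map((segment) => encodeURIComponent(segment))
    .join("/");
}

// Server side (hypothetical helper): decode each segment back to the
// original UTF-8 name.
function decodeObjectPath(encoded: string): string {
  return encoded
    .split("/")
    .map((segment) => decodeURIComponent(segment))
    .join("/");
}

const original = "docs/x !@#$%^&()_+-=;',.txt";
const wire = encodeObjectPath(original);
console.log(wire);
console.log(decodeObjectPath(wire) === original); // round trip preserves the name
```

Encoding per segment rather than the whole path keeps the folder structure intact, since `encodeURIComponent` would otherwise turn "/" into "%2F".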

javierfern03 commented 8 months ago

I would love support for ´. I can't use words with accents in my images, and accents are very common in Spanish.

turulix commented 8 months ago

Also ÖÄÜ in German.

orlein commented 8 months ago

I can understand why the isValidKey function looks the way it does: it would be safe if the system accepted only alphabetic characters. But as a Korean, I have no good way to safely convert Korean characters (hangul) to the alphabet. The same goes for Japanese and Chinese characters. Is there a specific reason for the function's regex? It would be great if the function accepted characters encoded by the encodeURIComponent function.

li4man0v commented 7 months ago

macOS generates screenshot names that don't match the pattern in isValidKey. For example, "Screenshot 2024-01-24 at 12.25.39 AM".

jingsam commented 5 months ago

Oops! This issue seems easy to fix but has lasted for 2 years. Unbelievable!

ThaddeusJiang commented 5 months ago

hi everyone

My solution is to base64-encode the file name when uploading. Demo: https://github.com/ThaddeusJiang/supabase-helpers/blob/main/backup_storage_buckets.ts#L71-L73
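A minimal sketch of this workaround, not taken from the linked demo. The helper names are hypothetical. Two details are worth handling: `btoa` throws on characters outside Latin-1, so the name is converted to UTF-8 bytes first, and standard base64 can contain "/", which would read as a folder separator in an object key, so the URL-safe alphabet is used instead.

```typescript
// Hypothetical helper: encode a file name as URL-safe base64 so the
// resulting storage key is plain ASCII.
function encodeFileName(name: string): string {
  const bytes = new TextEncoder().encode(name); // UTF-8 bytes of the name
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  // Swap "+" and "/" for "-" and "_", and drop the "=" padding.
  return btoa(binary).replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

// Hypothetical helper: recover the original file name from the key.
function decodeFileName(key: string): string {
  let b64 = key.replace(/-/g, "+").replace(/_/g, "/");
  while (b64.length % 4 !== 0) {
    b64 += "="; // restore the padding removed by encodeFileName
  }
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return new TextDecoder().decode(bytes);
}

const key = encodeFileName("望舌诊病.pdf");
console.log(key);
console.log(decodeFileName(key)); // recovers the original name
```

The encoded key is not human-readable, so the original name still has to be recovered with `decodeFileName` or stored elsewhere when the file is listed or downloaded.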

gitnik commented 4 months ago

This is a pretty big oversight. At the very least, it would be nice if this were documented somewhere.

stefan-girlich commented 3 months ago

You can upload files using any name (e.g. base64, thx @ThaddeusJiang ), store the original file name in a different table, and then use it when generating the public link:

const {
  data: { publicUrl },
} = supabase.storage
  .from('my_bucket')
  .getPublicUrl(internalFileName, { download: originalFileName })

Source (for URL access): https://supabase.com/docs/guides/storage/serving/downloads#downloading

Unfortunately, the supabase-js docs do not mention this yet: https://supabase.com/docs/reference/javascript/storage-from-getpublicurl

jingsam commented 2 months ago

As a non-English-speaking Supabase developer, it is painful that I cannot upload assets with anything other than alphabetic file names. I hope the Supabase team prioritizes this issue, as it has not been fixed after two years and so many people have complained about it.

jingsam commented 2 months ago

This blog post says that Supabase Storage is now S3 compatible. I think supporting only a subset of S3's valid characters for object names is not fully compatible with S3.

logemann commented 2 months ago

Came here to find that we can't store UTF-8 in file names... crazy. Not even ISO-8859-1, which would at least cover German umlauts. The only hope, I guess, is the new metadata feature, which should arrive soon in the JS lib. Proposals like creating a DB table for storing file names are a non-starter because keeping storage and DB tables in sync is virtually impossible.

leonlazic commented 1 month ago

I would like to use special characters in names as well. I'm trying to name images by the patient's name, and here in the Balkans it's common to have čšžć in the first or last name.

Seems odd to me that this hasn't already been addressed.

sajadmh commented 1 month ago

Another case: A user tried to upload a file with a backtick and it threw an InvalidKey error.

cmantsch commented 1 month ago

Is there any news on this? For me it even fails on characters like %, which is not even Unicode.

renjiali commented 1 month ago

@thebengeu @jet10000 @alaister Currently, for objectName and bucketName, Supabase only allows S3-safe characters, following the AWS guidelines here:

 // only allow s3 safe characters and characters which require special handling for now
 // https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-keys.html

Supabase really should study the products from China. This is a product built on Supabase that has solved the file name problem. How was Supabase unable to solve it? Then ask and copy it! The address is https://memfiredb.com/

TechQuery commented 1 month ago

@renjiali It's good for you to fix this problem, but please send a pull request to the upstream project instead of spamming in this issue.

orlein commented 1 month ago

I posted my earlier comment to find out whether changing the isValidKey function to accept encodeURIComponent output would cause any conflicts. I'm pretty sure there is some reason for not accepting it, because the Supabase team seems to be quite reasonable.

And @renjiali, please DO NOT SPAM. Whenever a comment is added to this issue, the watchers get an email for each one.

renjiali commented 3 weeks ago

I can understand why the isValidKey function looks the way it does: it would be safe if the system accepted only alphabetic characters. But as a Korean, I have no good way to safely convert Korean characters (hangul) to the alphabet. The same goes for Japanese and Chinese characters. Is there a specific reason for the function's regex? It would be great if the function accepted characters encoded by the encodeURIComponent function.

I posted the comment to find out whether changing the isValidKey function to accept encodeURIComponent output would cause any conflicts. I'm pretty sure there is some reason for not accepting it, because the Supabase team seems to be quite reasonable.

And @renjiali, please do not spam. Whenever a comment is added to this issue, the watchers get an email for each one.

Just because you think it's spam doesn't mean anything. Just like you head Chinese culture, such as Tai Chi Dragon Boat Festival and so on. You don't have any facts. I, on the other hand, believe that South Korea has no sovereignty, just as it is stationed by the US father

encima commented 2 weeks ago

@renjiali This is a space for helping and working with users of Supabase; it is not a place for politics or insulting users. Your comment has been hidden and marked as abuse, and if this is reported again, you will be blocked from posting to this community. Please be considerate.

efd1 commented 3 hours ago

hi everyone

My solution is to base64-encode the file name when uploading. Demo: https://github.com/ThaddeusJiang/supabase-helpers/blob/main/backup_storage_buckets.ts#L71-L73

Thank you! Your link is broken, but I used btoa() and it's working well!