tus / tusd

Reference server implementation in Go of tus: the open protocol for resumable file uploads
https://tus.github.io/tusd
MIT License
3.05k stars 475 forks source link

Allow using '/' when setting ChangeFileInfo.ID to enable "hierarchival" views in Azure Blob Storage #1021

Closed quality-leftovers closed 8 months ago

quality-leftovers commented 11 months ago

Is your feature request related to a problem? Please describe. Some storages like Azure support a "hierarchical" view based on a seperation character. The default seperation character for azure is '/'

This currently does not work.

It seems the server does not accept the route.

Possible solution

Describe the solution you'd like Allow using / in ChangeFileInfo.ID. Behavior will be storage dependent.

Describe alternatives you've considered Using a different seperation character. This is possible but most tools do not support changing the default character.

Can you provide help with implementing this feature? Yes.

Additional context Add any other context or screenshots about the feature request here.

Acconut commented 11 months ago

Thanks for the report. Interestingly, the same question has been asked just two days ago in the community forum: https://community.transloadit.com/t/tusd-azure-changefileinfo-id-with-folder-support/16871 I started working a bit on it, but the changes can have a significant impact and cause regressions: https://github.com/tus/tusd/pull/1020 So this needs to be tested extensively.

quality-leftovers commented 11 months ago

Thanks for the link. Didn't know about that community.

I wonder whether supporting the path in the URL or just encoding it (base64 / percent) makes more sense. Unless a client inspects the location there should be less risk of regressions.

Great to hear that you are already working on it 👍

Acconut commented 11 months ago

Base64-encoding the upload ID would be possible, but I assume that this breaks a few integrations, which depend on the upload ID being present in the upload URL in clear text. We could put this behind an optional flag, but then users first have to find this non-intuitive flag.

My preference would be on fully supporting / in upload IDs, if possible and feasible.

quality-leftovers commented 8 months ago

One thing I'd like to add is that encoding the id would allow the generation of any id without restrictions (a store could use # or ? in the id).

However you are right that base64 encoding could easily break existing setups. Maybe HTML URL Encoding (abc/38484849 => abc%2F38484849) special characters is an option worth considering. Hopefully this would not break any existing deployments.

Acconut commented 8 months ago

URL encoding is also a possibility, but one has to be careful because proxies will sometimes decode the URL before sending the request to the backend. For example, nginx does this if you don't pay close attention during configuration: https://serverfault.com/questions/459369/disabling-url-decoding-in-nginx-proxy In these cases, the proxy would already decode the URL and then nothing is gained for tusd. If you configure your proxy correctly, this won't happen, but I can imagine that this is overlooked by some users.

One option that I am leaning more towards right now is the opposite: Instead of allowing more characters in the upload ID, we could restrict the upload ID to URL-safe chars (i.e. no ? etc). This ensures that we are always able to extract an upload ID from the path. People usually want those characters in the upload ID because currently the upload ID will also be the filename in the destination storage. To achieve this, we could decouple the upload ID from the filename, so that you can fully customize the filename during the pre-create hook. In the end, people could use / in the filenames (and thus create directories in S3 etc), while we also use upload URL with a predictable format.

quality-leftovers commented 8 months ago

Interesting, thanks for the explanation. Never thought about that this could lead to issues with reverse proxies.

Having the upload id decoupled from the path sounds good. I'd definitely prefer being able to customize the upload filename in hooks.

Do you think about just adding this to the PreCreate hook and the store will "manage it" or would this be an additional hooks for it (e.g. a DecodePathFromId so the .info file can be co-located to the data), ...?

quality-leftovers commented 8 months ago

Noticed some related issues/posts I'd like to link here for completeness/context for anyone reading this issue:

Acconut commented 8 months ago

Do you think about just adding this to the PreCreate hook and the store will "manage it"

I envision this route for now, where the pre-create hook can supply an upload ID and a destination path to place the file in the storage. The storage implementation should than use these values.

DecodePathFromId

The have been requests to allow more customization of the URL format (https://github.com/tus/tusd/issues/340) and we are open for that as well. However, I think that this is not directly related to the feature discussed in this issue.

tooltakes commented 8 months ago

I also need to set a key like /{{.Now.Format "2016/01"}}/{{.Key}}{{.FileExtOrEmpty}} for s3 . One solution is to add a -s3-object-key-template flag which supports text/template. In that case, I don't even need a hook.

Acconut commented 8 months ago

One solution is to add a -s3-object-key-template flag which supports text/template. In that case, I don't even need a hook.

That's also an interesting idea. I could imagine that tusd supports both methods of customizing the URLs in the future.

Acconut commented 8 months ago

A PR has just landed which adds support for forward slashes in URL without the need of encoding the upload ID. Feel free to try it out and let me know if it works.