tus / tusd

Reference server implementation in Go of tus: the open protocol for resumable file uploads
https://tus.github.io/tusd
MIT License

Cleaning up .info files #1076

Open pcfreak30 opened 5 months ago

pcfreak30 commented 5 months ago

Question: In the S3 store, I can see that Terminate does a cleanup, but it also assumes that all multipart files exist and may return an error if they are not found.

I have already created a solution, but I would like to know why, in the S3 store, or maybe in general, .info isn't deleted after an upload is completed. In my case, a cron job was unacceptable, and it seems like the .info is transient, not permanent.

Acconut commented 5 months ago

The info files include additional information (such as the metadata, or whether the upload is a partial upload used only for concatenation), which might be necessary for post-processing the uploaded files. Thus, tusd does not remove them automatically, neither for s3store nor for any other storage.

If you want to remove them, I would recommend either using the post-finish hook to do so, or using S3 lifecycle rules to clean up .info files after a fixed period (e.g. 1 or 2 days).
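For the post-finish hook route, the hook first needs to know which object to delete. The following is a minimal sketch of deriving that key, not tusd's own code. It assumes that s3store upload IDs have the form `objectID+multipartID`, that the info object is stored as `<objectID>.info` under the configured object prefix, and that a "/" is added to a prefix that lacks one. `infoObjectKey` is a hypothetical helper, and all three assumptions should be checked against the s3store version in use.

```go
package main

import "strings"

// infoObjectKey returns the S3 key under which the .info object for a
// tusd upload is expected to live.
//
// Assumptions (verify against your s3store version):
//   - the upload ID has the form "<objectID>+<multipartID>"
//   - the info object is stored at "<prefix>/<objectID>.info"
func infoObjectKey(prefix, uploadID string) string {
	// Keep only the part before the first "+"; if there is no "+",
	// strings.Cut returns the whole ID unchanged.
	objectID, _, _ := strings.Cut(uploadID, "+")

	// Assumed convention: a non-empty prefix is joined with "/".
	if prefix != "" && !strings.HasSuffix(prefix, "/") {
		prefix += "/"
	}
	return prefix + objectID + ".info"
}
```

A hook would pass the upload ID it receives to this helper and hand the resulting key to whatever S3 client it uses for the delete call.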

Once tusd has support for the expiration extension, it should have better functionality to control when uploads are cleaned up. But this has not been implemented yet.

pcfreak30 commented 5 months ago

I already am? See https://git.lumeweb.com/LumeWeb/portal/src/commit/e034e1d54ed4f31290c761a4c393ff65912fa93a/storage/storage.go#L376, which uses the CompleteUploads hook. However, there is no way to ask the store to remove the .info file other than Terminate, which will likely return an error if it can't delete the multipart pieces.

And I cannot rely on any s3 provider rules; the daemon needs to handle it.

ATM, I do it by keeping my own s3Client instance, but that's a hack.

Acconut commented 5 months ago

Once an upload is finished, tusd provides you with the uploaded file as well as additional information in the info file. It's up to the user to decide how these files should be processed further and tusd doesn't assume how or if that's even done.

Currently, using the hooks/channels plus your own S3 client is the best option to manage those files, yes.

Once we start working on adding proper support for the expiration extension, we will likely have functionality for managing the lifecycle of uploads.