owncloud / ocis

:atom_symbol: ownCloud Infinite Scale Stack
https://doc.owncloud.com/ocis/next/
Apache License 2.0

Summarize Big File Upload for clients with TUS #2627

Open dragotin opened 3 years ago

dragotin commented 3 years ago

The purpose of this ticket is to summarize the way all clients can and should use TUS to upload big files to oCIS for the MVP/GA version of oCIS. This is a cross-component topic between oCIS and all clients.

Please @pmaier1 @michaelstingl @felix-schwarz @TheOneRing @kulmann review the following text, which will go into the spec documentation later, and help to complete it.

We also need to clarify which functionality is still missing on the clients.


Big File Upload Protocols

Over time, different protocols have been used to upload big files to EFSS systems.

Chunking V1

Chunking V1 was introduced in a very early version of ownCloud and is still used with some older clients.

It splits the file into chunks of a defined size and uploads each chunk in a separate request. Once all chunks are uploaded, the server assembles them into the final file.
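To illustrate the split-and-assemble idea only, here is a minimal sketch. It does not reproduce the actual Chunking V1 wire format; the function names `split_into_chunks` and `assemble` are made up for this example.

```python
def split_into_chunks(data: bytes, chunk_size: int) -> list[bytes]:
    """Cut the payload into pieces of at most chunk_size bytes (client side)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def assemble(chunks: list[bytes]) -> bytes:
    """Join the chunks in order to rebuild the file (server side)."""
    return b"".join(chunks)


if __name__ == "__main__":
    payload = bytes(range(10))
    parts = split_into_chunks(payload, 4)
    print([len(p) for p in parts])
    print(assemble(parts) == payload)
```

Only the last chunk may be smaller than the defined size; all others have exactly that size.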

Chunking NG

Chunking NG is a further development of Chunking V1 and gives the client more control over the process. It uploads the chunks into a directory on the server that is dedicated to that upload. After all chunks are uploaded, the client initiates a move of the directory to the final destination, which triggers the file assembly.

By sending PROPFIND requests to the upload directory, the client can always find out the state of a given upload.

TUS

TUS is an open protocol for resumable file uploads. It is a standard protocol defined at [tus.io](https://tus.io). The code is hosted on GitHub.

Reva Big File Support

CERNBox Implementation with EOS Backend

For CERNBox there is no TUS at the moment, as EOS works with Chunking V1 directly.

WebDAV Handling of TUS Posts

In Reva, all file uploads are started as TUS requests. For that, a POST request is sent with the corresponding TUS headers.
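As a rough sketch of what the headers of such a creation POST could look like, the following builds them according to the TUS 1.0.0 core protocol and creation extension. The metadata key `filename` and the helper names are assumptions for illustration; which metadata keys oCIS actually expects is not specified here.

```python
import base64

TUS_VERSION = "1.0.0"


def encode_upload_metadata(metadata: dict[str, str]) -> str:
    """Encode key/value pairs as 'key base64(value)' joined by commas,
    the format the TUS creation extension uses for Upload-Metadata."""
    pairs = []
    for key, value in metadata.items():
        encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
        pairs.append(f"{key} {encoded}")
    return ",".join(pairs)


def creation_headers(file_size: int, filename: str) -> dict[str, str]:
    """Headers for the initial POST that creates the upload resource."""
    return {
        "Tus-Resumable": TUS_VERSION,
        "Upload-Length": str(file_size),
        # "filename" is a hypothetical metadata key chosen for this sketch.
        "Upload-Metadata": encode_upload_metadata({"filename": filename}),
    }


if __name__ == "__main__":
    for name, value in creation_headers(1024, "report.pdf").items():
        print(f"{name}: {value}")
```

On success, a TUS server answers the POST with a `Location` header that points to the upload resource used by the following requests.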

By default, the maximum size of an upload is unlimited. This is subject to change, see https://github.com/owncloud/ocis/pull/2584

To configure the maximum size of the chunks that oCIS accepts, set the environment variable STORAGE_FRONTEND_UPLOAD_MAX_CHUNK_SIZE to the desired value in bytes.

With that parameter set to a specific value, the communication between server and client automatically limits the size of the uploaded parts. A file that is bigger than the limit is transferred in multiple requests using the TUS protocol: a flexible sequence of PATCH requests uploads the parts of the big file. All TUS-related processing information is transferred in HTTP headers.
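The PATCH sequence can be sketched as follows, assuming the client already knows the maximum chunk size. The names `PatchRequest` and `plan_patch_requests` are hypothetical; the headers are the ones defined by the TUS 1.0.0 core protocol. The sketch only plans the requests and sends nothing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatchRequest:
    offset: int   # byte position in the file where this part starts
    length: int   # number of bytes carried by this request
    headers: dict


def plan_patch_requests(file_size: int, max_chunk_size: int,
                        start_offset: int = 0) -> list[PatchRequest]:
    """Plan the PATCH requests needed to upload bytes start_offset..file_size."""
    if max_chunk_size <= 0:
        raise ValueError("max_chunk_size must be positive")
    if not 0 <= start_offset <= file_size:
        raise ValueError("start_offset must be within the file")
    requests = []
    offset = start_offset
    while offset < file_size:
        length = min(max_chunk_size, file_size - offset)
        requests.append(PatchRequest(offset, length, {
            "Tus-Resumable": "1.0.0",
            "Upload-Offset": str(offset),
            "Content-Type": "application/offset+octet-stream",
            "Content-Length": str(length),
        }))
        offset += length
    return requests


if __name__ == "__main__":
    # A 25-byte file with a 10-byte limit needs three PATCH requests.
    for req in plan_patch_requests(25, 10):
        print(req.offset, req.length)
```

To resume an interrupted upload, a client would first send a HEAD request to the upload resource, read the `Upload-Offset` header from the response, and pass that value as `start_offset`.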

For more details on the TUS protocol please refer to the TUS core protocol.

With the last PATCH request, the server indicates that all data was received and returns a list of headers that the clients can use to optimize the synchronization process.

In particular, in the ownCloud ocdav implementation in Reva, these are

Client Implementation Considerations

These are the client considerations for the oCIS MVP that we want to deliver for GA. The guiding principle is to provide a robust, defensive way to upload big files, which means that it works by default, without extra configuration, in most of the infrastructures we find at users' sites. The number of requests needed should be as small as possible for performance reasons.

Bug reports in that area that are not yet considered here:

https://github.com/owncloud/ocis/issues/214 https://github.com/owncloud/ocis/issues/2626

kulmann commented 3 years ago

Web currently uses the tus-js-client lib in version 1.8.0 for TUS uploads. TUS uploads work with oCIS so far. There were some breaking changes in 2.x (current version: 2.3.0) so we need to schedule some time for updating it.

creation-with-upload is supported server side (already sending data in the initial POST, if I'm not mistaken) but with version 1.8.0 of the js client lib it's not supported client side (see https://github.com/owncloud/web/pull/3436#issuecomment-625125985). If we want to utilize that we need to schedule some time to upgrade to 2.3.0 in Web. That is tracked in https://github.com/owncloud/web/issues/5371 - we also wanted to refactor the upload business logic in web. Debatable (but a good idea, because it's a mess).

IMO it would also be worth looking into js libs that handle uploads as a whole. @butonic brought up https://uppy.io and I still think that it's a good idea to try that out. That part is unrelated to TUS.

TheOneRing commented 3 years ago

Looks good.

pmaier1 commented 3 years ago

Thanks a lot! Looks good :+1:

One thing: Not sure whether this is really relevant here but I found the checksum feature (file integrity checking) missing.

wkloucek commented 3 years ago

@dragotin could you please have a look at https://github.com/owncloud/ocis/issues/1343 and see how it fits in here?

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 10 days if no further activity occurs. Thank you for your contributions.

michaelstingl commented 2 years ago

> @dragotin could you please have a look at #1343 and see how it fits in here?

I'd say it's unrelated?

butonic commented 2 years ago

@dragotin how is this actionable? Move your description to devdocs and close?