Closed felix-schwarz closed 1 year ago
creation-with-upload extension
We can implement this on several endpoints:
To directly start an upload at the dataprovider, clients would have to send the POST to the datagateway at /data, because the datagateway needs to look up which dataprovider is actually responsible for the upload.
To implement the creation extension we already do the necessary stat and InitiateFileUpload requests in ocdav, and it should be possible to support the creation-with-upload extension as well.
The js tus client tries to send data with the initial request and falls back to plain creation if the server does not support it. So we should implement it asap. This is an easy win and halves the number of requests.
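To make the single-request idea concrete, here is a minimal sketch of what a client could put into the initial POST. The header names come from the tus 1.0.0 protocol; the function names, the `filename` metadata key and the first-chunk size are illustrative assumptions, not part of any existing client or of reva.

```python
import base64

TUS_VERSION = "1.0.0"


def encode_metadata(metadata):
    """Encode key/value pairs as an Upload-Metadata header value.

    tus expects comma-separated pairs of "key base64(value)".
    """
    pairs = []
    for key, value in metadata.items():
        encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
        pairs.append(f"{key} {encoded}")
    return ",".join(pairs)


def creation_with_upload_request(data, first_chunk_size, metadata):
    """Return (headers, body) for a POST that creates the upload
    and already carries the first bytes of the file.

    Hypothetical helper: a real client would hand these to its HTTP stack.
    """
    body = data[:first_chunk_size]
    headers = {
        "Tus-Resumable": TUS_VERSION,
        "Upload-Length": str(len(data)),
        "Upload-Metadata": encode_metadata(metadata),
        # Required whenever a tus request carries upload bytes.
        "Content-Type": "application/offset+octet-stream",
        "Content-Length": str(len(body)),
    }
    return headers, body


headers, body = creation_with_upload_request(
    b"hello world", 5, {"filename": "hello.txt"}
)
```

A server that supports the extension would answer with 201 Created, a Location header and the Upload-Offset it actually stored, so a small file can finish in one round trip.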
expiration extension
Should be implemented, but we need to agree on a default expiry ... 24h?
checksum extension
The checksum extension only really makes sense when using multiple PATCH requests, or when implementing and using the concatenation extension as well, because a failed PATCH request MUST discard the received bytes for the request. One option is to implement the Tus-Min/Max-Chunk-Size headers ("Add Tus-Min/Max-Chunk-Size headers"), which are not yet official but absolutely make sense, because they would allow the admin to configure a sane chunk size for their deployment. Clients could then send the chunks using PATCH requests with a checksum per request. To consume all of the bandwidth, the concatenation extension can be used to allow uploading the chunks in parallel.
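As a rough sketch of chunked PATCH requests with a checksum per request: the `Upload-Checksum` format (algorithm name, space, Base64 digest) is from the tus checksum extension. The min/max values stand in for the not-yet-official Tus-Min/Max-Chunk-Size headers, and all function names here are made up for illustration.

```python
import base64
import hashlib


def clamp_chunk_size(preferred, min_size, max_size):
    """Pick a chunk size inside the range the server advertises.

    min_size/max_size are assumed to come from the proposed
    Tus-Min-Chunk-Size / Tus-Max-Chunk-Size headers.
    """
    return max(min_size, min(preferred, max_size))


def patch_requests(data, offset, chunk_size):
    """Yield (headers, body) for each PATCH needed to send data[offset:]."""
    while offset < len(data):
        body = data[offset:offset + chunk_size]
        # The checksum covers only this request body, not the whole file.
        digest = hashlib.sha1(body).digest()
        headers = {
            "Tus-Resumable": "1.0.0",
            "Content-Type": "application/offset+octet-stream",
            "Upload-Offset": str(offset),
            "Upload-Checksum": "sha1 " + base64.b64encode(digest).decode("ascii"),
        }
        yield headers, body
        offset += len(body)


size = clamp_chunk_size(preferred=4, min_size=1, max_size=3)
requests = list(patch_requests(b"abcdefgh", 0, size))
```

If one PATCH fails its checksum, only that chunk has to be resent, which is why a bounded chunk size and per-request checksums fit together.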
To mimic OC10 checksumming, I allow providing a checksum in the metadata that can be checked on the server side if the full file is available.
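A sketch of what that server-side check could look like once the full file is available. The metadata key `checksum` and its value format (`<algorithm> <hex digest>`) are assumptions made for this example; the actual key and format used in the reva implementation may differ.

```python
import base64
import hashlib


def parse_upload_metadata(header_value):
    """Decode an Upload-Metadata header into a dict of strings."""
    metadata = {}
    for pair in header_value.split(","):
        pair = pair.strip()
        if not pair:
            continue
        key, _, encoded = pair.partition(" ")
        metadata[key] = base64.b64decode(encoded).decode("utf-8")
    return metadata


def full_file_checksum_matches(metadata, file_bytes):
    """Compare the client-provided checksum with the assembled file.

    Returns None when no checksum was sent. The "checksum" key and the
    "<algorithm> <hex digest>" format are hypothetical.
    """
    algorithm, _, expected = metadata.get("checksum", "").partition(" ")
    if not algorithm or not expected:
        return None
    # hashlib.new raises ValueError for algorithms it does not know.
    actual = hashlib.new(algorithm.lower(), file_bytes).hexdigest()
    return actual == expected.lower()


value = "sha1 " + hashlib.sha1(b"file content").hexdigest()
header = "checksum " + base64.b64encode(value.encode("utf-8")).decode("ascii")
ok = full_file_checksum_matches(parse_upload_metadata(header), b"file content")
```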
I prefer using Tus-Min/Max-Chunk-Size headers: it allows resuming and can be tuned by the admin as necessary. No urgent need to go parallel & concatenation right now ... someday, maybe.
If we want to invent a new checksumming extension we should create that as a PR in https://github.com/tus/tus-resumable-upload-protocol.
3. expiration extension should be implemented, but we need to agree on default expiry ... 24h?
Okay. Same as previous chunking implementations.
post with metadata currently uses metadata to send the checksum: https://github.com/cs3org/reva/pull/674/files#diff-5fec0456a6ea9fb1227335fc8d3f8cfdR150
TUS support development has moved to develop
#61
The SDK should provide support for uploading files via the TUS protocol.
Notable observations from reading the spec:
1. Background NSURLSession friendly
If the server does store as much of the received data as possible, the SDK has an easier time complying with the requirements for NSURLSession background queues and avoiding penalties:
- HEAD request to retrieve the offset to resume from
- [...] NSURLSession with long delays - can be mostly avoided
2. Schedulable
It would be preferable if clients were able to directly start an upload with a single request:
- [...] NSURLSession as a single request if Creation With Upload extension is implemented
- [...] NSURLSession with long delays
3. Defined expiration
This helps in determining whether an upload should be continued or not - and resume only uploads that are known to still be around. On the other hand, a HEAD request would be required anyway before resuming an upload, at which point an expired upload should also become apparent.
If the expiration date should be supported and utilized, though, adding support for expiration directly to the OCHTTP system should be considered, with requests being terminated with a new error code in case they have expired before having been scheduled.
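The expiry check itself is small. The sketch below (written in Python for brevity, not in the SDK's language) parses an `Upload-Expires` value, which the tus expiration extension sends as an HTTP date, and decides whether a resume is still worth scheduling. The function name is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def upload_is_expired(upload_expires, now=None):
    """Return True if the Upload-Expires timestamp has already passed."""
    now = now or datetime.now(timezone.utc)
    expires_at = parsedate_to_datetime(upload_expires)
    if expires_at.tzinfo is None:
        # Treat a value without usable zone information as UTC.
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    return now >= expires_at


header = "Wed, 25 Jun 2014 16:00:00 GMT"
before = datetime(2014, 6, 25, 15, 0, tzinfo=timezone.utc)
after = datetime(2014, 6, 25, 17, 0, tzinfo=timezone.utc)
still_valid = not upload_is_expired(header, now=before)
expired = upload_is_expired(header, now=after)
```

A request queue could run this check right before a request is scheduled and fail the request with a dedicated error instead of sending it.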
4. Checksum troubles
The Checksum extension needs checksums to be provided on a per-request basis, calculated not over the entire file but over the body of the respective upload requests.
This is generally fine, but does not cover the scenario where an upload is interrupted and the server should use the already received bytes:
Possible solutions:
1. Store checksums, check when upload is complete
The specification provides this hint:
The solution therefore could be to store all checksums and only verify them against the respective parts once they have been received in full.
Drawback:
2. Custom header with the full file checksum
A custom header with the full file checksum (e.g. OC-Full-Upload-Checksum) is passed to the Creation With Upload extension when the upload is initiated. That would allow verification of the full file once the upload has completed.
Drawback:
3. Custom header with the checksum over already transferred data
An additional, custom header with the checksum of the file up to the point the upload resumes from (e.g. OC-Transmitted-Upload-Checksum) would allow the server to check if the data received before is consistent - and allow the server to cancel an already.
Drawback:
Drawback mitigation:
- [...] HEAD requests, for which the client should provide a checksum when resuming the upload - allowing the server to accept partial requests while ensuring consistency
Pragmatic and performant
A pragmatic and performance-oriented approach would likely be a combination of 1. and 3.
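A minimal sketch of the resume check from approach 3, assuming the proposed OC-Transmitted-Upload-Checksum header carries its value in the same `<algorithm> <Base64 digest>` form as tus' Upload-Checksum. That value format and both function names are assumptions for illustration only.

```python
import base64
import hashlib


def prefix_checksum(local_data, offset):
    """Client side: checksum of the local file up to the resume offset."""
    digest = hashlib.sha1(local_data[:offset]).digest()
    return "sha1 " + base64.b64encode(digest).decode("ascii")


def server_accepts_resume(stored_bytes, header_value):
    """Server side: compare the client's prefix checksum with the bytes
    already stored from earlier (possibly interrupted) requests."""
    algorithm, _, encoded = header_value.partition(" ")
    if algorithm != "sha1":
        return False  # this sketch only handles sha1
    expected = base64.b64decode(encoded)
    return hashlib.sha1(stored_bytes).digest() == expected


local_file = b"0123456789"
offset = 4  # as reported by the server in response to a HEAD request
header = prefix_checksum(local_file, offset)
consistent = server_accepts_resume(local_file[:offset], header)
corrupted = server_accepts_resume(b"012X", header)
```

With this check the server can keep partially received request bodies and still detect corruption before more data is appended.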
Related issues
Known issues
The current implementation in develop has the following known issues: