tus / tus-js-client

A pure JavaScript client for the tus resumable upload protocol
https://tus.io/
MIT License

Error "Requested data is before the reader's current offset" for Upload(stream) when resumption creates a new upload #580

Open · nh2 opened this issue 1 year ago

nh2 commented 1 year ago

The error `Requested data is before the reader's current offset` from https://github.com/tus/tus-js-client/blob/01d3948c1a0b96adee545af72cdaabc5c5e979b1/lib/browser/sources/StreamSource.js#L38-L39

is triggered when the following conditions are met jointly:

- the `Upload` is sourced by a `ReadableStream` (a Reader),
- a finite `chunkSize` is set,
- the upload has already progressed past the beginning of the stream,
- a transmission error occurs and resumption fails, so tus-js-client creates a new upload.

(These conditions hold in common setups, e.g. when you are behind Cloudflare (which requires a finite `chunkSize`) and you restart your tus server software for a split second for an upgrade.)

Then, this code triggers: https://github.com/tus/tus-js-client/blob/01d3948c1a0b96adee545af72cdaabc5c5e979b1/lib/upload.js#L672-L675

which creates a completely new upload starting at offset 0, but the ReadableStream is not reset and may already have advanced far past that offset.

Hence, `Requested data is before the reader's current offset` is necessarily triggered.

Automatically creating a new upload in the middle of the transfer seems logically impossible when the Upload is sourced by a ReadableStream, as the user is given no opportunity to create a stream on which another .slice(0, ...) can succeed.
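
For reference, a setup roughly like the following can run into this (the URLs are placeholders and the options are illustrative, not a full configuration):

```js
import * as tus from 'tus-js-client'

async function startStreamUpload() {
  const response = await fetch('https://example.com/big-file') // placeholder data source
  const reader = response.body.getReader()

  const upload = new tus.Upload(reader, {
    endpoint: 'https://tus.example.com/files/', // placeholder tus endpoint
    chunkSize: 50 * 1024 * 1024, // finite chunk size, e.g. to stay under a proxy's request size limit
    uploadLengthDeferred: true, // the stream's length is not known up front
    retryDelays: [0, 1000, 3000],
    onError: (err) => console.error('upload failed', err),
    onSuccess: () => console.log('upload finished'),
  })
  upload.start()

  // If the server-side upload is lost mid-transfer (e.g. the tus server is
  // briefly restarted), tus-js-client creates a new upload at offset 0, but
  // `reader` has already been consumed past that point, so StreamSource
  // throws "Requested data is before the reader's current offset".
}
```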

nh2 commented 1 year ago

This issue is made more likely to crop up by issue #579, as that increases the chances that "a transmission error occurs and resumption fails".

Acconut commented 1 year ago

> Automatically creating a new upload in the middle of the transfer seems logically impossible when the Upload is sourced by a ReadableStream, as the user is given no opportunity to create a stream on which another .slice(0, ...) can succeed.

Yes, that is correct. There is currently no way for an upload to recover from this kind of failure. The question is: how can we improve this situation in tus-js-client?

a) Just raise a different error saying that we cannot resume in such a case.
b) Provide some API so that the user can supply a new stream (but I am unsure how often this is actually possible).

nh2 commented 1 year ago

> a) Just raise a different error saying that we cannot resume in such a case.

Yes, I think the following would make sense:

- Change the error message to explain that Readers cannot be resumed once they progress past the first `chunkSize` bytes.

> b) Provide some API so that the user can supply a new stream (but I am unsure how often this is actually possible).

Isn't this theoretically already possible, by providing an arbitrary object as the argument to the Upload() constructor and then implementing the fileReader option so that it gets the data from somewhere else? (E.g. when slice(start, ...) observes start decrease back to 0, the user's slice() implementation could open a new source from somewhere.) See also #583, which I just filed.
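
A rough sketch of that idea: the `{ url, size }` input object is my own invention, and the source shape (openFile/slice/close returning `{ value, done }`) mirrors what the built-in readers appear to use, so it may differ between versions; depending on the version you may also need to set the uploadSize option.

```js
import * as tus from 'tus-js-client'

// Sketch of a custom fileReader that fetches each requested byte range on
// demand, so any `start` value, including a reset back to 0 after a failed
// resumption, can be served. The `{ url, size }` input object and the
// server's Range support are assumptions, not tus-js-client APIs.
class RangeRequestFileReader {
  async openFile(input, chunkSize) {
    return {
      size: input.size,
      async slice(start, end) {
        const response = await fetch(input.url, {
          headers: { Range: `bytes=${start}-${end - 1}` },
        })
        const value = await response.blob()
        return { value, done: end >= input.size }
      },
      close() {},
    }
  }
}

const upload = new tus.Upload(
  { url: 'https://example.com/big-file', size: 123456789 }, // placeholders
  {
    endpoint: 'https://tus.example.com/files/', // placeholder tus endpoint
    chunkSize: 8 * 1024 * 1024,
    fileReader: new RangeRequestFileReader(),
  },
)
upload.start()
```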


As a concrete example of how fileReader can solve more problems than passing a Reader to Upload():

I found a bug in Chrome ("FileReader 5x slower than in Firefox over Windows Share") that limits sequential File read throughput to 20 Mbit/s. A workaround is concurrent, batched reading of multiple File.slice()s. I first implemented it with a Reader but then hit this very problem. Only then did I find out that I can implement it with the fileReader option and get both custom IO patterns AND full resumability from offset 0.
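
The reading pattern, in outline (a sketch only; the slice and batch sizes are illustrative, not the values I actually use):

```js
// Read the File in parallel batches of slices instead of one sequential
// FileReader pass over the whole file.
async function readFileInBatches(file, sliceSize = 4 * 1024 * 1024, batchSize = 4) {
  const chunks = []
  for (let offset = 0; offset < file.size; offset += sliceSize * batchSize) {
    const batch = []
    for (let i = 0; i < batchSize && offset + i * sliceSize < file.size; i++) {
      const start = offset + i * sliceSize
      const end = Math.min(start + sliceSize, file.size)
      // Each slice().arrayBuffer() is an independent read, so the reads
      // overlap instead of being serialized behind one FileReader.
      batch.push(file.slice(start, end).arrayBuffer())
    }
    chunks.push(...(await Promise.all(batch)))
  }
  return chunks // ArrayBuffers in order; concatenate or serve from slice() as needed
}
```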

Acconut commented 1 year ago
> Change the error message to explain that Readers cannot be resumed once they progress past the first `chunkSize` bytes.

I am wondering if we currently have the capability to detect this state inside the Upload class. I do not think we do. If you have a file or buffer object, you can seek back to the beginning, so we cannot disallow it in all cases. And in the Upload class we do not directly have a property for detecting streaming resources (yet). So maybe we need to expose this in the FileReader interface.
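
Purely as an illustration of that idea (nothing below exists in tus-js-client today; the names are made up):

```js
// Hypothetical only: sources could advertise whether they can serve data
// from offset 0 again, and the Upload class could check that flag before
// creating a new upload. Neither `isRewindable` nor `assertCanRestart`
// exists in tus-js-client.
function assertCanRestart(source) {
  if (!source.isRewindable) {
    throw new Error(
      'Upload cannot be restarted from the beginning because its source cannot be read from offset 0 again.',
    )
  }
}

const blobLikeSource = { size: 1024, isRewindable: true } // files/buffers can seek back
const streamLikeSource = { size: null, isRewindable: false } // Readers cannot

assertCanRestart(blobLikeSource) // OK
assertCanRestart(streamLikeSource) // throws the clearer error from option a)
```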

> Isn't this theoretically already possible, by providing an arbitrary object as the argument to the Upload() constructor and then implementing the fileReader option so that it gets the data from somewhere else? (E.g. when slice(start, ...) observes start decrease back to 0, the user's slice() implementation could open a new source from somewhere.) See also #583, which I just filed.

Good point. That might work, although I have never tried it. Plus, I think this approach is rather unintuitive for most casual tus-js-client users. If we go and embrace a setup where a new stream can be supplied to tus-js-client, I would prefer it be done through a more explicit API.
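
For example, such an explicit API could look roughly like this (the `onRestartSource` option name is invented and not implemented; URLs are placeholders):

```js
import * as tus from 'tus-js-client'

// Hypothetical shape of an explicit "supply a new stream" API; the
// `onRestartSource` option does not exist in tus-js-client today.
async function startUpload() {
  const response = await fetch('https://example.com/big-file') // placeholder
  const upload = new tus.Upload(response.body.getReader(), {
    endpoint: 'https://tus.example.com/files/',
    chunkSize: 8 * 1024 * 1024,
    onRestartSource: async () => {
      // Called before a brand-new upload is created, so the application can
      // hand over a fresh Reader that starts at offset 0 again.
      const retried = await fetch('https://example.com/big-file')
      return retried.body.getReader()
    },
  })
  upload.start()
}
```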