remotestorage / remotestorage.js

⬡ JavaScript client library for integrating remoteStorage in apps
https://remotestoragejs.readthedocs.io
MIT License

Efficient large file upload? #1104

Open rchrd2 opened 6 years ago

rchrd2 commented 6 years ago

Hello,

I attempted to upload a large file (approx. 100 MB), and my browser became very slow.

I wish I could provide more details, but have large files been considered in this implementation? What is the expectation?

raucao commented 6 years ago

Currently, large file uploads are not something the library has been optimized for. It loads the whole file into memory and sends it in a single HTTP request, which will likely time out before it can finish. You can increase that timeout for this reason (as Sharesome does, for example, because you often want to upload larger animated GIFs, ZIP files, and the like).
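For reference, a minimal sketch of what an upload looks like with the current API (the module name, path, and timeout value here are made up for illustration; `scope()`, `access.claim()` and `storeFile()` are documented calls, but check the docs for the version you're running):

```js
import RemoteStorage from 'remotestoragejs';

const remoteStorage = new RemoteStorage();
// Illustrative module claim – use your app's own module and scope.
remoteStorage.access.claim('largefiles', 'rw');
const client = remoteStorage.scope('/largefiles/');

// Assumption: if your version exposes a request-timeout setting, raising it
// helps on slow connections. Verify the option name for your version, e.g.:
// RemoteStorage.config.requestTimeout = 120000;

async function uploadFile(file) {
  // The whole file is read into memory and sent as one PUT request –
  // this is why ~100 MB uploads make the browser slow and can hit the timeout.
  const buffer = await file.arrayBuffer();
  return client.storeFile(file.type, file.name, buffer);
}
```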

As of now, large file uploads are a general hole in the RS spec: it's currently not really possible to have them in the protocol without prescribing implementation logic, and it's not something that HTTP offers by itself. See spec issues https://github.com/remotestorage/spec/issues/124 and https://github.com/remotestorage/spec/issues/131 as well. Maybe you even have an idea for a solution...

rchrd2 commented 6 years ago

Thanks for the response. I'll need to dig into the remoteStorage client implementation to provide more concrete suggestions, but here are some ideas.

Without solving all the problems, one improvement might be to avoid holding the whole file in memory when uploading, e.g. by passing a stream directly to client.storeFile instead of the raw value. The stream could then be handed separately to the remote and cache storages.
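Purely as a sketch of what that calling convention could look like (nothing here exists in the library today; `storeFileStream` and its options are made up, while `File.prototype.stream()` is a real Blob method that yields a ReadableStream):

```js
// Hypothetical API – storeFileStream() does not exist in remotestorage.js.
async function uploadStreamed(client, file) {
  const stream = file.stream(); // ReadableStream, read lazily from disk
  return client.storeFileStream(file.type, file.name, stream, {
    size: file.size // hypothetical option, e.g. to set Content-Length
  });
}
```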

I am also curious what the performance characteristics of IndexedDB are for large files. If there is a hard limit there, it's probably worth mentioning in the docs, because it would be a constraint of the library.

raucao commented 6 years ago

> Without solving all the problems, one improvement might be to avoid holding the whole file in memory when uploading, e.g. by passing a stream directly to client.storeFile instead of the raw value. The stream could then be handed separately to the remote and cache storages.

We'd have to switch to the Fetch API first then, which would break compatibility with a ton of browsers, and the spec is not even finished yet. Or maybe XMLHttpRequest accepts a stream for sending requests and I don't know about it?
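For what it's worth, where fetch is available the browser can stream a File/Blob body from disk itself, so the page never has to buffer the whole thing; a rough sketch outside the library (the storage URL, token, and path are placeholders, and this bypasses remotestorage.js's caching and sync entirely):

```js
// Sketch only: storageUrl and token are placeholders obtained elsewhere
// (e.g. from the user's connected storage) – this is not the library's API.
async function putLargeFile(storageUrl, token, path, file) {
  const response = await fetch(storageUrl + path, {
    method: 'PUT',
    headers: {
      'Authorization': 'Bearer ' + token,
      'Content-Type': file.type
    },
    body: file // the browser streams the Blob from disk
  });
  if (!response.ok) {
    throw new Error('Upload failed: ' + response.status);
  }
  return response.headers.get('ETag');
}
```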

> I am also curious what the performance characteristics of IndexedDB are for large files. If there is a hard limit there, it's probably worth mentioning in the docs, because it would be a constraint of the library.

Afaik there is no hard limit. But I think it makes sense to explain somewhere that the library is optimized for application data, not for storing large files.
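There are origin quotas, though, and they vary by browser and available disk space. In browsers that implement the StorageManager API you can get a rough estimate of them (this is a standard browser API, not part of remotestorage.js, and the numbers are only estimates):

```js
// Rough usage/quota estimate for the current origin, where supported.
if (navigator.storage && navigator.storage.estimate) {
  navigator.storage.estimate().then(({ usage, quota }) => {
    console.log(`Using ${usage} of roughly ${quota} bytes available to this origin`);
  });
}
```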

Caching large files like that in your browser is not a great idea in the first place, because browser caches cannot be trusted with persistence in the same way your local filesystem can. Browsers, operating systems, and users can all periodically delete caches, and you wouldn't want to randomly have to re-download your RAW photo library when that happens. There used to be a FileSystem API for this use case (which would also give you URLs, so you could e.g. play videos straight from the sandbox filesystem), but after only ever being implemented in Chrome, it was eventually abandoned and discontinued. I think it's technically still available in Chrome, but in no other browser.
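If eviction is the main worry, browsers that implement the StorageManager API also let a page ask for persistent storage; whether the request is granted is entirely up to the browser, and support differs (again, a standard browser API, not something remotestorage.js does for you):

```js
// Asks the browser not to evict this origin's storage under pressure.
if (navigator.storage && navigator.storage.persist) {
  navigator.storage.persist().then((granted) => {
    console.log(granted
      ? 'Storage will not be evicted without user action'
      : 'Storage may still be evicted under storage pressure');
  });
}
```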

raucao commented 5 years ago

Just an update, because I found this open issue: we're in the process of adding support for resumable/large file uploads: https://community.remotestorage.io/t/resumable-file-uploads/461