For large files (100MB+), downloading and storing the entire file in memory is prohibitively expensive, and often unnecessary when only small parts of the file are actually accessed.
It would be great to support HTTP range requests in both synchronous and asynchronous modes.
Some important questions:
- How large should the chunks be? Should that be configurable?
- Should we prefetch chunks? What prefetch policies are useful?
- What configuration knobs should we expose to the user?
- What eviction policy should we use? FIFO? How much data should we keep in memory?
- Do browsers cache data retrieved via HTTP range requests? If so, re-acquiring previously evicted blocks would be much cheaper and would avoid hammering the server over a long session.
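To make the design questions above concrete, here is a minimal synchronous sketch of a chunked range-request reader with a configurable chunk size and FIFO eviction. All names (`ChunkedReader`, `http_fetch_range`, the `fetch_range` injection point) are hypothetical, not part of any existing API; the range-fetch function is injectable so the caching logic can be exercised without a network.

```python
import collections
import urllib.request


def http_fetch_range(url, start, end):
    """Fetch bytes [start, end] (inclusive) via an HTTP range request."""
    req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(req) as resp:
        return resp.read()


class ChunkedReader:
    """Lazily reads a remote file in fixed-size chunks, keeping at most
    max_chunks of them in memory and evicting in FIFO order."""

    def __init__(self, url, size, chunk_size=1 << 20, max_chunks=16,
                 fetch_range=http_fetch_range):
        self.url = url
        self.size = size  # total file size in bytes
        self.chunk_size = chunk_size
        self.max_chunks = max_chunks
        self.fetch_range = fetch_range
        self._cache = collections.OrderedDict()  # chunk index -> bytes

    def _chunk(self, idx):
        if idx not in self._cache:
            start = idx * self.chunk_size
            end = min(start + self.chunk_size, self.size) - 1
            self._cache[idx] = self.fetch_range(self.url, start, end)
            if len(self._cache) > self.max_chunks:
                # FIFO: drop the oldest-inserted chunk
                self._cache.popitem(last=False)
        return self._cache[idx]

    def read(self, offset, length):
        """Read up to length bytes starting at offset, fetching chunks on demand."""
        out = bytearray()
        while length > 0 and offset < self.size:
            idx, within = divmod(offset, self.chunk_size)
            piece = self._chunk(idx)[within:within + length]
            out += piece
            offset += len(piece)
            length -= len(piece)
        return bytes(out)
```

An async variant could wrap the same caching logic around a coroutine-based fetcher; swapping FIFO for LRU would only require moving each accessed key to the end of the `OrderedDict` before returning it.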
I don't anticipate being able to work on this anytime soon, but I'm recording my thoughts here for now.