Open rolf-moz opened 1 week ago
I see a progress callback for a file, but typically there are multiple files, and downloads of all files may not initiate immediately. If there is an event that has info on all files (at least once at start) then we will have the ability to display a single bar for the download progress as a whole.
When the hub is used to download a Hugging Face repo, the progress callback is invoked continuously with the real-time progress of each file being downloaded.
The current download status of each file is recorded in the ProgressInfo type.
eg.
I just had to build this myself.
While it's not 100% what you want ... It's close enough (imo)
You can just write conditionals in the callback to keep track of each file's state (initialized, download started, progress, done). When you hit initialized / download started, you know you have a new file and need to add its total to the overarching progressbar, if you haven't already done so. Then, on each progress update, adjust the bars as needed given the byte difference.
To get that byte difference, track (loaded - lastLoaded) in the conditional for the "progress" state. That gives you the new bytes to add to the overarching sum and/or to that specific file's progressbar.
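The bookkeeping described above could be sketched roughly like this. It is a minimal sketch, not the library's API: the `FileEvent` type, its field names, and the `AggregateProgress` class are all assumptions made up for illustration, so map them onto whatever your progress callback actually receives.

```typescript
// Hypothetical event shape; status values and field names are assumptions.
type FileEvent = {
  status: "initiate" | "download" | "progress" | "done";
  file: string;
  loaded?: number; // bytes received so far for this file
  total?: number; // total bytes for this file, once known
};

class AggregateProgress {
  private files = new Map<string, { loaded: number; total: number }>();
  loadedSum = 0;
  totalSum = 0;

  // Feed every callback event in; returns the overall fraction in [0, 1].
  update(e: FileEvent): number {
    let entry = this.files.get(e.file);
    if (!entry) {
      // First time we see this file: start tracking it.
      entry = { loaded: 0, total: 0 };
      this.files.set(e.file, entry);
    }
    // Add the file's total to the overall total once (or adjust it if it changes).
    if (e.total !== undefined && e.total !== entry.total) {
      this.totalSum += e.total - entry.total;
      entry.total = e.total;
    }
    // On progress, add only the byte difference (loaded - lastLoaded).
    if (e.status === "progress" && e.loaded !== undefined) {
      this.loadedSum += e.loaded - entry.loaded;
      entry.loaded = e.loaded;
    }
    // On done, count any bytes not yet reported.
    if (e.status === "done") {
      this.loadedSum += entry.total - entry.loaded;
      entry.loaded = entry.total;
    }
    return this.totalSum > 0 ? this.loadedSum / this.totalSum : 0;
  }
}
```

Note that the returned fraction can drop when a new file shows up and its size is added to the total, which is exactly the jank mentioned in this thread.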
I was able to build an overarching progressbar + download states for each individual file in the queue with full UI updates. I think it looks and works fairly well. But, each to their own.
That being said, an overarching total provided up front would be better UI: with the approach above, the total size shifts slightly as new files roll in over time, which creates some jank during the process. So, +1 and a data point from someone who just had to do this.
What you can build with the current info, with a bit of rough styling and a rough UI:
Here is a loom - bit janky w/ slow CPU sorry about that: https://www.loom.com/share/68eba0e9f63d49bba2b1fa8d566c24e5?sid=bc2991ad-6336-4b7c-8363-ea7a36af9b36
Overall ... not bad. With the total size returned up front the jank goes away, and then this is solid w/o any issues.
Feature request
Return file size information prior to model file download to enable better UI when downloading multiple model files.
After loading the config files, we fire a single 'session_info' event (name subject to change) that contains an object like this:
where each info element contains info like file name, file size, url, etc.
Clients can then use this to build different styles of download UI (e.g. total percentage downloaded across all model files).
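As a rough illustration of how a client might consume such an event, here is a sketch under stated assumptions. The `SessionInfo` and `FileInfo` types, their field names, and both helper functions are hypothetical; the actual payload of the proposed event is not defined in this issue.

```typescript
// Hypothetical payload for the proposed 'session_info' event.
// Field names are assumptions based on "file name, file size, url, etc."
type FileInfo = { name: string; size: number; url: string };
type SessionInfo = { files: FileInfo[] };

// Total bytes across all model files, known before any download starts.
function totalBytes(info: SessionInfo): number {
  return info.files.reduce((sum, f) => sum + f.size, 0);
}

// Overall percentage, given the bytes loaded so far for each file name.
function overallPercent(
  info: SessionInfo,
  loadedByFile: Map<string, number>
): number {
  const total = totalBytes(info);
  if (total === 0) return 0;
  let loaded = 0;
  for (const f of info.files) {
    // Clamp to the file size so an over-reported count cannot exceed 100%.
    loaded += Math.min(loadedByFile.get(f.name) ?? 0, f.size);
  }
  return (loaded / total) * 100;
}
```

Because the denominator is fixed from the start, the bar only moves forward as files that have not started yet are simply counted as zero bytes loaded.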
Motivation
As we look to integrate transformers.js with more UI, we may want a more predictable experience when downloading models. Right now the callbacks don't carry enough information to drive a single progress bar for the whole model download.
Your contribution
I may be able to help code this.