Closed kerthcet closed 3 months ago
/kind feature /priority important-soon /assign /milestone v0.1.0
Some libraries, such as https://github.com/coreweave/tensorizer/tree/main, have optimized methods for downloading models, but they are written in Python. Python is arguably the most popular language in the AI world, so we should be careful about moving away from it.
Considering that hf-hub, which is written in Rust, downloads the model synchronously, the benefit is small because we can also use multiple threads in Python. Let's keep using Python for now.
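For illustration, here is a minimal sketch of what a multi-threaded download could look like in Python. The names `split_ranges`, `parallel_download`, and `fetch_range` are hypothetical and not taken from this project; `fetch_range(start, end)` stands in for whatever issues an HTTP range request and returns the bytes. Blocking network I/O releases the GIL, so threads can overlap the requests.

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(total_size, chunk_size):
    """Return inclusive (start, end) byte ranges that cover total_size bytes."""
    return [
        (start, min(start + chunk_size, total_size) - 1)
        for start in range(0, total_size, chunk_size)
    ]


def parallel_download(fetch_range, total_size, chunk_size=8 * 1024 * 1024, workers=4):
    """Fetch all chunks concurrently and reassemble them in order.

    fetch_range is a hypothetical callable (start, end) -> bytes, e.g. a
    wrapper around an HTTP request with a Range header.
    """
    ranges = split_ranges(total_size, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so the chunks line up.
        parts = pool.map(lambda r: fetch_range(*r), ranges)
        return b"".join(parts)
```

This holds the whole file in memory, which would not be acceptable for real model weights; a real version would write each chunk to its offset on disk.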
What we found is that the download rate is usually limited by the cloud vendor's NAT, for example to 200 Mbps, which equals 25 MB/s, so the optimization would bring little benefit.
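The arithmetic behind that number, as a small sketch. The 10 GB model size is a made-up example rather than a measurement, and decimal units are assumed (1 GB = 1000 MB).

```python
def mbps_to_mb_per_s(mbps):
    """Convert megabits per second to megabytes per second (8 bits per byte)."""
    return mbps / 8


def download_seconds(size_gb, mbps):
    """Best-case time to transfer size_gb gigabytes over a link capped at mbps."""
    return size_gb * 1000 / mbps_to_mb_per_s(mbps)


print(mbps_to_mb_per_s(200))      # 25.0 MB/s
print(download_seconds(10, 200))  # 400.0 seconds for a hypothetical 10 GB model
```

With the link capped like this, the transfer time is set by the cap, not by the language the downloader is written in.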
/remove milestone
/milestone clear /close
Because of the huge size of model weights and Python's GIL, let's move to Rust instead to improve performance.