bigscience-workshop / petals

🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
https://petals.dev
MIT License
8.89k stars 490 forks

Manual management of shards #573

Open nrs-status opened 2 months ago

nrs-status commented 2 months ago

Hi, I was wondering if anyone could recommend which parts of the library to look at to understand how models are sharded, and whether this sharding can be managed manually. In particular, is there a way to load a specific shard from an already-sharded model, for instance in a private cluster, so that it is known in advance which instance will hold which shard? Any reading suggestions in the repo are appreciated!