bigscience-workshop / petals

🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
https://petals.dev
MIT License

Fix dummy cache allocation #574

Closed artek0chumak closed 7 months ago

artek0chumak commented 7 months ago

Fix allocation of the dummy key_value cache. The cache is not used in actual computations, but torch's device checks require its tensors to be on the correct device.
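The idea can be sketched as follows. This is a hypothetical helper (the names `allocate_dummy_kv_cache` and its parameters are not from the petals codebase): even though the dummy key/value tensors are never read, PyTorch raises an error if tensors passed into an op live on different devices, so the dummy cache must be created on the same device as the rest of the inputs.

```python
import torch

def allocate_dummy_kv_cache(batch_size, num_heads, seq_len, head_dim,
                            device, dtype=torch.float32):
    """Hypothetical sketch: allocate placeholder key/value tensors.

    The contents are never used in computation, but the tensors must be
    on the same device as the model's other inputs, or torch's device
    checks will reject the call.
    """
    shape = (batch_size, num_heads, seq_len, head_dim)
    # torch.empty skips initialization, which is fine for a dummy cache
    keys = torch.empty(shape, device=device, dtype=dtype)
    values = torch.empty(shape, device=device, dtype=dtype)
    return keys, values

# Allocate the dummy cache on the same device as the real inputs
device = torch.device("cpu")
keys, values = allocate_dummy_kv_cache(1, 8, 16, 64, device=device)
```

Allocating on the wrong device (e.g. always on CPU while the model runs on GPU) is exactly the kind of mismatch torch's checks catch, even when the tensor values are irrelevant.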