bigscience-workshop / petals

🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
https://petals.dev
MIT License

Fix dummy cache allocation #574

Closed — artek0chumak closed this 2 months ago

artek0chumak commented 2 months ago

Fix allocation of the dummy key_value cache. The dummy cache is not used in actual computations, but PyTorch's device checks require it to be on the correct device.
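A rough sketch of the idea behind the fix (the function name and signature here are hypothetical, not the actual Petals code): allocate the placeholder key/value tensors directly on the target device, so that PyTorch's device-consistency checks pass even though the contents are never read.

```python
import torch

def make_dummy_kv_cache(batch_size: int, num_heads: int, seq_len: int,
                        head_dim: int, device: torch.device,
                        dtype: torch.dtype = torch.float32):
    # Hypothetical helper, for illustration only. The dummy cache carries no
    # real data, but every tensor passed through the attention call must live
    # on the same device as the model weights, so we allocate the empty
    # placeholders on the target device up front.
    shape = (batch_size, num_heads, seq_len, head_dim)
    dummy_keys = torch.empty(shape, device=device, dtype=dtype)
    dummy_values = torch.empty(shape, device=device, dtype=dtype)
    return dummy_keys, dummy_values

keys, values = make_dummy_kv_cache(1, 2, 0, 4, torch.device("cpu"))
```

Allocating with `torch.empty` (rather than `torch.zeros`) is enough here, since the values are never used in computation.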