Remove smaller limit for legacy bfloat16 serialization

bigscience-workshop / petals

🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading

https://petals.dev

MIT License


Open borzunov opened 10 months ago

borzunov commented 10 months ago

Revert #251, since it is no longer needed after #311. This may improve fine-tuning efficiency for medium-sized batches.

TODO:

[ ] Test with increasingly large batches. Verify that we switch from rpc_forward to rpc_forward_stream without errors (the two can be distinguished in the server logs).
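
A rough sketch of what the batch sweep in the TODO is meant to exercise: as the payload grows, requests should move from the unary call to the streaming one. The size limit, the `<=` comparison, the tensor shape, and the helper names (`payload_bytes`, `choose_rpc`, `sweep`) are all assumptions made for illustration and are not taken from the Petals code; only the names rpc_forward and rpc_forward_stream come from this PR.

```python
# Hypothetical limit for illustration only; the real value is defined by the
# library and may differ.
ASSUMED_MAX_UNARY_PAYLOAD_BYTES = 2 * 1024**2


def payload_bytes(batch_size, seq_len, hidden_size, bytes_per_element=2):
    """Size of a dense activation tensor; 2 bytes per element matches bfloat16."""
    return batch_size * seq_len * hidden_size * bytes_per_element


def choose_rpc(num_bytes, limit=ASSUMED_MAX_UNARY_PAYLOAD_BYTES):
    """Pick the unary call for small payloads and the streaming call otherwise.

    Using `<=` at the boundary is an assumption of this sketch.
    """
    return "rpc_forward" if num_bytes <= limit else "rpc_forward_stream"


def sweep(batch_sizes, seq_len=128, hidden_size=1024):
    """Return (batch_size, chosen_rpc) pairs for increasingly large batches."""
    return [
        (b, choose_rpc(payload_bytes(b, seq_len, hidden_size)))
        for b in batch_sizes
    ]


if __name__ == "__main__":
    for batch_size, rpc in sweep([1, 2, 4, 8, 16, 32]):
        print(f"batch={batch_size:>3} -> {rpc}")
```

With these assumed numbers, each batch item is 256 KiB, so the switch happens between batch sizes 8 and 16. In a real test, the batch size at which the server logs start showing rpc_forward_stream depends on the actual limit and model dimensions.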