pytorch / ao

The missing pytorch dtype and layout library for training and inference

Bitnet 1.58 prework, POC, and staging #281

Open CoffeeVampir3 opened 2 months ago

CoffeeVampir3 commented 2 months ago

Bitnet 1.58 Groundwork

After some talks with Saroufim and the cuda mode team working on bitnet, we've outlined a strategy for implementing the Bitnet 1.58 method in torch. This issue lays the groundwork for 2-bit trinary tensor quantization and the Bitnet linear-layer work for Bitnet 1.58.
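For reference, here is a minimal sketch of the trinary quantization step, assuming the absmean scheme described in the BitNet b1.58 paper (scale by the mean absolute value, round, clamp to [-1, 1]); the function name is illustrative, not the staging repo's API:

```python
import torch

def quantize_ternary_absmean(w: torch.Tensor, eps: float = 1e-5):
    """Map a float weight tensor to trinary values {-1, 0, 1} plus a scale.

    Absmean scheme: scale by the mean absolute value of the tensor,
    round to the nearest integer, then clamp into [-1, 1].
    """
    scale = w.abs().mean().clamp(min=eps)
    w_tri = (w / scale).round().clamp(-1, 1)
    return w_tri, scale

# Dequantized weights approximate the originals as w_tri * scale.
w = torch.randn(4, 8)
w_tri, scale = quantize_ternary_absmean(w)
w_approx = w_tri * scale
```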

I've set up a staging repo (Staging) with a number of items:

This covers the initial groundwork for getting working trinary networks into torch.
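As a rough illustration of the bit-packing side of that groundwork, one way to pack trinary values into 2-bit codes, four per uint8 byte, and unpack them again is sketched below; this assumes the element count is padded to a multiple of 4 and is not the packing layout actually used in the staging repo:

```python
import torch

def pack_ternary_2bit(w_tri: torch.Tensor) -> torch.Tensor:
    """Pack a trinary tensor (values in {-1, 0, 1}) into uint8,
    four 2-bit codes per byte. Values are offset to {0, 1, 2} first.
    Assumes the number of elements is a multiple of 4."""
    codes = (w_tri.flatten() + 1).to(torch.uint8)  # {-1, 0, 1} -> {0, 1, 2}
    codes = codes.view(-1, 4)
    packed = (codes[:, 0]
              | (codes[:, 1] << 2)
              | (codes[:, 2] << 4)
              | (codes[:, 3] << 6))
    return packed

def unpack_ternary_2bit(packed: torch.Tensor) -> torch.Tensor:
    """Inverse of pack_ternary_2bit; returns trinary values in {-1, 0, 1}."""
    codes = torch.stack([(packed >> s) & 0x3 for s in (0, 2, 4, 6)], dim=-1)
    return codes.flatten().to(torch.int8) - 1
```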

msaroufim commented 2 months ago

Very cool! Thank you for writing up such a clear plan. We can start merging the bit-packing logic and the layer quantization, so feel free to send a PR whenever you're ready. This very much follows a playbook similar to the one @gau-nernst followed for fp6.
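For context on the layer-quantization piece, a hedged sketch of what a BitLinear-style layer could look like, assuming the absmean quantization above plus a straight-through estimator so the float shadow weights still receive gradients during training; the class name and details are hypothetical, not the implementation being merged:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BitLinearSketch(nn.Linear):
    """Illustrative BitLinear-style layer: trinarize the weights on the fly
    with absmean scaling, and route gradients to the float weights via a
    straight-through estimator."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scale = self.weight.abs().mean().clamp(min=1e-5)
        w_q = (self.weight / scale).round().clamp(-1, 1) * scale
        # Forward uses the quantized weights; backward flows to self.weight.
        w = self.weight + (w_q - self.weight).detach()
        return F.linear(x, w, self.bias)

# Example usage.
layer = BitLinearSketch(64, 32)
y = layer(torch.randn(2, 64))
```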

Related work

CoffeeVampir3 commented 2 months ago

:+1: https://github.com/pytorch/ao/pull/285. I've kept this separate from Andreas' commit for now; it encapsulates only the working bits.