Closed: msaroufim closed this 3 months ago
The code is out, and it's quite simple and short.
Opening this so I can track how to add it to ao and make sure it works well with torch.compile(). It will likely need Blackwell to perform decently.
https://github.com/microsoft/unilm/blob/master/bitnet/The-Era-of-1-bit-LLMs__Training_Tips_Code_FAQ.pdf
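For anyone skimming, here is a rough sketch of the kind of weight quantization involved, assuming the absmean ternary scheme usually described for BitNet b1.58 (scale by the mean absolute weight, round, clip to {-1, 0, +1}). It is plain Python so it runs without torch. The function names are hypothetical and are not part of ao or the linked code; a real version would use tensor ops and would need checking against the PDF.

```python
def absmean_ternary_quantize(weights, eps=1e-5):
    """Quantize a non-empty flat list of floats to {-1, 0, +1} with one shared scale.

    Assumption: per-tensor absmean scaling; hypothetical helper, not an ao API.
    """
    # Shared scale: mean absolute value, floored at eps to avoid dividing by zero.
    scale = max(sum(abs(w) for w in weights) / len(weights), eps)
    # Round each scaled weight to the nearest integer, then clip to [-1, 1].
    ternary = [max(-1, min(1, round(w / scale))) for w in weights]
    return ternary, scale


def dequantize(ternary, scale):
    """Map ternary values back to floats using the shared scale."""
    return [t * scale for t in ternary]


weights = [0.9, -0.05, 0.4, -1.2, 0.0, 0.3]
ternary, scale = absmean_ternary_quantize(weights)
approx = dequantize(ternary, scale)
print(ternary, scale)
```

Only the ternary values and a single scale need to be stored per tensor in this sketch, which is where the memory savings would come from; whether it is fast depends on having kernels that exploit the format.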
@msaroufim can we close this now given that we have the bitnet work being tracked elsewhere?