ucbrise / actnn

ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training

Transformer Benchmarks? #16

Closed CrazySherman closed 3 years ago

CrazySherman commented 3 years ago

Just curious: have you tested this method on transformer benchmarks such as BERT and measured the quantization accuracy?

cjf00000 commented 3 years ago

This repository does not yet officially support transformers. In another project, we did test the method on BERT; the "L2" strategy should be lossless.
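
For anyone who wants to experiment anyway, below is a minimal sketch of trying ActNN's "L2" optimization level on a BERT model. It assumes the `actnn.set_optimization_level` and `actnn.QModule` API shown in this repository's README, and uses Hugging Face `transformers` purely for illustration; since transformers are not officially supported here, layers ActNN does not recognize may simply be left uncompressed.

```python
# Hedged sketch, not an officially supported recipe: assumes actnn exposes
# set_optimization_level() and QModule() as in the repo README, and that the
# wrapped model forwards kwargs unchanged.
import torch
import actnn
from transformers import BertForSequenceClassification  # illustration only

# Pick the "L2" level mentioned in the maintainer's reply.
actnn.set_optimization_level("L2")

model = BertForSequenceClassification.from_pretrained("bert-base-uncased")
# Wrap the model so that supported layers store compressed activations
# for the backward pass; unsupported transformer layers may be skipped.
model = actnn.QModule(model)

# Training then proceeds as usual (dummy batch of 2 sequences, length 128).
input_ids = torch.randint(0, 30522, (2, 128))
labels = torch.tensor([0, 1])
out = model(input_ids=input_ids, labels=labels)
out.loss.backward()
```

Comparing `torch.cuda.max_memory_allocated()` with and without the `QModule` wrap is one quick way to check whether the compression actually engaged on the transformer layers.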