xijiu9 / Train_Transformers_with_INT4
131 stars · 4 forks
Issues (sorted by newest)
#4 · The paper mentions that all linear ops are quantized to INT4; are the gradients of the matrix-multiply ops in the attention module kept in float, or are they also quantized to INT4?
opened by brisker · 1 year ago · 1 comment
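As background to this question only (this is not the paper's actual quantizer, which is more elaborate), a minimal sketch of what per-tensor symmetric INT4 fake-quantization of a gradient tensor looks like in PyTorch:

```python
# Background sketch only: generic per-tensor symmetric INT4 fake-quantization.
# NOT the paper's quantization scheme; it just illustrates numerically what
# "quantizing a gradient to INT4" means.
import torch

def fake_quant_int4(x: torch.Tensor) -> torch.Tensor:
    """Quantize x to signed 4-bit levels [-8, 7] and dequantize back."""
    scale = x.abs().max().clamp(min=1e-8) / 7.0      # map the max magnitude to 7
    q = torch.clamp(torch.round(x / scale), -8, 7)   # 16 representable levels
    return q * scale                                 # dequantized (fake-quantized) value

grad = torch.randn(128, 128)
grad_q = fake_quant_int4(grad)
print("mean quantization error:", (grad - grad_q).abs().mean().item())
```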
#3 · `import quantize_forward_easy as qfe` fails with ModuleNotFoundError: No module named 'quantize_forward_easy'
opened by python-doggg · 1 year ago · 2 comments
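`quantize_forward_easy` is not a PyPI package, so this error usually means the repository's custom extension has not been built locally. A minimal diagnostic sketch follows; the build command in the comment is a hypothetical path, so check the repository's README and setup files for the actual one:

```python
# Sketch only: check whether the custom extension is importable and, if not,
# point to the usual fix (building it from the repository sources). The path
# in the comment below is hypothetical.
import importlib.util

if importlib.util.find_spec("quantize_forward_easy") is None:
    # Typically fixed by compiling the extension that ships with the repo, e.g.:
    #   cd <repo>/quantize_forward_easy && python setup.py install   # hypothetical path
    raise ImportError(
        "quantize_forward_easy is a locally built extension, not a PyPI package; "
        "build it from the repository sources before running the training scripts."
    )
```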
#1 · LoRA or Adapter training in a distributed setting
opened by brijesh-6899 · 1 year ago · 0 comments
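Not this repository's code: a minimal sketch of what LoRA training under PyTorch DistributedDataParallel typically looks like, using the Hugging Face peft library. The model name, target modules, and LoRA hyperparameters are illustrative placeholders.

```python
# Generic sketch (not this repo's code): LoRA fine-tuning wrapped in PyTorch
# DistributedDataParallel. Launch with torchrun, one process per GPU.
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

def main():
    dist.init_process_group("nccl")
    local_rank = dist.get_rank() % torch.cuda.device_count()
    torch.cuda.set_device(local_rank)

    model = AutoModelForCausalLM.from_pretrained("gpt2")        # placeholder model
    lora_cfg = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                          target_modules=["c_attn"],            # placeholder modules
                          task_type="CAUSAL_LM")
    model = get_peft_model(model, lora_cfg)      # only LoRA params remain trainable
    model = DDP(model.cuda(), device_ids=[local_rank])

    optim = torch.optim.AdamW(
        (p for p in model.parameters() if p.requires_grad), lr=2e-4)
    # ... standard training loop: forward, loss.backward(), optim.step() ...

if __name__ == "__main__":
    main()
```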