I've experimented with the `load_in_8bit` argument of `GPT2LMHeadModel.from_pretrained` inside `model.py`'s `from_pretrained()` classmethod, but since I'm mainly on Windows with an old GTX 1060 locally, I wasn't able to get very far.
However, I was able to roughly halve the value reported by `print(f"memory footprint {model_hf.get_memory_footprint()/1024**2:.2f} MB")`.
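
For reference, this is roughly the kind of experiment I tried (a minimal sketch, not nanoGPT's actual `from_pretrained`; it assumes `transformers`, `accelerate`, and `bitsandbytes` are installed and a CUDA GPU is available):

```python
import torch
from transformers import GPT2LMHeadModel

# Baseline fp32 load for comparison.
model_fp32 = GPT2LMHeadModel.from_pretrained("gpt2")
print(f"fp32 memory footprint {model_fp32.get_memory_footprint()/1024**2:.2f} MB")

# 8-bit load via bitsandbytes; the linear layers are quantized to int8,
# which is where the ~2x reduction in the reported footprint comes from.
model_int8 = GPT2LMHeadModel.from_pretrained(
    "gpt2",
    load_in_8bit=True,
    device_map="auto",
)
print(f"int8 memory footprint {model_int8.get_memory_footprint()/1024**2:.2f} MB")
```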
Is it currently possible to use int8 for training and inference with nanoGPT? I only see floating-point options in `train.py`. If it isn't supported yet, what would it take to add such support?
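
For context, the dtype handling I'm referring to in `train.py` looks roughly like this (paraphrased from memory, so it may not match the current code exactly); `torch.amp.autocast` only accepts floating-point dtypes, which I assume is why int8 isn't among the options:

```python
import torch
from contextlib import nullcontext

# Pick a floating-point compute dtype; bfloat16 if the GPU supports it.
dtype = 'bfloat16' if torch.cuda.is_available() and torch.cuda.is_bf16_supported() else 'float16'
ptdtype = {'float32': torch.float32, 'bfloat16': torch.bfloat16, 'float16': torch.float16}[dtype]
device_type = 'cuda' if torch.cuda.is_available() else 'cpu'

# Mixed-precision context used around the forward pass; no-op on CPU.
ctx = nullcontext() if device_type == 'cpu' else torch.amp.autocast(device_type=device_type, dtype=ptdtype)
```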