I've experimented with the `load_in_8bit` argument of `GPT2LMHeadModel.from_pretrained` inside `model.py`'s `from_pretrained()` classmethod, but since I'm mainly on Windows with an old GTX 1060 locally, I wasn't able to get very far.
However, I was able to roughly halve the value reported by `print(f"memory footprint {model_hf.get_memory_footprint()/1024**2:.2f} MB")`.
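
For reference, this is roughly the kind of experiment I tried (a minimal sketch, not nanoGPT's actual `from_pretrained`; it assumes `transformers`, `accelerate`, and `bitsandbytes` are installed and a CUDA GPU is available):

```python
import torch
from transformers import GPT2LMHeadModel

# Baseline fp32 load for comparison.
model_fp32 = GPT2LMHeadModel.from_pretrained("gpt2")
print(f"fp32 memory footprint {model_fp32.get_memory_footprint()/1024**2:.2f} MB")

# 8-bit load via bitsandbytes; the linear layers are quantized to int8,
# which is where the ~2x reduction in the reported footprint comes from.
model_int8 = GPT2LMHeadModel.from_pretrained(
    "gpt2",
    load_in_8bit=True,
    device_map="auto",
)
print(f"int8 memory footprint {model_int8.get_memory_footprint()/1024**2:.2f} MB")
```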
Is it currently possible to use int8 for training and inference with nanoGPT? I only see floating-point options in `train.py`. If it isn't supported yet, what would it take to add such support?
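
For context, the dtype handling I'm referring to in `train.py` looks roughly like this (paraphrased from memory, so it may not match the current code exactly); `torch.amp.autocast` only accepts floating-point dtypes, which I assume is why int8 isn't among the options:

```python
import torch
from contextlib import nullcontext

# Pick a floating-point compute dtype; bfloat16 if the GPU supports it.
dtype = 'bfloat16' if torch.cuda.is_available() and torch.cuda.is_bf16_supported() else 'float16'
ptdtype = {'float32': torch.float32, 'bfloat16': torch.bfloat16, 'float16': torch.float16}[dtype]
device_type = 'cuda' if torch.cuda.is_available() else 'cpu'

# Mixed-precision context used around the forward pass; no-op on CPU.
ctx = nullcontext() if device_type == 'cpu' else torch.amp.autocast(device_type=device_type, dtype=ptdtype)
```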