OpenGVLab / OmniQuant

[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
MIT License

Running Falcon-180B on a single A100 80GB: where/what is main.py? #17

Closed silvacarl2 closed 10 months ago

silvacarl2 commented 11 months ago

In the notebook "Running Falcon-180B on a single A100 80GB", where/what is main.py?

ChenMnZ commented 11 months ago

main.py refers to https://github.com/OpenGVLab/OmniQuant/blob/main/main.py. You only need it when you want to train the quantization parameters yourself.

silvacarl2 commented 11 months ago

Got it, so at the top of the notebook, anything related to this can be commented out:

!CUDA_VISIBLE_DEVICES=0 python main.py \
  --model /PATH/TO/Falcon/falcon-180b \
  --epochs 40 --output_dir ./log/falcon-180b-w3a16g512 \
  --wbits 3 --abits 16 --group_size 512 --lwc --aug_loss \
  --nsamples 32

etc.
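
For context on why `--wbits 3 --group_size 512` matters for the single-A100 claim in the title, here is a rough back-of-the-envelope sketch of weight storage. It is not taken from the OmniQuant code. The helper name `quantized_weight_gb`, the 16-bit scale and 16-bit zero point per group, and the simplification that all 180e9 parameters are quantized are assumptions made for illustration only.

```python
def quantized_weight_gb(n_params, wbits, group_size, scale_bits=16, zero_bits=16):
    """Approximate storage (decimal GB) for group-wise quantized weights.

    Assumption: every group of `group_size` weights stores one scale and one
    zero point. The 16-bit sizes are illustrative, not OmniQuant's format.
    """
    n_groups = n_params / group_size
    total_bits = n_params * wbits + n_groups * (scale_bits + zero_bits)
    return total_bits / 8 / 1e9


# Nominal parameter count read off the model name (assumption: all quantized).
N_PARAMS = 180e9

# Weights at 3 bits with groups of 512, as in the command above.
w3_g512_gb = quantized_weight_gb(N_PARAMS, wbits=3, group_size=512)

# Plain 16-bit weights for comparison (no per-group metadata).
fp16_gb = N_PARAMS * 16 / 8 / 1e9

print(f"w3 g512: {w3_g512_gb:.1f} GB, fp16: {fp16_gb:.1f} GB")
```

Under these assumptions the 3-bit weights come in under 80 GB while the 16-bit weights do not. Activations, the KV cache, and any layers left unquantized add to the real footprint.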