What are the minimum hardware resources required to test out this codebase for LLaMA-7B?
Hi! We strongly recommend compressing LLaMA-7B on an A100. However, it is still possible to run the compression with only 20 GB of GPU memory using the following command:
python SVDLLM.py --step 1 --run_low_resource --ratio COMPRESSION_RATIO --model HUGGINGFACE_MODEL_REPO --whitening_nsamples WHITENING_SAMPLE_NUMBER --dataset WHITENING_DATASET --seed SAMPLING_SEED --model_seq_len MODEL_SEQ_LEN --save_path WHITENING_INFO_SAVING_PATH
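For instance, a concrete invocation might look like the sketch below; the model repo and hyperparameter values are illustrative placeholders for this example, not documented defaults of this codebase:

python SVDLLM.py --step 1 --run_low_resource --ratio 0.2 --model huggyllama/llama-7b --whitening_nsamples 256 --dataset wikitext2 --seed 3 --model_seq_len 2048 --save_path ./whitening_info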