Closed: MNeMoNiCuZ closed this 1 week ago
Support model quantization with different precision levels.
Now supports "unsloth/llama-3-8b-bnb-4bit". Enable it with the setting `LOW_VRAM_MODE = True` (option to switch to a model that uses less VRAM).
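A minimal sketch of how such a toggle could work, assuming the config lives in a Python settings module; the full-precision model id (`DEFAULT_MODEL`) is an assumption for illustration, only "unsloth/llama-3-8b-bnb-4bit" comes from this PR:

```python
# Toggle between a full-precision model and a 4-bit quantized one (assumed names).
LOW_VRAM_MODE = True  # Option to switch to a model that uses less VRAM

DEFAULT_MODEL = "meta-llama/Meta-Llama-3-8B"    # hypothetical full-precision model id
LOW_VRAM_MODEL = "unsloth/llama-3-8b-bnb-4bit"  # 4-bit bitsandbytes quantized variant

# Pick the model id based on the VRAM setting
MODEL_NAME = LOW_VRAM_MODEL if LOW_VRAM_MODE else DEFAULT_MODEL
print(MODEL_NAME)  # → unsloth/llama-3-8b-bnb-4bit
```

The 4-bit variant trades some precision for a much smaller memory footprint, so it can run on GPUs where the full-precision weights would not fit.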