Closed GolemXlV closed 9 months ago
The setup looks okay, and the fact that it failed on layer 66 suggests it's quantizing alright. It does look like the job was corrupted, though, since it's failing to load the hidden state checkpoint. My guess would be that there's an input_states.safetensors file in the work directory that has a size of zero, or something along those lines. Hard to say why this happened, though. Maybe you're low on disk space or system memory?
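A quick way to test that guess is to list the zero-byte files in the work directory and check the free disk space. This is only a minimal sketch: the helper names `find_empty_files` and `free_disk_gib` are made up for illustration and are not part of the quantizer.

```python
import shutil
from pathlib import Path


def find_empty_files(work_dir):
    """Return the sorted names of zero-byte regular files directly inside work_dir."""
    return sorted(
        p.name
        for p in Path(work_dir).iterdir()
        if p.is_file() and p.stat().st_size == 0
    )


def free_disk_gib(path):
    """Return the free space, in GiB, on the filesystem that holds path."""
    return shutil.disk_usage(path).free / 2**30
```

Calling `find_empty_files` on the work directory and seeing `input_states.safetensors` in the result would be consistent with a corrupted checkpoint. A low value from `free_disk_gib` would point to disk space as a possible cause, though it does not rule out others.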
Sadly, if that is the case, I'm not sure there's a way to recover the job. It looks like you might need to start over. You should be able to save the measurement.json file from the work directory, though. Pass it to the quantizer with -m along with an empty work directory (-o) and you can skip the measurement step at least.
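The salvage step could be scripted roughly as follows. It is a sketch under assumptions: `salvage_measurement` is a hypothetical helper, the destination paths are up to you, and the JSON parse is only a sanity check that the file was not truncated, not a validation of its contents. It copies measurement.json out of the old work directory and prepares an empty one. Running the quantizer with -m and -o is left to you.

```python
import json
import shutil
from pathlib import Path


def salvage_measurement(old_work_dir, saved_path, new_work_dir):
    """Copy measurement.json to saved_path and create an empty new_work_dir.

    Returns (saved_path, new_work_dir) as Path objects.
    """
    src = Path(old_work_dir) / "measurement.json"
    if not src.is_file() or src.stat().st_size == 0:
        raise FileNotFoundError(f"no usable measurement.json in {old_work_dir}")

    # Sanity check only: a truncated file fails to parse and raises ValueError.
    with src.open("r", encoding="utf-8") as f:
        json.load(f)

    fresh = Path(new_work_dir)
    fresh.mkdir(parents=True, exist_ok=True)
    if any(fresh.iterdir()):
        raise FileExistsError(f"{new_work_dir} is not empty")

    saved = Path(saved_path)
    saved.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, saved)  # keep the copy outside the old work directory
    return saved, fresh
```

The returned paths are the ones to pass as -m (the saved measurement file) and -o (the empty work directory).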
Yeah, you're right, thanks.
1638488 models/temp/exl2/cal_data.safetensors
0 models/temp/exl2/input_states.safetensors
8860825 models/temp/exl2/job.json
8221517 models/temp/exl2/measurement.json
36864 models/temp/exl2/out_tensor
I passed in measurement.json and it finally finished successfully. I'm not sure what really happened, but it looks like the problem was on my side.
Awesome work, by the way!
Hi. When trying to convert Llama 2 70B with a custom calibration dataset on an RTX 3090 (24 GB), the conversion fails with an error:
I tried re-downloading the model, but it didn't help =( The model folder structure: