turboderp / exllamav2


MemoryError python convert.py #394

Closed: kisimoff closed this issue 6 months ago

kisimoff commented 6 months ago

System: Ubuntu 22, RTX 3090, 64 GB RAM.

Tried with both CUDA 12 and CUDA 11, in both the global shell and a conda env; same result each time. Requirements are installed. Also tried different models.

python convert.py -i /home/vincent/Develop/exllamav2/models/NeuralExperiment-7b-MagicCoder-v7.5/ -o /home/vincent/Develop/exllamav2/tempQuantization/ -cf /home/vincent/Develop/exllamav2/quantizedModel/ -b 5.0

Result:

Traceback (most recent call last):
  File "convert.py", line 65, in <module>
    config.prepare()
  File "/home/vincent/Develop/exllamav2/exllamav2/config.py", line 219, in prepare
    f = STFile.open(st_file, fast = self.fasttensors, keymap = self.arch.keymap)
  File "/home/vincent/Develop/exllamav2/exllamav2/fasttensors.py", line 114, in open
    return STFile(filename, fast, keymap)
  File "/home/vincent/Develop/exllamav2/exllamav2/fasttensors.py", line 67, in __init__
    self.read_dict()
  File "/home/vincent/Develop/exllamav2/exllamav2/fasttensors.py", line 127, in read_dict
    header_json = fp.read(header_size)
MemoryError
turboderp commented 6 months ago

This usually happens if the .safetensors file is corrupt. Can you verify that you downloaded it correctly?
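For reference, a .safetensors file begins with an 8-byte little-endian integer giving the length of a JSON header, and the traceback above fails exactly where that header is read: if the file is corrupt (or is an un-fetched git-lfs pointer), those first bytes decode to an absurdly large size, and fp.read(header_size) tries to allocate it, hence the MemoryError. Here is a minimal standalone sketch (not part of exllamav2; the check_safetensors name is made up) that performs the same sanity check:

import json
import os
import struct
import sys

def check_safetensors(path):
    # First 8 bytes of a safetensors file: little-endian u64 header length.
    size = os.path.getsize(path)
    with open(path, "rb") as fp:
        first = fp.read(8)
        if len(first) < 8:
            print(f"{path}: only {size} bytes; not a safetensors file")
            return False
        (header_size,) = struct.unpack("<Q", first)
        if 8 + header_size > size:
            # A git-lfs pointer is a tiny text file, so its first bytes
            # decode to a header length far beyond the actual file size.
            print(f"{path}: claimed header size {header_size} exceeds "
                  f"file size {size}; likely corrupt or an un-fetched "
                  f"git-lfs pointer")
            return False
        header = json.loads(fp.read(header_size))  # tensor metadata
        print(f"{path}: looks OK ({len(header)} header entries)")
        return True

if __name__ == "__main__":
    check_safetensors(sys.argv[1])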

kisimoff commented 6 months ago

Ah, you're right: git lfs wasn't properly installed.
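
(For anyone hitting the same symptom: an un-fetched git-lfs pointer is a small text file starting with "version https://git-lfs.github.com/spec/v1", which the sanity check above flags immediately. Installing git-lfs and running git lfs pull inside the model repo replaces the pointer files with the real weights.)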