turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License

Error When Converting Safetensor to exl2 #152

Closed · Noobville1345 closed this 8 months ago

Noobville1345 commented 10 months ago

Hi, I've been working on converting Euryale 1.4 to 5.15 bpw, and it was on about layer 79 when it errored out on me. This is my first time doing a conversion, so I'd appreciate any help.

For reference, I was using 0000.parquet from https://huggingface.co/datasets/EleutherAI/the_pile_deduplicated/tree/refs%2Fconvert%2Fparquet/default/train as the calibration dataset.

I don't quite understand why it would break in the middle of the conversion rather than at the end, but here's the terminal log:


```
 -- Last 50 tokens of dataset:
    "ide.\n\nTo reach Bude's other beaches requires either a car or a hike along the coast path. Three miles south of town is **Widemouth Bay** (pronounced 'widmouth'), a broad, sand"
 -- Token embeddings again...
Traceback (most recent call last):
  File "E:\Kobolds\AI\exllamav2\convert.py", line 273, in <module>
    embeddings(job, save_job, model)
  File "E:\Kobolds\AI\exllamav2\conversion\quantize.py", line 49, in embeddings
    save_file(embeddings_dict, os.path.join(job["out_dir"], "input_states.safetensors"))
  File "c:\python311\Lib\site-packages\safetensors\torch.py", line 281, in save_file
    serialize_file(_flatten(tensors), filename, metadata=metadata)
                   ^^^^^^^^^^^^^^^^^
  File "c:\python311\Lib\site-packages\safetensors\torch.py", line 475, in _flatten
    return {
           ^
  File "c:\python311\Lib\site-packages\safetensors\torch.py", line 479, in <dictcomp>
    "data": _tobytes(v, k),
            ^^^^^^^^^^^^^^
  File "c:\python311\Lib\site-packages\safetensors\torch.py", line 421, in _tobytes
    data = np.ctypeslib.as_array(newptr, (total_bytes,))  # no internal copy
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\*\AppData\Roaming\Python\Python311\site-packages\numpy\ctypeslib.py", line 521, in as_array
    p_arr_type = ctypes.POINTER(_ctype_ndarray(obj._type_, shape))
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\*\AppData\Roaming\Python\Python311\site-packages\numpy\ctypeslib.py", line 354, in _ctype_ndarray
    element_type = dim * element_type
                   ~~~~^~~~~~~~~~~~~~
ValueError: Array length must be >= 0, not -1879048192
```
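
For context, the negative length in the ValueError is consistent with a signed 32-bit overflow: -1879048192 is what a byte count of 2,415,919,104 (exactly 2.25 GiB) becomes when reinterpreted as a 32-bit signed integer, which suggests the buffer being serialized exceeded 2^31 bytes somewhere along the ctypes path. A minimal sketch of the wraparound arithmetic (the byte count here is back-calculated from the error message, not taken from the log):

```python
# Reinterpret an unsigned byte count as a two's-complement signed 32-bit integer.
def as_int32(n: int) -> int:
    return (n + 2**31) % 2**32 - 2**31

total_bytes = 2_415_919_104  # hypothetical buffer size just past 2**31 (~2.25 GiB)
print(as_int32(total_bytes))  # -1879048192, matching the ValueError above
```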
fgdfgfthgr-fox commented 9 months ago

Similar issue here: I was using a 4090 to convert yi-34b to exl2. Same error, but it happened at the lm_head (linear) layer.

fgdfgfthgr-fox commented 9 months ago

I solved it by redoing the quant (deleting everything in the output folder first) without specifying the -l and -ml arguments.
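
For anyone following along, a rerun without those flags would look something like the sketch below. The paths are placeholders and the -i/-o/-c/-b flags are assumptions about convert.py's interface; check python convert.py -h for the current options.

```
python convert.py -i /path/to/Euryale-1.4 -o /path/to/workdir -c 0000.parquet -b 5.15
```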

turboderp commented 8 months ago

(The quant procedure has changed a lot lately, so this is probably no longer relevant.)