LLukas22 / llm-rs-python

Unofficial python bindings for the rust llm library. 🐍❤️🦀
MIT License

Will you please guide how to run the conversion script? #36

Open AayushSameerShah opened 10 months ago

AayushSameerShah commented 10 months ago

Hi 👋🏻 I'm coming from this GGML conversion script and the issue you commented on: https://github.com/ggerganov/ggml/issues/280#issuecomment-1606901828

Now...

I have a fine-tuned model from Hugging Face called "NumbersStation/nsql-350M". It is basically a CodeGen model, so I need to convert it to a GPT-J model before I can convert it to GGML.

Conversion to GPT-J

I used this script to convert the CodeGen model to GPT-J, which worked correctly: https://gist.github.com/moyix/7896575befbe1b99162ccfec8d135566

Then...

After converting to GPT-J, I tried to use the GGML script but got this error:

KeyError                                  Traceback (most recent call last)
Cell In[42], line 2
      1 for key in encoder:
----> 2     text = bytearray([byte_decoder[c] for c in key])
      3     fout.write(struct.pack("i", len(text)))
      4     fout.write(text)

Cell In[42], line 2, in <listcomp>(.0)
      1 for key in encoder:
----> 2     text = bytearray([byte_decoder[c] for c in key])
      3     fout.write(struct.pack("i", len(text)))
      4     fout.write(text)

KeyError: '\t'
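
For reference, a `KeyError: '\t'` on this line is what one would expect if the vocabulary contains a token with a raw tab character. The GPT-2 style `byte_decoder` table only has printable stand-in characters as keys (byte 9 is stored as `ĉ`, not as `'\t'`). Whether that is the actual cause for this tokenizer is an assumption. The sketch below rebuilds the table and shows a hypothetical helper, `token_to_bytes`, that falls back to UTF-8 for such tokens; it is not part of the GGML script.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2 style byte -> printable-character table."""
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            # Non-printable bytes get a stand-in character starting at U+0100.
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


byte_encoder = bytes_to_unicode()
byte_decoder = {ch: b for b, ch in byte_encoder.items()}


def token_to_bytes(token):
    """Decode a vocab entry, falling back to UTF-8 for unmapped characters.

    Hypothetical helper for illustration only.
    """
    if all(ch in byte_decoder for ch in token):
        return bytes(byte_decoder[ch] for ch in token)
    # Assumption: a token holding raw characters such as '\t' is an added
    # token stored as plain text, so its UTF-8 bytes are written as-is.
    return token.encode("utf-8")
```

With a helper like this, the loop body would become `text = token_to_bytes(key)`, leaving the two `fout.write` calls unchanged.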

So, I came to this repo

Where I found this: https://github.com/LLukas22/llm-rs-python/blob/main/llm_rs/convert/models/gptj.py

But... I can't figure out how to convert my GPT-J model to GGML.

Will you please help, @LLukas22? Thanks 🙏🏻

LLukas22 commented 10 months ago

I'm pretty sure that either your model isn't a fully compatible GPT-J model or there are differences in the tokenizer. Have you tried loading your converted GPT-J model with the GPT-J implementation in transformers, and did it work? Creating conversion scripts is actually very easy, as you just have to map the tensor names to the tensor names ggml expects. Maybe upload the GPT-J converted model.
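
As a rough illustration of the name mapping, here is a minimal sketch. The left-hand patterns follow the tensor naming I would expect in a Hugging Face GPT-J checkpoint; the right-hand target names are invented for the example and are not the names ggml or this repo actually use. `rename_tensor` and `remap_state_dict` are hypothetical helpers.

```python
import re

# Hypothetical rename rules (pattern -> replacement). The target names are
# made up for illustration; a real table has to be taken from the loader
# that will read the converted file.
RENAME_RULES = [
    (r"^transformer\.wte\.", "tok_embeddings."),
    (r"^transformer\.h\.(\d+)\.attn\.q_proj\.", r"layers.\1.attention.wq."),
    (r"^transformer\.ln_f\.", "norm."),
]


def rename_tensor(name, rules=RENAME_RULES):
    """Return the target name for a source tensor name (first matching rule wins)."""
    for pattern, replacement in rules:
        new_name, count = re.subn(pattern, replacement, name)
        if count:
            return new_name
    # Names without a rule are passed through unchanged.
    return name


def remap_state_dict(state_dict, rules=RENAME_RULES):
    """Build a new dict keyed by target names, refusing silent collisions."""
    remapped = {}
    for name, tensor in state_dict.items():
        target = rename_tensor(name, rules)
        if target in remapped:
            raise ValueError(f"two tensors map to {target!r}")
        remapped[target] = tensor
    return remapped
```

Printing the keys of the loaded checkpoint next to their remapped names is a quick way to spot tensors that no rule covers.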