NolanoOrg / cformers

SoTA Transformers with C-backend for fast inference on your CPU.
MIT License

added gpt2 #8

Closed kamalojasv181 closed 1 year ago

kamalojasv181 commented 1 year ago

@Ayushk4

kamalojasv181 commented 1 year ago

All changes done @Ayushk4

kamalojasv181 commented 1 year ago

I used the same script as GPT-J. I was operating under the assumption that once a model is ported to the ggml format, it could be run with one unified inference script. Now that I've checked main.cpp, I see there is a lot of model-specific hard-coded logic. I will look into how to adapt it to GPT-2. Thanks
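For context, the per-model differences largely come down to the hyperparameters written into the ggml file header and the tensor layout the C++ loader expects. Below is a minimal sketch of writing and reading such a header in Python; the field names, their order, and the magic value here are illustrative assumptions, not the actual cformers/ggml on-disk format.

```python
import struct
import io

# Hypothetical GPT-2 (124M) hyperparameters. The real ggml header in
# cformers may order or name these differently -- this only illustrates
# why each model's loader hard-codes which fields it reads and in what order.
hparams = {
    "n_vocab": 50257,
    "n_ctx": 1024,
    "n_embd": 768,
    "n_head": 12,
    "n_layer": 12,
}

FIELDS = ("n_vocab", "n_ctx", "n_embd", "n_head", "n_layer")
MAGIC = 0x67676D6C  # the bytes of "ggml"; assumed, check the actual loader

def write_header(f, hparams):
    # Write the magic number, then each hyperparameter as a 32-bit int,
    # in a fixed order that the reader must mirror exactly.
    f.write(struct.pack("i", MAGIC))
    for key in FIELDS:
        f.write(struct.pack("i", hparams[key]))

def read_header(f):
    # Read back the magic and the same fixed sequence of 32-bit ints.
    magic, = struct.unpack("i", f.read(4))
    assert magic == MAGIC, "not a ggml-style file"
    return {k: struct.unpack("i", f.read(4))[0] for k in FIELDS}

buf = io.BytesIO()
write_header(buf, hparams)
buf.seek(0)
print(read_header(buf))
```

Because the reader and writer must agree field-by-field, a conversion script copied from GPT-J silently misreads a GPT-2 file unless both sides are updated together.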

kamalojasv181 commented 1 year ago

@Ayushk4 is the quantization file correct? It does run and returns a model. I just took the GPT-J quantize file and updated the hyperparameters. Is there anything else required?
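Beyond updating the hyperparameters, the core of the quantize step is per-block scaling of weights. A toy sketch of symmetric 4-bit block quantization in Python follows; it illustrates the idea behind ggml-style Q4 quantization but is not the actual cformers/ggml block layout (block size, packing, and rounding there may differ).

```python
# Toy symmetric 4-bit quantization over blocks of 32 weights.
# Each block stores one float scale plus small integers in [-7, 7].
# Illustrative only -- not the real ggml Q4 on-disk format.

QK = 32  # assumed block size

def quantize_block(block):
    # One scale per block, chosen so the largest weight maps to +/-7.
    amax = max(abs(x) for x in block)
    scale = amax / 7.0 if amax else 1.0
    # Map each weight to the nearest integer in [-7, 7].
    q = [max(-7, min(7, round(x / scale))) for x in block]
    return scale, q

def dequantize_block(scale, q):
    # Reconstruct approximate weights from the scale and integers.
    return [scale * v for v in q]

weights = [0.1 * ((i % 16) - 8) for i in range(QK)]
scale, q = quantize_block(weights)
approx = dequantize_block(scale, q)
max_err = max(abs(a - b) for a, b in zip(weights, approx))
print(f"max reconstruction error: {max_err:.4f}")
```

The sanity check for a ported quantize script is exactly this round trip: the quantized model should load and produce outputs close to the fp16/fp32 original, not merely run without crashing.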

Ayushk4 commented 1 year ago

@kamalojasv181 I have been getting notifications/mail that you commented on this PR, but I am unable to see those comments here.

Could you send those again?

kamalojasv181 commented 1 year ago

Hey @Ayushk4 I think it is done. I also resolved all the merge conflicts. Please give me your thoughts before we can wrap this up. Thanks