Open Red-Giuliano opened 1 year ago
Hi, can you tell me how you plan to use a GPT model trained with ColossalAI together with huggingface/transformers? It would also help if you pointed out which example you used as your implementation reference.
Hi feifeibear,
Thanks so much for your reply. The code I used to train the model is adapted from the /language/gpt/ example. I created a smaller version of the gpt2_vanilla configuration because my task did not require a model quite that large.
Now I have the model.pt file that I saved. When I try to load it with the transformers library, however, I run into errors (which makes sense, since the GPT model is imported from the titans module rather than from transformers). I'd love to use this model with the huggingface/transformers library so that I can take advantage of the functionality within that ecosystem.
From the research I've done, it seems that the transformers library expects a state dict with specific parameter names for each layer, so I'm looking into whether there is any way to resolve that discrepancy. I know the library is supported at some level because of this blog post:
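One common workaround for this kind of discrepancy is to rename the checkpoint's parameter keys to the names transformers expects before calling `load_state_dict`. The sketch below is a minimal, hypothetical example: the titans-style source prefixes in `PREFIX_MAP` (e.g. `blocks`, `norm`, `head`) are assumptions, not the actual names in your model.pt, so you would first inspect `torch.load("model.pt").keys()` and adjust the mapping to match.

```python
# Hypothetical sketch: rename titans-style GPT parameter names to the
# names transformers' GPT2LMHeadModel expects (e.g. transformer.h.0...).
# The source prefixes below are ASSUMPTIONS -- inspect your own
# checkpoint's keys and adjust PREFIX_MAP accordingly.

PREFIX_MAP = {
    "embed.word_embeddings": "transformer.wte",   # token embeddings (assumed name)
    "embed.position_embeddings": "transformer.wpe",  # position embeddings (assumed name)
    "blocks": "transformer.h",                    # transformer layers (assumed name)
    "norm": "transformer.ln_f",                   # final layer norm (assumed name)
    "head": "lm_head",                            # output projection (assumed name)
}

def remap_key(key: str) -> str:
    """Translate one source parameter name to its HF-style equivalent."""
    for src, dst in PREFIX_MAP.items():
        if key == src or key.startswith(src + "."):
            return dst + key[len(src):]
    return key  # leave unmatched keys unchanged

def remap_state_dict(state_dict: dict) -> dict:
    """Return a new state dict whose keys use HF-style parameter names."""
    return {remap_key(k): v for k, v in state_dict.items()}
```

With a corrected mapping, the idea would be to build a `GPT2Config` matching your smaller architecture, instantiate `GPT2LMHeadModel(config)`, and call `model.load_state_dict(remap_state_dict(checkpoint))` — checking the reported missing/unexpected keys to refine the mapping.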
But I would love some more advice for my use case. Thanks so much once again for your time and help!
Describe the feature
Hi all,
I'm trying to use a GPT model I trained with ColossalAI for inference via huggingface/transformers, but it isn't possible to load it as a Hugging Face model, since the checkpoint comes from a plain PyTorch implementation rather than from transformers. How can I go about loading the model I trained using the huggingface/transformers library?
Thanks so much for your help.
Best, Red