lightonai / RITA

RITA is a family of autoregressive protein models, developed by LightOn in collaboration with the OATML group at Oxford and the Debora Marks Lab at Harvard.
MIT License

Unable to run Example Script #3

Closed · zanussbaum closed this issue 2 years ago

zanussbaum commented 2 years ago

Unable to run a script similar to the example:

>>> from transformers import pipeline
>>> from transformers import AutoModel, AutoModelForCausalLM, AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("lightonai/RITA_s")
>>> model = AutoModelForCausalLM.from_pretrained("lightonai/RITA_s", trust_remote_code=True)
Explicitly passing a `revision` is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
>>> rita_gen = pipeline('text-generation', model=model, tokenizer=tokenizer)
The model 'RITAModelForCausalLM' is not supported for text-generation. Supported models are ['XGLMForCausalLM', 'PLBartForCausalLM', 'QDQBertLMHeadModel', 'TrOCRForCausalLM', 'GPTJForCausalLM', 'RemBertForCausalLM', 'RoFormerForCausalLM', 'BigBirdPegasusForCausalLM', 'GPTNeoForCausalLM', 'BigBirdForCausalLM', 'CamembertForCausalLM', 'XLMRobertaXLForCausalLM', 'XLMRobertaForCausalLM', 'RobertaForCausalLM', 'BertLMHeadModel', 'OpenAIGPTLMHeadModel', 'GPT2LMHeadModel', 'TransfoXLLMHeadModel', 'XLNetLMHeadModel', 'XLMWithLMHeadModel', 'ElectraForCausalLM', 'CTRLLMHeadModel', 'ReformerModelWithLMHead', 'BertGenerationDecoder', 'XLMProphetNetForCausalLM', 'ProphetNetForCausalLM', 'BartForCausalLM', 'OPTForCausalLM', 'MBartForCausalLM', 'PegasusForCausalLM', 'MarianForCausalLM', 'BlenderbotForCausalLM', 'BlenderbotSmallForCausalLM', 'MegatronBertForCausalLM', 'Speech2Text2ForCausalLM', 'Data2VecTextForCausalLM'].

This is with Python 3.8, tokenizers 0.12.1, and transformers 4.19.2.

Additionally, are there more details about your prompt tuning? Curious to know how you approached it and what prompt engineering looks like for proteins as opposed to language.

DanielHesslow commented 2 years ago

The warning can be ignored. It's just that HF isn't aware our models support the text-generation pipeline. Text generation still goes ahead without issue: https://colab.research.google.com/drive/1IGbolGIpafvp0vA7qbP2BvnAEIkSCDDG?usp=sharing
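For reference, generation continues from the session in the first comment as usual. The seed sequence and sampling parameters below are just illustrative placeholders, not tuned recommendations, and the final replace simply strips any spaces the tokenizer inserts when decoding:

>>> sequences = rita_gen("MA", max_length=20, do_sample=True, top_k=950, repetition_penalty=1.2, num_return_sequences=2, eos_token_id=2)
>>> for seq in sequences:
...     print(seq["generated_text"].replace(" ", ""))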

I agree that getting a warning when everything is in order is quite confusing, though, so I'll ask the HF folks if there's a workaround.

Regarding prompt tuning, we followed the same process we typically use in NLP: take whatever dataset you want to finetune on (for us, a specific protein family), freeze all the model weights except the embeddings of the prompt-tuning tokens, and then train normally (rough sketch below). For our experiments we used k=10 prompt-tuning tokens. Unfortunately we do this in our own codebase and I won't have time to port it over to HF, though I'm sure there's a nice HF tutorial somewhere. One trick to get prompt tuning working is that you need a large learning rate, typically a couple of orders of magnitude higher than during normal training. But we didn't do anything special for proteins, just the standard process.
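Roughly, a self-contained sketch of that recipe with the HF model looks like the code below. It's untested as written here and assumes the model's forward accepts inputs_embeds like standard HF causal LMs; the class name, the example sequence, and the exact learning rate are placeholders.

import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

class PromptTunedCausalLM(nn.Module):
    """Frozen causal LM plus k trainable prompt embeddings prepended to the input."""

    def __init__(self, model_name="lightonai/RITA_s", k=10):
        super().__init__()
        self.model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)
        for p in self.model.parameters():  # freeze every base-model weight
            p.requires_grad = False
        emb_dim = self.model.get_input_embeddings().embedding_dim
        # The k prompt embeddings are the only trainable parameters.
        self.prompt = nn.Parameter(torch.randn(k, emb_dim) * 0.02)

    def forward(self, input_ids):
        tok_emb = self.model.get_input_embeddings()(input_ids)               # (B, T, D)
        prompt = self.prompt.unsqueeze(0).expand(input_ids.size(0), -1, -1)  # (B, k, D)
        logits = self.model(inputs_embeds=torch.cat([prompt, tok_emb], dim=1)).logits
        # Next-token loss over the real tokens only, skipping the k prompt positions.
        shift_logits = logits[:, self.prompt.size(0):-1]
        shift_labels = input_ids[:, 1:]
        return F.cross_entropy(shift_logits.reshape(-1, shift_logits.size(-1)),
                               shift_labels.reshape(-1))

tokenizer = AutoTokenizer.from_pretrained("lightonai/RITA_s")
pt_model = PromptTunedCausalLM(k=10)
# Prompt tuning wants a much larger learning rate than regular finetuning.
optimizer = torch.optim.AdamW([pt_model.prompt], lr=1e-2)

batch = tokenizer(["MSKGEELFTGVVPILVELDGDVNGHKF"], return_tensors="pt")
loss = pt_model(batch["input_ids"])
loss.backward()
optimizer.step()

The same learned prompt embeddings would then be prepended before sampling at generation time.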

pzhang84 commented 1 year ago

Hi @DanielHesslow, I got a `KeyError: 'rita'` when running the same code as @zanussbaum. Could you please help with this? Thanks!