voidful / TextRL

Implementation of ChatGPT RLHF (Reinforcement Learning from Human Feedback) on any generation model in Hugging Face's transformers (bloomz-176B/bloom/gpt/bart/T5/MetaICL)
MIT License
539 stars 60 forks

Reward policy agent environment is not training with Finetuned model #23

Closed harshs21 closed 1 year ago

harshs21 commented 1 year ago

I am loading my Google Flan-T5 model, fine-tuned on question answering, from my Hugging Face account:

```python
import torch
from peft import PeftModel, PeftConfig
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

peft_model_id = "harshs21/google-flan-t5-base"
config = PeftConfig.from_pretrained(peft_model_id)
model1 = AutoModelForSeq2SeqLM.from_pretrained(config.base_model_name_or_path)  # load_in_8bit=True,
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# Load the LoRA model
model1 = PeftModel.from_pretrained(model1, peft_model_id)
```

But when I construct the agent with the following code:

```python
env = MyRLEnv(model1, tokenizer, observation_input=observaton_list)  # , output_list=output_list
actor = TextRLActor(env, model1, tokenizer)
agent = actor.agent_ppo(update_interval=5, minibatch_size=256, epochs=20)
```

it gives me the following error:

```text
Traceback (most recent call last):
  in <cell line: 2>:2

  /usr/local/lib/python3.10/dist-packages/textrl/actor.py:62 in __init__

     59 │         elif 'encoder' in parents:  # t5
     60 │             transformers_model = model.encoder
     61 │         else:
  ❱  62 │             raise ValueError('model not supported')
     63 │
     64 │         if unfreeze_layer_from_past > 0:
     65 │             self.middle_model = HFModelListModule(list(transformers_model.children()))

ValueError: model not supported
```

The same thing happens when I load my fine-tuned eleutherai/pythia-1.3B model from my Hugging Face profile. Could someone tell me how to train a fine-tuned model with the RLHF policy?

voidful commented 1 year ago

It appears that the problem is that TextRL does not recognize your fine-tuned models as supported architectures.

This exception is raised when the code checks the model's architecture and doesn't find it in the pre-defined set of supported models.

https://github.com/voidful/TextRL/blob/3412399f8464a160ca5611fa526ebc207d6eedfe/textrl/actor.py#L59
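As a quick diagnostic (a sketch, assuming the `parents` list in actor.py is built from the model's top-level child-module names), you can print those names yourself. A `PeftModel` typically wraps everything under a single `base_model` child, so neither `transformer` nor `encoder` shows up:

```python
# Diagnostic sketch: list the top-level child-module names that the
# architecture check inspects. For a PEFT-wrapped model the only child is
# typically 'base_model', so the check in TextRLActor falls through to
# ValueError('model not supported').
parents = [name for name, _ in model1.named_children()]
print(parents)  # e.g. ['base_model'] for a PeftModel
```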

To make it work, you would have to extend the TextRLActor class to support the specific architecture of your models; one possible approach is sketched below.
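Here is a minimal sketch (not part of TextRL; it assumes `PeftModel.get_base_model()` returns the underlying transformers model with the LoRA layers still injected) that unwraps the PEFT wrapper before the architecture check runs:

```python
from peft import PeftModel
from textrl import TextRLActor

class PeftTextRLActor(TextRLActor):
    """Hypothetical subclass that unwraps a PeftModel so that
    TextRLActor's architecture check sees the underlying T5/GPT model."""

    def __init__(self, env, model, tokenizer, *args, **kwargs):
        if isinstance(model, PeftModel):
            # get_base_model() returns the wrapped transformers model,
            # which still contains the injected LoRA layers.
            model = model.get_base_model()
        super().__init__(env, model, tokenizer, *args, **kwargs)
```

With the names from the issue above, `actor = PeftTextRLActor(env, model1, tokenizer)` should then pass the check, since the unwrapped Flan-T5 model exposes an `encoder` child.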