huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Prompt tuning for Dolly-v2-7b model for Question and Answer not supported? #23345

Closed pratikchhapolika closed 1 year ago

pratikchhapolika commented 1 year ago

I am following this guide for prompt tuning the Dolly-v2-7b model for question answering: https://huggingface.co/docs/peft/task_guides/clm-prompt-tuning

Instead of writing a plain PyTorch training loop, I am training with the Trainer API. Also, in this repository https://huggingface.co/stevhliu/bloomz-560m_PROMPT_TUNING_CAUSAL_LM/tree/main I see two files, adapter_config.json and adapter_model.bin.

But when I save the model using the Trainer API, I do not see any config file, and the saved model is much larger than the files shown in the link above.

Is this the correct way to train, save, and load a model for prompt tuning?

Inference also takes a long time to generate and produces gibberish output.

Who can help?

@stevhliu @sgugger @lvwerra

Here is my code. The use case is:

I have a Context containing many paragraphs and a Question; the model has to answer the Question based on the Context in a professional manner. It should also classify the Question as relevant if the answer is present in the Context and irrelevant if it is not.

The code that I have written is:

import torch
from torch.utils.data import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments
from peft import PromptTuningConfig, PromptTuningInit, TaskType, get_peft_model

peft_config = PromptTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    prompt_tuning_init=PromptTuningInit.TEXT,
    num_virtual_tokens=30,
    prompt_tuning_init_text="Answer the question as truthfully as possible using and only using the provided context and if the answer is not contained within the context/text, say Irrelevant",
    tokenizer_name_or_path="dolly-v2-7b"
)
tokenizer = AutoTokenizer.from_pretrained("dolly-v2-7b")
# GPT-NeoX-style tokenizers ship without a pad token, which breaks padding="max_length"
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("dolly-v2-7b", load_in_8bit=True, device_map="auto")

model = get_peft_model(model, peft_config)
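
Note: since the base model is loaded in 8-bit, the PEFT examples also run prepare_model_for_int8_training before wrapping, and print_trainable_parameters is a quick sanity check that only the virtual prompt embeddings will be trained. A minimal sketch of that ordering (assuming a peft version that exports prepare_model_for_int8_training):

from peft import prepare_model_for_int8_training

# Prepare the 8-bit base model for stable training,
# then wrap it with the prompt-tuning adapter
model = prepare_model_for_int8_training(model)
model = get_peft_model(model, peft_config)

# Expect roughly num_virtual_tokens * hidden_size trainable parameters
model.print_trainable_parameters()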

train_data = [
    {
        "Context": "How to Link Credit Card to ICICI Bank Account Step 1: Login to ICICIBank.com using your existing internet banking credentials. Step 2: Go to the 'Service Request' section. Step 3: Visit the 'Customer Service' option. Step 4: Select the Link Accounts/ Policy option to link your credit card to the existing user ID.",
        "Question": "How to add card?",
        "Answer": "Relevant. To add your card you can follow these steps: Step 1: Login to ICICIBank.com using your existing internet banking credentials. Step 2: Go to the 'Service Request' section. Step 3: Visit the 'Customer Service' option. Step 4: Select the Link Accounts/ Policy option to link your credit card to the existing user ID."
    },
    {
        "Context": "The python programming language is used in many different fields including web development, data analysis, artificial intelligence and scientific computing. It is a high-level language that is easy to learn and has a large community of users who can provide support and advice. ",
        "Question": "What is Python used for?",
        "Answer": "Relevant. Python is used in many different fields including web development, data analysis, artificial intelligence and scientific computing."
    }
]

Define a function to map examples to inputs and targets:

def preprocess_function(examples):
    # Encode the question/context pair as the model input
    # (the original indexed examples["Question"][0], which takes the
    # first character of the string rather than the whole field)
    tokenized_examples = tokenizer(
        examples["Question"],
        examples["Context"],
        truncation=True,
        max_length=1024,
        padding="max_length"
    )
    # Encode the answer as the labels
    tokenized_examples["labels"] = tokenizer(
        examples["Answer"],
        truncation=True,
        max_length=1024,
        padding="max_length",
        return_tensors="pt")["input_ids"][0]

    return tokenized_examples
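
For causal LMs, the linked PEFT guide instead builds a single text sequence per example and masks the padding positions in the labels, rather than encoding the answer as a separate sequence. A rough sketch of that pattern, reusing the tokenizer above and the field names from the example data (the prompt wording is illustrative, not from the guide):

def preprocess_causal_lm(example, max_length=1024):
    # Flatten context, question and answer into one training sequence
    text = (
        f"Context: {example['Context']}\n"
        f"Question: {example['Question']}\n"
        f"Answer: {example['Answer']}"
    )
    tokenized = tokenizer(
        text,
        truncation=True,
        max_length=max_length,
        padding="max_length",
    )
    # For causal LM training, the labels are the input ids with padding
    # positions set to -100 so they are ignored by the loss
    tokenized["labels"] = [
        token_id if mask == 1 else -100
        for token_id, mask in zip(tokenized["input_ids"], tokenized["attention_mask"])
    ]
    return tokenized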

tokenized_train_data = [preprocess_function(example) for example in train_data]


class DemoDataset(Dataset):
    def __init__(self, data):
        self.data = data

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        sample = self.data[idx]
        # as_tensor avoids re-copying values that are already tensors,
        # such as the labels produced above with return_tensors="pt"
        item = {k: torch.as_tensor(v) for k, v in sample.items()}
        return item

dataset = DemoDataset(tokenized_train_data)
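
A quick shape sanity check on one sample (purely illustrative):

# input_ids, attention_mask and labels should each be (1024,)
print({k: v.shape for k, v in dataset[0].items()})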

training_args = TrainingArguments(
    output_dir="results",
    learning_rate=1e-5,
    per_device_train_batch_size=1,
    num_train_epochs=10,
    weight_decay=0.01,
    logging_steps=1,
    save_steps=1,
    logging_dir="logs"
)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    # data_collator=data_collator,
    tokenizer=tokenizer
)
trainer.train()

Is this the correct way to save?

trainer.save_model("dolly3b_demo_model")
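
For comparison, the PEFT docs save just the adapter with save_pretrained, which is what produces the small adapter_config.json and adapter_model.bin files mentioned above. A sketch, reusing the directory name from the call above:

# Writes only adapter_config.json and the prompt-embedding weights,
# typically a few hundred KB instead of a multi-GB full model
trainer.model.save_pretrained("dolly3b_demo_model")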

Inference

Is this the correct way to do inference?

from peft import PeftModel, PeftConfig
tokenizer = AutoTokenizer.from_pretrained("dolly-v2-3b")
model = AutoModelForCausalLM.from_pretrained("dolly3b_demo_model")
model = get_peft_model(model, peft_config)

# Define example
context = "How to Link Credit Card to ICICI Bank Account Step 1: Login to ICICIBank.com using your existing internet banking credentials. Step 2: Go to the 'Service Request' section. Step 3: Visit the 'Customer Service' option. Step 4: Select the Link Accounts/ Policy option to link your credit card to the existing user ID."
question = "How to add card?"

# Encode inputs with prompt and tokenize
inputs = [f"{context} {question}"]
inputs_encoded = tokenizer(inputs, padding=True, truncation=True, max_length=1024, return_tensors="pt")
outputs = model.generate(input_ids=inputs_encoded["input_ids"], attention_mask=inputs_encoded["attention_mask"], max_new_tokens=200,)
print(tokenizer.batch_decode(outputs.detach().cpu().numpy(), skip_special_tokens=True))
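
For comparison, the documented PEFT pattern loads the base model first and then attaches the saved adapter with PeftModel.from_pretrained. A sketch, assuming the adapter directory above was written by save_pretrained:

from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

peft_model_id = "dolly3b_demo_model"
config = PeftConfig.from_pretrained(peft_model_id)

# Load the original base model and tokenizer, then attach the adapter
base_model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
model = PeftModel.from_pretrained(base_model, peft_model_id)
model.eval()
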
amyeroberts commented 1 year ago

Hi @pratikchhapolika, thanks for raising an issue!

This is a question best placed in our forums. We try to reserve the github issues for feature requests and bug reports.

pratikchhapolika commented 1 year ago

> Hi @pratikchhapolika, thanks for raising an issue!
>
> This is a question best placed in our forums. We try to reserve the github issues for feature requests and bug reports.

Hi @amyeroberts, since I did not get any response in the forums, I thought I would ask here.

amyeroberts commented 1 year ago

@pratikchhapolika I understand; however, the GitHub issues are still reserved for feature requests and bugs, as it's not sustainable for everyone to ask here whenever there isn't a response on the forum.

Another place to ask for help with questions such as these is the Discord server. Specifically, there's an ask-for-help channel which is very active.

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.