huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

force_words_ids not working #17025

Closed ZonglinY closed 2 years ago

ZonglinY commented 2 years ago

System Info

No exception occurs.

Who can help?

No response

Information

Tasks

Reproduction

```python
from transformers import (GPT2LMHeadModel, GPT2Tokenizer, GPT2Config)

m = GPT2LMHeadModel.from_pretrained('gpt2')
t = GPT2Tokenizer.from_pretrained('gpt2')

prompt = "I drink cocacola."
input = t(prompt, return_tensors="pt")

bad_words = t("alcohol", add_prefix_space=True, add_special_tokens=False).input_ids
force_words = t("very sweet", add_prefix_space=True, add_special_tokens=False).input_ids
print("bad_words: ", bad_words)
print("force_words: ", force_words)

gen = m.generate(**input, do_sample=True, temperature=0.9, num_beams=10, top_p=1.0,
                 bad_words_ids=[bad_words], force_words_ids=[force_words], max_length=100)

gen = t.batch_decode(gen)
# Check the decoded string (gen is a list of strings, so check gen[0],
# not membership in the list itself).
if_exist_very = 'very' in gen[0]
if_exist_sweet = 'sweet' in gen[0]
print("gen: ", gen)
print("if_exist_very: ", if_exist_very)
print("if_exist_sweet: ", if_exist_sweet)
```

Expected behavior

Hi,

I tried to use generate() with force_words_ids. But it does not work. bad_words_ids seems to work though.

Here are the outputs:
gen:  ["I drink cocacola. I don't drink coca. I don't drink coca. I don't drink coca. I don't drink coca. I don't drink coca. I don't drink coca. I don't drink coca. I don't drink coca. I don't drink coca. I don't drink coca. I don't drink coca. I don't drink coca. I don't drink coca. I don't"]
if_exist_very:  False
if_exist_sweet:  False
sijunhe commented 2 years ago

I suspect this is a version issue. The constrained beam search wasn't introduced until 4.17 so if you are using an older version, that might be why it didn't work. Your code worked for me on 4.18 but not on 4.15.
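For anyone hitting this: a quick sanity check is to compare `transformers.__version__` against 4.17.0. A minimal sketch (plain tuple comparison on "major.minor.patch", not a full version parser, so pre-release suffixes are not handled):

```python
# Sketch: check whether an installed transformers version is new enough
# for constrained beam search (force_words_ids landed in v4.17.0).
def supports_force_words(version: str) -> bool:
    """True if `version` is at least 4.17.0."""
    parts = tuple(int(p) for p in version.split(".")[:3])
    return parts >= (4, 17, 0)

print(supports_force_words("4.15.0"))  # False
print(supports_force_words("4.18.0"))  # True
```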

ZonglinY commented 2 years ago

Thanks @sijunhe! I changed the version to 4.18 and it works. In addition to using force_words_ids to make sure the generation contains some specific words, I'd like the forced words to appear in one sentence of the generation, ideally in a specified order. Could you give me some advice on whether there's a parameter in the generate() function that can help me do this, or do I have to modify generate() in its source code? Thanks!

sijunhe commented 2 years ago

I'd like the forced words shown in one sentence in a generation

I think this is the current behavior. As long as you are not using the Disjunctive Constraints, all the phrases listed in force_words_ids should show up in the generation.
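For reference, force_words_ids accepts two shapes: a list of token-id lists, where every phrase must appear, or a list of lists of token-id lists for disjunctive groups, where at least one alternative per group must appear. A sketch of the two shapes, with hypothetical token ids standing in for real tokenizer output:

```python
# Hypothetical token ids standing in for tokenizer output, e.g.
# t(" very sweet", add_special_tokens=False).input_ids.
very_sweet = [845, 6029]
sweetly = [6029, 306]

# Shape 1: every listed phrase must appear in the generation.
force_words_ids = [very_sweet]

# Shape 2 (disjunctive): at least one alternative from each inner
# group must appear; here "very sweet" OR "sweetly".
force_words_ids_disjunctive = [[very_sweet, sweetly]]

print(force_words_ids)              # [[845, 6029]]
print(force_words_ids_disjunctive)  # [[[845, 6029], [6029, 306]]]
```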

at best in a specified order

I don't think the current generate() supports this yet. However, it is mentioned in the blog post that I linked above as future work, something like an OrderedConstraint that would inherit from the Constraint class.
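Until something like that exists, one workaround (not part of the transformers API, just a post-processing sketch) is to generate several candidates, e.g. with num_return_sequences, and filter them post hoc for the phrases appearing in the desired order:

```python
def phrases_in_order(text: str, phrases: list) -> bool:
    """True if every phrase occurs in `text`, in the given order."""
    pos = 0
    for phrase in phrases:
        idx = text.find(phrase, pos)
        if idx == -1:
            return False
        pos = idx + len(phrase)  # next phrase must start after this one
    return True

# Stand-in decoded candidates; in practice these would come from
# t.batch_decode(m.generate(...)).
candidates = ["It is very sweet indeed.", "Sweet, but not very."]
ordered = [c for c in candidates if phrases_in_order(c, ["very", "sweet"])]
print(ordered)  # ['It is very sweet indeed.']
```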

github-actions[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

zhenduow commented 2 years ago

Hi,

This could be a problem or not, depending on how PhrasalConstraint is implemented. I am using transformers==4.18. I observe that the forced words do not always appear in my generations. My guess is that the chance of the forced words appearing in a generation is limited by num_beams, as I find that higher num_beams gives me more generations containing the forced words.

I also notice that if a forced word is already present in the prompt (the starting text), then it will basically not be forced to be generated again. Is that right?

Can you please provide some insights?