AMontgomerie / question_generator

An NLP system for generating reading comprehension questions
MIT License
281 stars 72 forks

</s> special tokens are coming in answers #6

Closed Threepointone4 closed 2 years ago

Threepointone4 commented 3 years ago

The </s> special token is coming as part of the answers. One warning I saw which may be related to this:

UserWarning: This sequence already has </s>. In future versions, this behavior may lead to duplicated eos tokens being added.

My version: transformers==4.1.1
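
For anyone debugging this, here is a minimal sketch (not this repo's code) of where the stray </s> comes from: decode() keeps special tokens unless you pass skip_special_tokens=True. It assumes a plain T5 tokenizer, and t5-small is only a stand-in checkpoint.

```python
# Minimal sketch (assumed example, not the library's code): decode() keeps
# special tokens such as </s> unless skip_special_tokens=True is passed.
from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")  # illustrative checkpoint only
ids = tokenizer("The owl was rescued from the chimney.")["input_ids"]

# The first decode ends with the </s> eos token; the second does not.
print(tokenizer.decode(ids))
print(tokenizer.decode(ids, skip_special_tokens=True))
```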

ankitkr3 commented 3 years ago

Hi @Threepointone4 @AMontgomerie, I am also facing the same problem. Has it been solved by any chance?

Punit-Koujalgi commented 3 years ago

Don't append </s> at the end of the input; the library does that for you.

Punit-Koujalgi commented 3 years ago

I meant don't append "</s>"

Punit-Koujalgi commented 3 years ago

Don't append </s>.
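
To illustrate the advice above with a hedged sketch (again, t5-small is only a stand-in checkpoint): the T5 tokenizer appends the eos token on its own, so writing </s> into the input text is redundant, and it is what produces the warning quoted at the top of this issue on recent transformers versions.

```python
from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")  # illustrative checkpoint only

# The tokenizer already appends the eos token (</s>) for you:
ids = tokenizer("some article text")["input_ids"]
print(ids[-1] == tokenizer.eos_token_id)  # True

# Adding "</s>" by hand is redundant; on recent transformers versions it also
# triggers the "This sequence already has </s>" UserWarning quoted above.
_ = tokenizer("some article text </s>")["input_ids"]
```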

ps-innovator commented 3 years ago

Hello, I also have the same problem as @Threepointone4. As you can see in my code below, I am not appending anything to the input, @Punit-Koujalgi.

Here is my code:

```python
from questiongenerator import QuestionGenerator, print_qa

qg = QuestionGenerator()

with open('owl_rescue.txt', 'r') as a:
    article = a.read()

qa_list = qg.generate(article, num_questions=10, answer_style='all')

print_qa(qa_list)
```

@AMontgomerie is there any solution for this?

Thanks in advance! I greatly appreciate it!

Threepointone4 commented 3 years ago

@all Sorry for the late response. Just upgrade to a newer version of transformers and this will not happen. Also, in self.qg_tokenizer.decode(output[0], skip_special_tokens=True), skip_special_tokens should be set to True with the newer transformers version.

AMontgomerie commented 2 years ago

I've added skip_special_tokens=True to master now.