Closed ufownl closed 2 months ago
I've tried the 9b-it variant of Gemma 2 and found that there is an extra <end_of_turn>\n in the output as below:
  __ _  ___ _ __ ___  _ __ ___   __ _   ___ _ __  _ __
 / _` |/ _ \ '_ ` _ \| '_ ` _ \ / _` | / __| '_ \| '_ \
| (_| |  __/ | | | | | | | | | | (_| || (__| |_) | |_) |
 \__, |\___|_| |_| |_|_| |_| |_|\__,_(_)___| .__/| .__/
  __/ |                                    | |   | |
 |___/                                     |_|   |_|

tokenizer : tokenizer.spm
weights : 9b-it-sfp.sbs
compressed_weights : [no path specified]
model : 9b-it
weight_type : sfp
max_tokens : 3072
max_generated_tokens : 2048
multiturn : 1

*Usage*
Enter an instruction and press enter (%C resets conversation, %Q quits).

*Examples*
- Write an email to grandma thanking her for the cookies.
- What are some historical attractions to visit around Massachusetts?
- Compute the nth fibonacci number in javascript.
- Write a standup comedy bit about GPU programming.

> Hey man!
[ Reading prompt ] ............
Hey! What's up? 😊<end_of_turn>
>
Should I check this token during the output process and abort the generation?
The tech report mentions this is new expected behavior for the model. It seems reasonable to filter this out, yes :)
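For anyone wanting to do that filtering on the text side, here is a minimal sketch. It does not use any gemma.cpp API: `StopStringFilter`, `Feed`, and `Flush` are hypothetical names, and it assumes the decoded text arrives in arbitrary pieces, so the marker may be split across two pieces. It prints everything before `<end_of_turn>` and drops the marker and whatever follows it.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper, not part of gemma.cpp: filters a stop string such as
// "<end_of_turn>" out of text that is streamed piece by piece.
// Assumes the stop string is non-empty.
class StopStringFilter {
 public:
  explicit StopStringFilter(std::string stop) : stop_(std::move(stop)) {}

  // Returns the part of the stream that is safe to print so far.
  std::string Feed(const std::string& piece) {
    if (stopped_) return "";  // Everything after the stop string is dropped.
    pending_ += piece;

    const std::size_t pos = pending_.find(stop_);
    if (pos != std::string::npos) {
      // Stop string found: emit the text before it and stop for good.
      stopped_ = true;
      std::string out = pending_.substr(0, pos);
      pending_.clear();
      return out;
    }

    // Hold back the longest tail of pending_ that could still turn into
    // the stop string once more text arrives.
    std::size_t keep = std::min(pending_.size(), stop_.size() - 1);
    while (keep > 0 &&
           pending_.compare(pending_.size() - keep, keep, stop_, 0, keep) != 0) {
      --keep;
    }
    std::string out = pending_.substr(0, pending_.size() - keep);
    pending_.erase(0, pending_.size() - keep);
    return out;
  }

  // Call when generation ends: returns held-back text that never completed
  // the stop string.
  std::string Flush() {
    std::string out = stopped_ ? std::string() : pending_;
    pending_.clear();
    return out;
  }

  // True once the stop string has been seen; the caller can abort generation.
  bool stopped() const { return stopped_; }

 private:
  std::string stop_;
  std::string pending_;
  bool stopped_ = false;
};
```

The caller would pass each decoded piece through `Feed`, print the result, and end the turn as soon as `stopped()` returns true. If the marker is exposed as a single token id, comparing ids before decoding would be simpler than matching text.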
Closing as expected behavior :)