add_special_tokens should be False.

sieu-n commented 1 year ago

I got different results from Basaran than from TextGenerationPipeline (Basaran's were much worse). I aligned the hyperparameters that differed, but that had only a minor impact.

Digging deeper, I checked which token ids were passed to the model and noticed that a 0 is prepended as the first element when I use Basaran. This does not happen when I wrap the model with transformers.TextGenerationPipeline instead. The cause seems to be that add_special_tokens defaults to True.

sample input: hello, for llama-7b.

basaran: hello hello hello hello ...
hf: hello, and I'm glad you're here. I'm a writer, a ...
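
One way to check whether special tokens explain a difference like this is to encode the same prompt twice and compare the ids. The sketch below is only illustrative: `compare_encodings` and `toy_encode` are hypothetical names, and `toy_encode` is a stand-in that prepends id 0 to mimic the behaviour described above; it is not the LLaMA tokenizer. With Hugging Face Transformers, a real tokenizer's `encode` method accepts the same `add_special_tokens` keyword and could be passed in its place.

```python
from typing import Callable, Dict, List


def compare_encodings(encode: Callable[..., List[int]], text: str) -> Dict[str, object]:
    """Encode `text` with and without special tokens and report both results."""
    with_special = list(encode(text, add_special_tokens=True))
    without_special = list(encode(text, add_special_tokens=False))
    return {
        "with_special": with_special,
        "without_special": without_special,
        "differs": with_special != without_special,
    }


def toy_encode(text: str, add_special_tokens: bool = True) -> List[int]:
    """Hypothetical stand-in tokenizer: one id per character, optional leading 0."""
    ids = [ord(ch) for ch in text]
    # Assumption for illustration: the special token is a single id 0 at the front.
    return [0] + ids if add_special_tokens else ids


# With a real tokenizer, pass `tokenizer.encode` instead of `toy_encode`.
report = compare_encodings(toy_encode, "hello")
print(report["with_special"][:3], report["without_special"][:3], report["differs"])
```

If the two id lists differ only by a leading id, the extra token is the likely reason two wrappers around the same model diverge.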

peakji commented 1 year ago

Hi @sieu-n, LLaMA is not yet officially supported by Basaran. We are waiting for the next release of HF Transformers; see https://github.com/hyperonym/basaran/issues/57 for details.

We will investigate the reported issue, but changing add_special_tokens might not be appropriate, since some other models rely on it being true.

peakji commented 1 year ago

Basaran v0.15.3 now officially supports LLaMA: https://github.com/hyperonym/basaran/issues/57#issuecomment-1509036822
