clp-research / clembench

A Framework for the Systematic Evaluation of Chat-Optimized Language Models as Conversational Agents and an Extensible Benchmark
MIT License
22 stars 31 forks

Add pad_token_id handling to prevent excessive warnings #59

Closed: Gnurro closed this 7 months ago

Gnurro commented 7 months ago

Some Hugging Face models have no pad token ID set, so transformers warns on every generation call that it is using the tokenizer's EOS token as the pad token. This fix sets the pad token ID to the EOS token ID preemptively, which prevents the warnings. It also adds a docstring for load_config_and_tokenizer().
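
For illustration, the preemptive fallback could look roughly like the sketch below. This is not the code from this PR: `ensure_pad_token_id` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for whatever object carries `pad_token_id` and `eos_token_id` (in transformers, both tokenizers and generation configs expose attributes with these names). Taking the first entry when `eos_token_id` is a list is an assumption of the sketch.

```python
from types import SimpleNamespace


def ensure_pad_token_id(config):
    """Set config.pad_token_id to the EOS token ID if it is unset.

    Hypothetical helper, not part of clembench or transformers. It works on
    any object exposing `pad_token_id` and `eos_token_id` attributes and
    returns True only if it assigned a value.
    """
    if getattr(config, "pad_token_id", None) is not None:
        return False  # a pad token ID is already set; leave it untouched

    eos = getattr(config, "eos_token_id", None)
    if isinstance(eos, (list, tuple)):
        # Assumption: when several EOS IDs are configured, use the first one.
        eos = eos[0] if eos else None
    if eos is None:
        return False  # nothing to fall back to

    config.pad_token_id = eos
    return True


# Stand-in objects; in real use these would be loaded from a model checkpoint.
missing_pad = SimpleNamespace(pad_token_id=None, eos_token_id=2)
has_pad = SimpleNamespace(pad_token_id=0, eos_token_id=2)

ensure_pad_token_id(missing_pad)  # assigns the EOS ID as pad token ID
ensure_pad_token_id(has_pad)      # leaves the existing pad token ID alone
```

Doing this once at load time, rather than on each generation call, is what keeps the warning from being repeated.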

phisad commented 7 months ago

@sherzod-hakimov Could you check this? I am not really familiar with this.