monsieurpooh opened 2 years ago
I am trying to fix this bug by delving into the code on my end, but I can't even figure out where the code lives. The first line is "from transformers import GPTNeoForCausalLM, GPT2Tokenizer". I can't find where "GPTNeoForCausalLM" is in transformers. When I do a text search on the whole library folder it turns up empty.
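For anyone else hitting this: a plain text search can miss the class because transformers re-exports it through lazy imports in `__init__.py`. One way to find where a class is actually defined is Python's `inspect` module. A minimal sketch (demonstrated on a stdlib class so it runs anywhere; the same two calls work on `GPTNeoForCausalLM` if transformers is installed):

```python
import inspect
import json

# inspect.getsourcefile / getmodule resolve a class to the file and module
# where it is really defined, regardless of how it was re-exported.
print(inspect.getsourcefile(json.JSONDecoder))       # path ending in decoder.py
print(inspect.getmodule(json.JSONDecoder).__name__)  # json.decoder
```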
I found out how to browse the source code, but I am now confused about how dividing all the scores by a common value will change the ranking or the final result: https://github.com/huggingface/transformers/blob/87e6e4fe5c7e65cb69e70306f22de6daf16b6e14/src/transformers/generation_logits_process.py#L141
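To answer my own confusion: dividing every logit by the same temperature never changes the *ranking* (so greedy decoding is unaffected), but it does change the softmax probabilities that sampling draws from: temperatures below 1 sharpen the distribution toward the top token, temperatures above 1 flatten it. A minimal sketch in plain Python (an illustration, not the transformers code itself):

```python
import math

def softmax(logits, temperature=1.0):
    # Dividing all logits by a common temperature preserves their order,
    # but reshapes the distribution: T < 1 sharpens it, T > 1 flattens it.
    scaled = [l / temperature for l in logits]
    m = max(scaled)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
print(softmax(logits, temperature=1.0))  # moderate spread across tokens
print(softmax(logits, temperature=0.1))  # nearly all mass on the top token
```

So with sampling enabled, a low temperature makes the top token overwhelmingly likely, but never *guaranteed*.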
This is not really a bug. I found out I had to go much lower than the kind of float value one would typically expect to need. I specified 0.00000000000001 as the temperature and now the output is pretty consistent.
I would like to reopen this issue because in some situations with long prompts, even 1e-18 is not small enough to produce a totally deterministic response, and at such a small value the script has a chance of throwing an exception: "probability tensor contains either `inf`, `nan` or element < 0".
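An illustration of the failure mode (not the exact transformers code path, which may differ): dividing logits by an extremely small temperature produces enormous values, a naive `exp()` then overflows to infinity, and normalizing by an infinite sum turns the probabilities into `nan`, which is exactly the kind of tensor the sampler rejects.

```python
import math

logit, temperature = 50.0, 1e-18
scaled = logit / temperature      # enormous, but still a finite float
try:
    p = math.exp(scaled)          # exp() of such a value overflows
except OverflowError:
    p = float("inf")
norm = p / p                      # inf / inf is nan, not 1.0
print(scaled, p, math.isnan(norm))
```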
The workaround is to disable sampling (e.g. by passing `do_sample=False` to `generate()`).
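Conceptually, disabling sampling replaces the random draw with an argmax over the logits, which is the temperature-goes-to-0 limit and is fully deterministic. A toy sketch of the two modes (my own illustration, not the transformers implementation):

```python
import math
import random

def pick_token(logits, do_sample, temperature=1.0):
    if not do_sample:
        # Greedy decoding: always the highest-scoring token, deterministic.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Sampling: draw from the temperature-scaled softmax distribution.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(range(len(logits)), weights=probs)[0]

logits = [2.0, 1.0, 0.5]
print(pick_token(logits, do_sample=False))  # always 0
```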
If the temperature is set to 0.00001 or a similarly low float, the output is noticeably less chaotic than with a significantly larger temperature; however, it is still very non-deterministic and often answers questions wrong even when it would get them right the majority of the time. I suspect it would be better to have more freedom over the temperature range, where 0.00001 actually denotes an extremely low temperature with almost no variation in output, for better question-answering capability.
If anyone knows of a workaround to this please let me know