**Describe the bug**
After following the red panda fine-tuning tutorial, running the bot inference script with the resulting model produces a runtime error.
```
$ python ./inference/bot.py --model=model_ckpts/hf/
Loading model_ckpts/hf/ to cuda:0...
Welcome to OpenChatKit shell. Type /help or /? to list commands.
>>> who is allen turing?
Traceback (most recent call last):
  File "/home/ubuntu/OpenChatKit/./inference/bot.py", line 285, in <module>
    main()
  File "/home/ubuntu/OpenChatKit/./inference/bot.py", line 281, in main
    ).cmdloop()
  File "/home/ubuntu/miniconda3/envs/OpenChatKit/lib/python3.10/cmd.py", line 138, in cmdloop
    stop = self.onecmd(line)
  File "/home/ubuntu/miniconda3/envs/OpenChatKit/lib/python3.10/cmd.py", line 217, in onecmd
    return func(arg)
  File "/home/ubuntu/OpenChatKit/./inference/bot.py", line 150, in do_say
    output = self._model.do_inference(
  File "/home/ubuntu/OpenChatKit/./inference/bot.py", line 92, in do_inference
    outputs = self._model.generate(
  File "/home/ubuntu/miniconda3/envs/OpenChatKit/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/ubuntu/miniconda3/envs/OpenChatKit/lib/python3.10/site-packages/transformers/generation_utils.py", line 1326, in generate
    return self.sample(
  File "/home/ubuntu/miniconda3/envs/OpenChatKit/lib/python3.10/site-packages/transformers/generation_utils.py", line 1981, in sample
    next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
```
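For context, `torch.multinomial` raises this exact `RuntimeError` whenever the probability tensor passed to it contains `inf`, `nan`, or a negative entry, which usually means the model's logits were already non-finite before sampling. A minimal sketch, independent of OpenChatKit, that reproduces the check and shows a quick sanity test one could run on a tensor before sampling (the `all_finite` helper is hypothetical, not part of the project):

```python
import torch

# torch.multinomial validates its input distribution: any inf, nan,
# or negative entry triggers the RuntimeError seen in the traceback.
probs = torch.tensor([0.5, float("nan"), 0.25])

try:
    torch.multinomial(probs, num_samples=1)
except RuntimeError as err:
    print(f"RuntimeError: {err}")

# Hypothetical sanity check: verify a tensor is safe to sample from.
def all_finite(t: torch.Tensor) -> bool:
    return bool(torch.isfinite(t).all())

print(all_finite(probs))  # False -- sampling from this tensor will fail
```

Running the same kind of finiteness check on the fine-tuned model's output logits would narrow down whether the checkpoint itself produces `nan`/`inf` values.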
**To Reproduce**
Steps to reproduce the behavior:
```
python ./inference/bot.py --model=model_ckpts/hf/
```

**Expected behavior**
Inference runs and the bot returns a response.

**Environment**
The code is running on a Lambda Labs 8xA100 40GB SXM4 instance.