Non-determinism in evaluate.predict when using vllm

Conversation in Slack: https://allenai.slack.com/archives/C06GS4HAWJV/p1712773278326539

Similar issue: https://github.com/vllm-project/vllm/issues/966

We saw non-deterministic model responses when doing greedy decoding with evaluate/predict.py. Disabling vllm made the problem go away, which points to vllm as the source. Two potential fixes: upcast the model to float32, or upgrade vllm to a later version.
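A minimal sketch of how one might check for this and try the float32 fix. It is not the code in evaluate/predict.py. The helper names `find_nondeterministic_prompts` and `make_vllm_generate_fn` are hypothetical, as are the `max_tokens` default and the choice of three runs. The vllm wrapper assumes that `LLM(model=..., dtype="float32")` and `SamplingParams(temperature=0.0)` are how upcasting and greedy decoding are requested in the installed vllm version.

```python
from typing import Callable, Dict, List, Sequence


def find_nondeterministic_prompts(
    generate_fn: Callable[[Sequence[str]], List[str]],
    prompts: Sequence[str],
    n_runs: int = 3,
) -> Dict[int, List[str]]:
    """Call generate_fn n_runs times on the same prompts.

    Returns a dict mapping prompt index -> list of outputs (one per run)
    for every prompt whose outputs were not identical across runs.
    An empty dict means all runs agreed.
    """
    runs = [list(generate_fn(prompts)) for _ in range(n_runs)]
    mismatches: Dict[int, List[str]] = {}
    for i in range(len(prompts)):
        outputs = [run[i] for run in runs]
        if len(set(outputs)) > 1:
            mismatches[i] = outputs
    return mismatches


def make_vllm_generate_fn(model_name: str, dtype: str = "float32", max_tokens: int = 64):
    """Hypothetical wrapper: build a greedy-decoding generate function on vllm.

    dtype="float32" is the upcasting fix suggested in this issue; whether it
    removes the non-determinism is an assumption to be verified, not a guarantee.
    """
    from vllm import LLM, SamplingParams  # imported lazily so the checker works without vllm

    llm = LLM(model=model_name, dtype=dtype)
    params = SamplingParams(temperature=0.0, max_tokens=max_tokens)  # temperature 0 = greedy

    def generate(prompts: Sequence[str]) -> List[str]:
        results = llm.generate(list(prompts), params)
        return [r.outputs[0].text for r in results]

    return generate
```

To use it, pass `make_vllm_generate_fn("<model name>")` as `generate_fn` along with a handful of evaluation prompts. Any non-empty result means greedy decoding still disagrees with itself for those prompts.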

allenai/open-instruct, issue #145