Open ezhang7423 opened 1 month ago
Hi,
I'm not able to reproduce the problem. I tried running python prover/evaluate.py --data-path data/leandojo_benchmark_4/random/ --gen_ckpt_path kaiyuy/leandojo-lean4-tacgen-byt5-small --split test --num-workers 40 --num-gpus 8
and got the GPU utilization below. It seems all GPUs are running the evaluation.
Hi there! I am trying to run evaluation with multiple GPUs. I have ran everything in './scripts/minimal_example.sh'. Running on 1 GPU works perfectly, however when I pass in more than 1 GPU, all Huggingface Generator workers still only initialize on the first GPU, leaving the rest unutilized. Unfortunately, --use-vlm does not appear to be working either, as an error is thrown that the T5 model is not supported. Any support would be greatly appreciated.