Open AndreyMoskalev565 opened 8 months ago
I had the same problem and I think I got it to work by specifying the exact path to the model (something like C:/Users/User/.cache/huggingface/hub/models--microsoft--codereviewer/snapshots/094a...) for the --model_name_or_path argument. However, I did run into another problem immediately afterwards, which I have not solved yet.
False alarm. It turned out that the hang occurs only during debugging :)
Hi! I'm trying to reproduce the fine-tuning of CodeReviewer using the finetune-ref.sh script on Windows 11. However, the script hangs indefinitely when it calls multiprocessing.Pool(...).map or iterates over a torch.utils.data.DataLoader.
Note that, to work around other problems, I made the following changes to the code:
finetune-ref.sh:
"python -m torch.distributed.launch ..." replaced by "torchrun..."
run_finetune_ref.py:
"nccl" replaced by "gloo"
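For context on the "nccl" to "gloo" change: PyTorch does not ship the NCCL backend on Windows, while gloo is available there. A minimal sketch of choosing the backend by platform is below. The helper name pick_backend is hypothetical and is not part of run_finetune_ref.py; how the result is passed on is an assumption about a typical setup.

```python
import sys


def pick_backend(platform: str = sys.platform, cuda_available: bool = False) -> str:
    """Return a torch.distributed backend name for the given platform.

    Hypothetical helper for illustration; not taken from the CodeReviewer code.
    """
    # NCCL is not available on Windows, and it needs CUDA devices anyway,
    # so fall back to gloo in both cases.
    if platform == "win32" or not cuda_available:
        return "gloo"
    return "nccl"
```

The returned string could then be passed as the backend argument of torch.distributed.init_process_group, for example with cuda_available=torch.cuda.is_available().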
Could you help solve this problem?
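For anyone trying to isolate this kind of hang outside the training script, here is a minimal sketch of a Pool.map call written the way Windows expects. It assumes the cause is related to process start-up, which may not be the case here. On Windows, worker processes are started with "spawn" and re-import the main module, so the worker function has to live at module level and the pool has to be created under the `if __name__ == "__main__":` guard. The functions count_tokens and count_all are hypothetical stand-ins, not code from the repository.

```python
import multiprocessing as mp


def count_tokens(line: str) -> int:
    # Hypothetical stand-in for the per-example preprocessing work.
    return len(line.split())


def count_all(lines, processes=2):
    # The worker function is defined at module level so that spawned
    # worker processes can import it by name.
    with mp.Pool(processes=processes) as pool:
        return pool.map(count_tokens, lines)


if __name__ == "__main__":
    # Without this guard, each spawned worker would re-run the pool
    # creation when it re-imports the module, which fails on Windows.
    mp.freeze_support()
    print(count_all(["a b c", "d e"]))
```

If this standalone script runs normally but the training script still stalls, the same check applies to DataLoader: setting num_workers=0 keeps data loading in the main process and helps tell a worker start-up problem apart from something else, such as the debugger.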