Open singularity-s0 opened 2 weeks ago
This seems to be related to multiple GPUs. The issue doesn't exist if only 1 GPU is set in CUDA_VISIBLE_DEVICES.
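For anyone who wants to try the single-GPU case from inside the script, here is a minimal sketch. The helper name restrict_to_single_gpu is hypothetical and not part of any library; it only sets the environment variable, which has to happen before CUDA is initialized (setting it before importing torch is the safe choice).

```python
import os


def restrict_to_single_gpu(index: int = 0) -> str:
    """Expose only one GPU to this process (hypothetical helper, not a library API)."""
    # CUDA reads this variable when it initializes, so set it early,
    # ideally before torch is imported anywhere in the process.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return os.environ["CUDA_VISIBLE_DEVICES"]


visible = restrict_to_single_gpu(0)
print(f"CUDA_VISIBLE_DEVICES={visible}")
```

Exporting the variable in the shell before starting Python achieves the same thing and avoids any import-order concerns.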
This seems like a potential issue with the logs rather than with torch.compile
cc @MekkCyber, if you have the bandwidth, could you take a look at this?
Hi @singularity-s0, since this is a multi-GPU setting, you just need to launch the script with accelerate launch script.py
OK, thanks. Would you be kind enough to point out where torch.compile is called in the code, so that I can better analyze the logic?
System Info
Who can help?
@muellerzr @SunMa
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
Using the following test script:
When running test_torch_compile(), there will be many lines of logs showing the compilation process of torch, and in the end, there will be a summary like this:
When running test_torch_compile_hf_trainer(), however, there will be no log related to torch dynamo at all. The summary in the end will also be empty:
This indicates that the model is not being compiled at all.
Expected behavior
Setting torch_compile=True in TrainingArguments should make Trainer compile the model properly.
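A way to check this independently of the logs is to inspect the model object itself. The sketch below is a heuristic, not the Trainer's own logic: it assumes that torch.compile applied to an nn.Module returns a wrapper that keeps the original module under the _orig_mod attribute. The helper names looks_compiled and unwrap_compiled are hypothetical, and the two stand-in classes exist only so the snippet runs without torch installed.

```python
def looks_compiled(model) -> bool:
    """Heuristic: a torch.compile wrapper exposes the original module as `_orig_mod`."""
    return hasattr(model, "_orig_mod")


def unwrap_compiled(model):
    """Return the original module if `model` looks compiled, else `model` itself."""
    return getattr(model, "_orig_mod", model)


# Stand-ins used only for illustration; with torch installed, pass real modules.
class PlainModel:
    pass


class FakeCompiledWrapper:
    def __init__(self, original):
        self._orig_mod = original  # mirrors the attribute assumed above


plain = PlainModel()
wrapped = FakeCompiledWrapper(plain)

print(looks_compiled(plain))
print(looks_compiled(wrapped))
print(unwrap_compiled(wrapped) is plain)
```

In a real run, the check would be applied to whichever model object the training loop actually calls. Other wrappers (for example, data-parallel ones) may sit on top of the compiled module, so a negative result on the outermost object is not conclusive.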