milkisbad opened this issue 3 years ago
Is this similar to #441?
No in the sense that the behaviour is different (there is no mention of replicas); yes in the sense that optuna will not work with a trainer that is accelerated with DDP. I have also tried the example from that issue: there I was able to run multi-GPU with 'ddp_spawn' but not with 'ddp', and I cannot run the TFT with 'ddp_spawn' at all.
Expected behavior
I executed code (roughly the sketch below) in order to run hyperparameter optimization on multiple GPUs.
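The call looks roughly like this; it is a minimal sketch based on the pytorch-forecasting TFT tutorial, so the trial count, epoch count and trainer kwarg values are illustrative, and `train_dataloader` / `val_dataloader` are assumed to be built exactly as in that tutorial:

```python
from pytorch_forecasting.models.temporal_fusion_transformer.tuning import (
    optimize_hyperparameters,
)

# train_dataloader / val_dataloader come from the Stallion example in the
# tutorial; the only real change is asking for two GPUs with DDP.
study = optimize_hyperparameters(
    train_dataloader,
    val_dataloader,
    model_path="optuna_test",
    n_trials=100,        # illustrative value
    max_epochs=20,       # illustrative value
    trainer_kwargs=dict(
        limit_train_batches=30,
        gpus=2,              # two GPUs instead of one
        accelerator="ddp",   # 'strategy="ddp"' on newer pytorch-lightning versions
    ),
    reduce_on_plateau_patience=4,
    use_learning_rate_finder=False,
)
print(study.best_trial.params)
```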
Actual behavior
However, the result was a DDP runtime error whose traceback mentions replicas[0][p] and sizes [a, b].
The weird part is that the trainer itself works fine with 'ddp' and gpus=2, and so does optimize_hyperparameters (from the code example) with the same code but with gpus=1 instead of 2. One thing to note is that p in replicas[0][p], as well as a and b in sizes [a, b], change between runs. I do think the cause lies with optuna/pytorch_lightning, but I am still looking for help here ;)
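For comparison, a plain Trainer run along these lines trains on both GPUs without problems (again only a sketch; `tft` and the dataloaders are built as in the tutorial, and the exact kwarg values are illustrative):

```python
import pytorch_lightning as pl

# plain two-GPU DDP training, no optuna involved -- this works fine
trainer = pl.Trainer(
    max_epochs=20,
    gpus=2,
    accelerator="ddp",   # 'strategy="ddp"' on newer pytorch-lightning versions
    gradient_clip_val=0.1,
    limit_train_batches=30,
)
trainer.fit(tft, train_dataloader, val_dataloader)
```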
Code to reproduce the problem
The code is from the TFT tutorial; the only changes are that I added CUDA_VISIBLE_DEVICES and changed trainer_kwargs in optimize_hyperparameters, as sketched below.
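Concretely, the two deviations from the tutorial notebook look roughly like this (device ids and exact values are illustrative, the rest of the script follows the tutorial):

```python
import os

# 1) make both GPUs visible before CUDA is initialised (device ids illustrative)
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"

# 2) in the tutorial's optimize_hyperparameters call, switch the trainer kwargs
#    from the tutorial defaults to two GPUs with DDP
trainer_kwargs = dict(
    limit_train_batches=30,
    gpus=2,              # changed from the tutorial default
    accelerator="ddp",   # 'strategy="ddp"' on newer pytorch-lightning versions
)
```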