Closed fanq15 closed 2 years ago
Hi,
Thanks for your interest in our work!
You can easily choose your launcher in the shell file and run NOAH.
Thanks for your quick reply!
But I found that even if I use the pytorch launcher, the code still cannot run and fails with the following error:
srun: error: Unable to resolve "node0": Unknown host
srun: error: Unable to establish control machine address
srun: error: Unable to allocate resources: No error
Hi,
Thank you for the catch.
Thanks!
It seems this is because there is always an srun in the command.
For sure, srun is only for slurm.
If I change the launcher from slurm to None, what should I do about the srun?
Delete lines 26-33 directly.
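For anyone who would rather keep both paths than delete the lines, the idea can be sketched as follows. This is only a minimal illustration, not the actual NOAH shell file: the function name build_command, the script name train.py and the example srun argument are assumptions made for the example.

```python
def build_command(launcher, train_cmd, srun_args=()):
    """Return the argv list to run for a given launcher.

    Hypothetical helper: only the "slurm" launcher gets the srun prefix,
    so a single machine without slurm never calls srun at all.
    """
    launcher = str(launcher).lower()
    if launcher == "slurm":
        # srun is provided by slurm and only exists on a slurm cluster.
        return ["srun", *srun_args, *train_cmd]
    if launcher == "none":
        # Plain single-machine run: execute the training command directly.
        return list(train_cmd)
    raise ValueError(f"unsupported launcher: {launcher!r}")


if __name__ == "__main__":
    # train.py is a placeholder script name for this sketch.
    print(build_command("none", ["python", "train.py"]))
    print(build_command("slurm", ["python", "train.py"], ["--gres=gpu:1"]))
```

With the launcher set to none, the command contains no srun, which is the same effect as deleting the srun lines by hand.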
Thanks!
It works!
I suggest making single-machine training the default. After all, most researchers are not rich enough to have multiple machines for training and will be unfamiliar with the slurm code.
Nice suggestion!
We will add the single machine training code as you recommend.
Another question.
When I use the none launcher, the experiments on the different datasets are run sequentially on one GPU.
Is it possible to use multi-GPU training with the none launcher?
The training set is very small, so it seems we do not need multi-GPU training.
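If someone still wants to use several GPUs on one machine, one option is to keep each run on a single GPU and spread the per-dataset runs across the GPUs instead. The sketch below is only an assumption about how that could be done and is not part of NOAH: the helper names and the dataset names are hypothetical, and CUDA_VISIBLE_DEVICES is the standard CUDA environment variable for restricting which GPUs a process sees.

```python
import os


def assign_gpus(datasets, num_gpus):
    """Map each dataset name to a GPU index in round-robin order."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return {name: i % num_gpus for i, name in enumerate(datasets)}


def env_for_gpu(gpu_id, base_env=None):
    """Copy the environment and pin the process to one GPU."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return env


if __name__ == "__main__":
    # Placeholder dataset names, for illustration only.
    plan = assign_gpus(["dataset_a", "dataset_b", "dataset_c"], num_gpus=2)
    for name, gpu in plan.items():
        print(name, "-> GPU", gpu)
```

Each run could then be started with subprocess.Popen(cmd, env=env_for_gpu(gpu)), so that runs assigned to different GPUs proceed in parallel.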
Indeed.
Thanks!
Enjoy NOAH!
Thanks for your great work! It seems that all the experiments require slurm on multiple machines, right?