pchampio closed this issue 3 years ago
BTW, I've set `nvidia-smi -c 3`.
Is this correct?
Oh, it's because of the configs/tdnnf_e2e
that indicates:
num_jobs_initial = 2
num_jobs_final = 5
(but again, I'd like to know if `nvidia-smi -c 3` is correct)
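For context, a quick way to check what `nvidia-smi -c 3` actually did: `-c 3` sets the GPU compute mode to EXCLUSIVE_PROCESS (only one process may hold a CUDA context per GPU). A minimal sketch, assuming you have driver access on the node:

```shell
# Show the current compute mode of each GPU; `nvidia-smi -c 3`
# should report "Exclusive_Process" here.
nvidia-smi --query-gpu=index,compute_mode --format=csv

# Revert to the default (shared) compute mode if needed; requires root.
sudo nvidia-smi -c 0
```

Whether EXCLUSIVE_PROCESS is appropriate depends on whether the training scripts expect to share a GPU between jobs.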
Should I expect similar results even though I'm running on a node with fewer GPUs?
Is there a way to 'fake' being on multiple GPUs? (a wait mode?)
Hi,
Maybe you can just set num_jobs_final to 2? That way the training will never use more GPUs than are present.
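Concretely, that would mean editing the values quoted from configs/tdnnf_e2e above. A hypothetical sketch of the change (the option names are the ones mentioned in this thread):

```ini
# configs/tdnnf_e2e -- cap parallel jobs at the number of GPUs on the
# node (2 in this case), so no job ever waits for a GPU that isn't there.
num_jobs_initial = 2
num_jobs_final = 2
```

Note that reducing the number of parallel jobs changes the effective batch parallelism, so results may differ slightly from a run with num_jobs_final = 5.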
I don't use `nvidia-smi -c 3`, but I think that depends on the hardware setup.
Srikanth
Okay! Thanks for the information.
During training, pkwrap fails at around iteration 25. The node I'm training on has 2 GTX 1080Ti GPUs. Do I need to use a node with more GPUs?