bene-ges / nemo_compatible

useful things that work with NVIDIA NeMo library
Apache License 2.0

MisconfigurationException during training #16

Closed thomaschhh closed 12 months ago

thomaschhh commented 1 year ago

lightning_fabric.utilities.exceptions.MisconfigurationException: You requested gpu: [1] But your machine only has: [0]

Might this be related to your previous commit -> 27bce6d?

https://github.com/NVIDIA/NeMo/blob/08937c8e7dd2e782fac99c6c230ae0170c277700/examples/nlp/spellchecking_asr_customization/run_training.sh#L52

Changing that line to trainer.devices=[0] \ fixes this.

bene-ges commented 1 year ago

Yes, this option just lets you select a specific GPU if you want to. It is meant to be changed to match your configuration. On a multi-GPU machine you can write, for example, trainer.devices=4 and it will use four GPUs, the same as trainer.devices=[0,1,2,3].
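
For anyone hitting the same error, here is a rough sketch of how a trainer.devices value maps to GPU indices and why [1] fails on a single-GPU machine. The helper name resolve_devices and its arguments are hypothetical and written only for illustration; this is not Lightning's actual implementation, and it ignores special values such as -1 or "auto". The available count is passed in by hand here; on a real machine it could come from torch.cuda.device_count().

```python
from typing import List, Union


def resolve_devices(requested: Union[int, List[int]], available: int) -> List[int]:
    """Hypothetical helper: turn a trainer.devices-style value into GPU indices."""
    if isinstance(requested, int):
        # An integer N is read as "the first N GPUs", i.e. indices 0..N-1.
        indices = list(range(requested))
    else:
        # A list is read as explicit GPU indices.
        indices = list(requested)

    # Any index outside 0..available-1 does not exist on this machine.
    missing = [i for i in indices if i < 0 or i >= available]
    if missing:
        raise ValueError(
            f"requested gpu: {missing} but machine only has: {list(range(available))}"
        )
    return indices
```

With this sketch, resolve_devices(4, 4) and resolve_devices([0, 1, 2, 3], 4) give the same indices, resolve_devices([0], 1) works on a one-GPU machine, and resolve_devices([1], 1) raises, which mirrors the situation reported above.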