aws-neuron / aws-neuron-samples

Example code for AWS Neuron SDK developers building inference and training applications

Shell script variables confuse devices and cores #43

Closed ajayvohra2005 closed 9 months ago

ajayvohra2005 commented 9 months ago

The sample shell scripts' variables confuse devices and cores. A Trn1 instance has 16 Neuron Devices (chips), each with 2 cores, so 32 is a core count, not a device count.

This sample script, on line 31, shows:

export NEURON_NUM_DEVICES=32

I think the correct code would be:

export NEURON_NUM_CORES=32

The deprecated Neuron Megatron example script shows it correctly:

NUM_NEURONCORES=32

aws-rhsoln commented 9 months ago

Thank you for reporting the issue, and apologies for the confusion caused by the variable name. We will rename it to NUM_NEURONCORES=32. Note, however, that the variable is used only within that shell script, so it could be given any name.
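
For readers following the thread, the device/core distinction can be sketched in a few lines of shell. This is only an illustration, assuming the 16-device, 2-cores-per-device layout described in the report. NUM_NEURON_DEVICES and CORES_PER_DEVICE are hypothetical names introduced here and are not taken from the sample script. Only NUM_NEURONCORES matches the name proposed above.

```shell
#!/usr/bin/env bash
# Hypothetical variable names for illustration; the values assume the
# Trn1 layout described in the issue (16 devices, 2 cores each).
NUM_NEURON_DEVICES=16   # Neuron Devices (chips) on the instance
CORES_PER_DEVICE=2      # NeuronCores on each device

# Derive the core count instead of hard-coding 32, so the name and the
# value cannot drift apart.
NUM_NEURONCORES=$((NUM_NEURON_DEVICES * CORES_PER_DEVICE))

echo "devices=${NUM_NEURON_DEVICES} cores=${NUM_NEURONCORES}"
```

Deriving the value keeps the two quantities visibly separate, which is the point of the rename. As the reply notes, the name is local to the script, so any consistent name would work.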