Closed karolzak closed 6 years ago
Hi, while trying to set up a distributed learning job with TF I found out that the AZ_BATCHAI_NUM_GPUS variable (described here) doesn't exist; it expands to an empty string:
Part of my job.json config:
... "masterCommandLineArgs": "--job_name=worker --num_gpus=$AZ_BATCHAI_NUM_GPUS ...
Error msg:
error: argument --num_gpus: invalid int value: ''
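The error format suggests the training script parses --num_gpus with Python's argparse using type=int. Assuming that is the case, a possible stopgap until the variable is populated is to make the script tolerate an empty value. This is only a sketch: the helper names int_or_default and build_parser are hypothetical, and the fallback of 1 GPU is an assumption that should be adjusted to the actual node size.

```python
import argparse


def int_or_default(default):
    """Return an argparse type that maps an empty string to `default`.

    Hypothetical helper: lets `--num_gpus=` (empty because the
    environment variable was unset) fall back instead of failing.
    """
    def parse(value):
        value = value.strip()
        # Empty value: use the fallback. Anything else must be a valid int.
        return default if value == "" else int(value)
    return parse


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("--job_name", default="worker")
    # Assumed fallback of 1 GPU; not a value taken from Batch AI.
    parser.add_argument("--num_gpus", type=int_or_default(1), default=1)
    return parser


# Simulates the command line produced when the variable expands to nothing.
args = build_parser().parse_args(["--job_name=worker", "--num_gpus="])
print(args.num_gpus)
```

This only hides the symptom on the script side; the job still has no reliable way to learn the real GPU count from the environment.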
Either support that variable or fix the docs 😀 Thanks!
CC: @AlexanderYukhanov
Thank you for reporting the issue, working on the fix.