Only 1 GPU working while training "Recommendation"

As the title says, I'm running the recommendation training benchmark. Here is my device info from a bash session inside the Docker container:

host:/workspace/recommendation# python
Python 3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:22:34)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> torch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'torch' is not defined
>>> import torch
>>> torch.cuda.is_available()
True
>>> torch.cuda.device_count()
8
>>> torch.cuda.get_device_name(0)
'Tesla V100-SXM2-32GB'
>>>

However, only one GPU is active during the DLRM training stage (only GPU 0 is busy for the whole training run).

I also found no parameter for controlling the number of GPUs in either the run_and_time.sh or the ncf.py script. Does anyone have a suggestion for this?
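One thing that may be worth ruling out is whether the training process itself sees all 8 devices, since the interactive check above only covers the shell session. Below is a minimal sketch that could be run (or pasted near the top of the training script) to print what the process sees. The helper names `visible_gpu_ids` and `report` are made up for illustration and are not part of the repo, and I have not confirmed whether run_and_time.sh sets `CUDA_VISIBLE_DEVICES` at all.

```python
import os


def visible_gpu_ids(env=None):
    """Return the device ids listed in CUDA_VISIBLE_DEVICES.

    Returns None when the variable is unset (no restriction from this
    variable), and an empty list when it is set but empty (no devices
    exposed). Ids are kept as strings because the variable may also
    hold GPU UUIDs rather than integer indices.
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None
    return [tok.strip() for tok in raw.split(",") if tok.strip()]


def report():
    """Print what this process sees; hypothetical helper, not from the repo."""
    ids = visible_gpu_ids()
    print("CUDA_VISIBLE_DEVICES:", "unset" if ids is None else ids)
    try:
        import torch  # imported lazily so the sketch also runs without torch
    except ImportError:
        print("torch is not importable in this environment")
        return
    print("torch.cuda.device_count():", torch.cuda.device_count())


if __name__ == "__main__":
    report()
```

If this reports all 8 devices from inside the training process, the limitation is more likely in how the script places the model (a single device with no data-parallel wrapper) than in the environment, but that is a guess until someone familiar with ncf.py confirms it.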

mlcommons / training #567