Closed: longerzone closed this issue 1 year ago
Hi @longerzone, it looks like you're running the older NCF benchmark. We're on DLRM now (and are actually working on a new version of that as well). Could you try TOT DLRM instead? Feel free to open another issue if you have problems with DLRM. Thanks.
As the title says, I'm running the recommendation training benchmark. Here is my device info from the Docker shell:
However, I observed that only one GPU is active during the DLRM training stage (only GPU0 is working for the whole training stage): ![image](https://user-images.githubusercontent.com/7113279/174007967-77595cde-7c1e-4583-833e-eac0dc81eb83.png)
I couldn't find any parameter for controlling the number of GPUs in the run_and_time.sh or ncf.py scripts. Do you have any suggestions?
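One check that doesn't depend on the benchmark scripts is whether the training process can see more than one GPU at all. Below is a minimal sketch, assuming the standard `CUDA_VISIBLE_DEVICES` environment variable is what limits visibility inside the container. The helper name `visible_gpu_ids` is hypothetical and not part of run_and_time.sh or ncf.py, and this only reports visibility; it does not make the script use more GPUs.

```python
import os


def visible_gpu_ids(env=None):
    """Return the GPU ids listed in CUDA_VISIBLE_DEVICES, or None if it is unset.

    Hypothetical helper for illustration; not part of the benchmark scripts.
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        # Unset: CUDA does not restrict which devices the process can see.
        return None
    # Comma-separated list; an empty string means no devices are visible.
    return [tok.strip() for tok in raw.split(",") if tok.strip()]


if __name__ == "__main__":
    ids = visible_gpu_ids()
    if ids is None:
        print("CUDA_VISIBLE_DEVICES is not set; visibility is not restricted by it")
    else:
        print(f"{len(ids)} GPU(s) listed in CUDA_VISIBLE_DEVICES: {ids}")
```

If this reports a single id, the limit comes from the environment rather than from the training script. If it reports several (or the variable is unset) and only GPU0 is busy, the script itself is probably running on a single device.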