When running the "replicate.sh", there will be problems. When running "python tools/train_net.py --num-gpus 4 --dist-url='tcp://127.0.0.1:52133' --config-file ./configs/OWOD/t1/t1_val.yaml SOLVER. IMS_PER_BATCH 4 SOLVER.BASE_LR 0.01 OWOD.TEMPERATURE 1.5 OUTPUT_DIR "./output/t1_final" ", this problem can also occur. The question is:
And then executing "nvidia-smi", the following information is displayed:
"Unable to determine the device handle for GPU 0000:02:00.0: GPU is lost. Reboot the system to recover this GPU"
Looking forward your responding! Thanks very much!
When running the "replicate.sh", there will be problems. When running "python tools/train_net.py --num-gpus 4 --dist-url='tcp://127.0.0.1:52133' --config-file ./configs/OWOD/t1/t1_val.yaml SOLVER. IMS_PER_BATCH 4 SOLVER.BASE_LR 0.01 OWOD.TEMPERATURE 1.5 OUTPUT_DIR "./output/t1_final" ", this problem can also occur. The question is:
And then executing "nvidia-smi", the following information is displayed: "Unable to determine the device handle for GPU 0000:02:00.0: GPU is lost. Reboot the system to recover this GPU"
Looking forward your responding! Thanks very much!