thunlp / VisualDS

MIT License

Related documents missing #3

Closed tao123322 closed 2 years ago

tao123322 commented 3 years ago

When running the command sh cmds/20/motif/predcls/semi/em_E_step1.sh, the following three problems occurred:

1. File "/home/tao/anaconda3/envs/scene_graph_benchmark/lib/python3.6/site-packages/torch/distributed/distributed_c10d.py", line 211, in _check_default_pg
     "Default process group is not initialized"
   AssertionError: Default process group is not initialized

2. Traceback (most recent call last):
     File "score.py", line 4, in <module>
       l = pickle.load(open("raw_em_E.pk", "rb"))
   FileNotFoundError: [Errno 2] No such file or directory: 'raw_em_E.pk'

3. Traceback (most recent call last):
     File "cut_off.py", line 8, in <module>
       score = json.load(open("score.json", "r"))
   FileNotFoundError: [Errno 2] No such file or directory: 'score.json'

How can I solve the first problem, and where can I find the files for the second and third problems?

tao123322 commented 3 years ago

I am not running the program in distributed mode. How should I modify it?

waxnkw commented 3 years ago

Hi, thanks for the report. Problems 2 and 3 are caused by problem 1: the files they mention (raw_em_E.pk and score.json) are generated once the program runs successfully.

Problem 1 looks like a failure in the distributed launch, which may come from the environment or from something else. First check whether your machine has GPUs with ids 4 and 5, which are the ones I use in the script. If it does not, specify the GPU ids you actually have in the script.

To switch the program to a single-GPU version, open cmds/20/motif/predcls/semi/em_E_step1.sh and change CUDA_VISIBLE_DEVICES=4,5 to CUDA_VISIBLE_DEVICES=0 and --nproc_per_node=2 to --nproc_per_node=1.
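If you would rather not edit the script by hand, the two substitutions can be sketched in Python as below. This is only an illustration, not part of the repository: the helper name to_single_gpu and the sample launch line (including train.py) are hypothetical, and the real em_E_step1.sh may contain other arguments that are not shown here.

```python
import re


def to_single_gpu(script_text: str, gpu_id: int = 0) -> str:
    """Return script_text rewritten to use one GPU and one worker process.

    Hypothetical helper: it only touches the two settings named in the
    thread, CUDA_VISIBLE_DEVICES and --nproc_per_node.
    """
    # Point CUDA_VISIBLE_DEVICES at a single device, whatever ids were listed.
    text = re.sub(
        r"CUDA_VISIBLE_DEVICES=[0-9,]+",
        f"CUDA_VISIBLE_DEVICES={gpu_id}",
        script_text,
    )
    # Launch exactly one worker process.
    text = re.sub(r"--nproc_per_node=\d+", "--nproc_per_node=1", text)
    return text


# Hypothetical launch line, used only to show the effect of the rewrite.
example = (
    "CUDA_VISIBLE_DEVICES=4,5 python -m torch.distributed.launch "
    "--nproc_per_node=2 train.py"
)
print(to_single_gpu(example))
```

To apply it to the real script you would read the file, pass its contents through to_single_gpu, and write the result back. Keeping a copy of the original is a good idea in case you later want to return to the two-GPU setup.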