stevehuanghe opened this issue 4 years ago
I also faced the same problem. I tried changing the size of RelDN.prd_cls_feats.0.weight in lib\modeling_rel\reldn_heads.py from (6144, 12288) to (4096, 12288), but I can't get the same evaluation results as the paper. Did you find a solution for this issue?
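For reference, PyTorch's weight-shape convention explains the numbers being discussed: nn.Linear(in_features, out_features) stores its weight with shape (out_features, in_features). A minimal sketch of what the described edit amounts to (the variable names are illustrative, not identifiers from reldn_heads.py):

    import torch.nn as nn

    # A (6144, 12288) weight corresponds to a linear layer mapping 12288 -> 6144:
    layer_as_shipped = nn.Linear(12288, 6144)   # weight shape (6144, 12288)
    # The change discussed above corresponds to mapping 12288 -> 4096 instead:
    layer_as_patched = nn.Linear(12288, 4096)   # weight shape (4096, 12288)
    assert layer_as_patched.weight.shape == (4096, 12288)

Note that shrinking the layer this way changes the model architecture rather than recovering the trained weights, which may be why the evaluation numbers come out lower than the paper's.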
Yes, I faced the same issue. I also changed (6144, 12288) to (4096, 12288), and my SGDET results are 16.01 for R@20, 23.32 for R@50, and 29.53 for R@100. That's actually far from the paper's results.
@simonJJJ @heygrandpa @stevehuanghe @jz462 Did anyone find a solution to this, or is it a fault in the pre-trained model itself?
Hi everyone,
Sorry for the late reply. I've updated the link so that it contains a compatible VGG16 that gives results on par with the paper. You can also download it here. Please let me know if it does not work or if you have further questions.
Ji
Hi Ji, thanks for the great work! The error still exists with the updated models, but the ResNeXt model works well.
I evaluated the trained VGG16 model on the SGDET task on Visual Genome and got the following results:
R@20: 20.74
R@50: 29.36
R@100: 35.95
These results are somewhat different from those reported in the paper. Does anyone get the same results?
Another problem occurs when I enable multi-gpu-testing: during inference, an error occurs:
AssertionError: Range subprocess failed (exit code: 1).
Could you give me a recommendation for solving this problem?
Hi @cao-nv, yes, I confirm that these are the valid reproduced results. A small suggestion of mine: if you want to compare against our method, these results are definitely OK; if you plan to use our method to obtain scene graphs as features for downstream tasks, you don't have to struggle with the VGG16 backbone. ResNeXt is clearly better for your needs.
About your multi-GPU issue: you need to make sure the value of CUDA_VISIBLE_DEVICES matches the actual GPUs you have on your machine, because our code determines the GPUs by looking only at CUDA_VISIBLE_DEVICES.
Ji
Thanks @jz462. For the multi-GPU issue, I share a server with 7 working GPUs with others, so I often set the number of visible GPUs to 2 or 4. Is that OK, or must the number of visible GPUs be 7?
@cao-nv It should be OK if you do export CUDA_VISIBLE_DEVICES=<g1,g2,...>, where g1,g2,... are the indices of the GPUs you want to use; you can list as many of them as you want.
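For example, on the shared server you could expose just two of the seven GPUs and then launch the evaluation (the GPU indices here are illustrative; the command is the same invocation quoted at the end of this thread):

    export CUDA_VISIBLE_DEVICES=2,3
    python ./tools/test_net_rel.py --dataset vg --cfg configs/vg/e2e_faster_rcnn_VGG16_8_epochs_vg_v3_default_node_contrastive_loss_w_so_p_aware_margin_point2_so_weight_point5_no_spt.yaml --load_ckpt trained_models/vg_VGG16/model_step62722.pth --output_dir Outputs/vg_VGG16 --multi-gpu-testing --do_val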
I got this annoying error every time the number of visible GPUs is not 1 and multi-gpu-test is enabled. Perhaps there is a problem with the subprocess: the return code is 1, but 0 was expected.
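For context, this assertion comes from a Detectron-style launcher that shards the test set across GPUs and only checks each worker's exit code. A minimal sketch of that pattern (the function and variable names are illustrative, not the repo's exact code):

    import subprocess

    def process_in_parallel(cmds):
        # One worker command per GPU/range; each runs the single-GPU test
        # script on its own slice of the dataset.
        procs = [subprocess.Popen(cmd, shell=True) for cmd in cmds]
        for p in procs:
            ret = p.wait()
            # The parent only sees the exit code, so the real failure is in
            # the worker's own stderr/log, not in this traceback.
            assert ret == 0, 'Range subprocess failed (exit code: {})'.format(ret)

So the first thing to check is the per-range worker log; the AssertionError itself is only a symptom of whatever crashed inside the worker.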
Hi, did you solve this problem? I met the same error as you. Any suggestions?
Unfortunately, I didn't find any solution for the issue, so I just moved to another scene graph generation model.
Hi Ji, do your newly trained models at https://drive.google.com/file/d/15w0q3Nuye2ieu_aUNdTS_FNvoVzM4RMF/view use the same detection model as the previously trained models?
Dear Ji,
I ran into this runtime error when trying to evaluate the model with pretrained checkpoints:
python ./tools/test_net_rel.py --dataset vg --cfg configs/vg/e2e_faster_rcnn_VGG16_8_epochs_vg_v3_default_node_contrastive_loss_w_so_p_aware_margin_point2_so_weight_point5_no_spt.yaml --load_ckpt trained_models/vg_VGG16/model_step62722.pth --output_dir Outputs/vg_VGG16 --multi-gpu-testing --do_val
Would you please help me with this issue? Thank you very much.