zxk19981227 closed this issue 1 year ago.
Also, I tried the second yaml, but the training does not seem to work.
I tried to reproduce the results with small changes. I only updated torch to 1.7, due to the CUDA 11 requirement, and reduced the batch size to 24, due to limited GPU resources. Do these two changes influence the results greatly?
You could try pretraining on object detection first, then train only the RAF head while freezing the backbone and object detection heads. This allows a larger batch size.
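For reference, the freezing part can be done with plain PyTorch by disabling gradients on the relevant submodules. A minimal sketch, assuming hypothetical attribute names `backbone` and `det_head` (look up the actual submodule names in the fcsgg model definition):

```python
import torch

def setup_raf_only_training(model: torch.nn.Module, lr: float = 0.01):
    """Freeze backbone + detection heads so only the RAF head trains.

    `backbone` and `det_head` are assumed attribute names for illustration;
    check the real submodule names in this repo before using.
    """
    for module in (model.backbone, model.det_head):
        for p in module.parameters():
            p.requires_grad = False
        module.eval()  # also stops BatchNorm running-stat updates

    # Only hand the still-trainable (RAF head) parameters to the optimizer.
    return torch.optim.SGD(
        (p for p in model.parameters() if p.requires_grad),
        lr=lr, momentum=0.9,
    )
```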
Can the batch size improve the performance greatly? Or could you provide a Docker image that fully reproduces the results reported in the paper with CUDA 11? I have tried a larger batch size, but it gave no significant improvement.
Batch size matters, in my experience. I do not have a Docker image for this. CUDA should not be a problem if you use Anaconda or Docker. The other thing you could try is changing the number of detections kept for the RAF integral. By default it is 100; if you use fewer, you may get better results.
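Illustratively, that cap is just a top-K selection on detection scores before relations are integrated. A toy sketch of the idea (not the repo's actual code):

```python
import torch

def topk_detections(scores: torch.Tensor, boxes: torch.Tensor, k: int = 100):
    """Keep only the k highest-scoring detections (illustrative sketch).

    In the repo this cap is a config value defaulting to 100; with a lower
    k, fewer low-confidence boxes enter the RAF integral step.
    """
    k = min(k, scores.numel())
    top_scores, idx = torch.topk(scores, k)
    return top_scores, boxes[idx]
```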
Thank you, I will try these methods immediately.
I have tried the model with only the CUDA change, using the config configs/FCSGG_HRNet_W32_2xDownRAF_512x512_MS.yaml, yet the results are still not satisfactory. Since the only GPUs I can use are a 3090 and an A6000, I cannot use any CUDA version below 11.0. So could the detection number be the major influence? And can you confirm whether training with configs/FCSGG_HRNet_W32_2xDownRAF_512x512_MS.yaml fully reproduces your results?
I tried to train the model following all the requirements mentioned in the paper, without any modification, on 4 V100s. However, the results reported at the last epoch are:
SGG eval: R @ 20: 0.0368; R @ 50: 0.0535; R @ 100: 0.0669; for mode=sgdet, type=Recall(Main).
SGG eval: ng-R @ 20: 0.0510; ng-R @ 50: 0.0861; ng-R @ 100: 0.1238; for mode=sgdet, type=No Graph Constraint Recall(Main).
SGG eval: zR @ 20: 0.0089; zR @ 50: 0.0128; zR @ 100: 0.0177; for mode=sgdet, type=Zero Shot Recall.
SGG eval: ng-zR @ 20: 0.0094; ng-zR @ 50: 0.0172; ng-zR @ 100: 0.0243; for mode=sgdet, type=No Graph Constraint Zero Shot Recall.
SGG eval: mR @ 20: 0.0177; mR @ 50: 0.0266; mR @ 100: 0.0339; for mode=sgdet, type=Mean Recall.
This result was produced with fcsgg_hrnet_w32_2xdownraf_512x512_ms. So are there any tricks, mentioned in neither the paper nor the GitHub repo, that could help improve the results?
Did you try changing the number of detections used for the path integral? You could also try the provided checkpoint and test whether it gives the results reported in the paper. If not, maybe some configuration was changed that I did not notice.
From my point of view, K should have no effect on relation classification. So could you provide a newer version of the code that reproduces your results?
Trust me, it does, since it determines the number of objects used for "relation classification". Low-probability objects can also end up with high integral scores.
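To spell out why K matters: the final triplet ranking mixes object confidences with the RAF path-integral score, so if many low-confidence boxes are kept, a junk pair with a strong integral can crowd out good pairs. A toy illustration (the values and the multiplicative scoring are assumptions, not the repo's exact code):

```python
import torch

# Toy scores for 3 candidate subject-object pairs.
subj_conf = torch.tensor([0.95, 0.90, 0.05])     # detector confidences
obj_conf = torch.tensor([0.90, 0.85, 0.10])
raf_integral = torch.tensor([0.30, 0.25, 0.99])  # relation affinity integral

# Ranked by the integral alone, the low-confidence pair (index 2) comes first:
print(raf_integral.argsort(descending=True))                           # tensor([2, 0, 1])
# Combined with the object confidences, it drops to last:
print((subj_conf * obj_conf * raf_integral).argsort(descending=True))  # tensor([0, 1, 2])
# Lowering K simply prevents pairs like index 2 from being kept at all.
```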
Did you try using the provided checkpoint and evaluating it?
Did you try other configs? The one you are using is the baseline, and multi-scale helps a lot.
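For context, the `_MS` configs refer to multi-scale training, i.e. randomizing the input resolution each iteration. A hedged sketch of what that typically looks like with detectron2-style augmentation (the exact scale list in this repo's configs may differ):

```python
from detectron2.data import transforms as T

# Randomly pick a short-edge size per iteration; these particular sizes are
# illustrative, not necessarily the ones in the repo's *_MS.yaml configs.
multiscale_aug = T.ResizeShortestEdge(
    short_edge_length=(384, 448, 512, 576, 640),
    max_size=900,
    sample_style="choice",
)
```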
I do not have any newer code, and the reported results are based on this repo.
Yes, I tried modifying K, but there was no significant difference. Even the provided checkpoint does not perform as well as reported in the paper. I also tried the ResNet baseline config with the multi-scale head and asked my roommate to reproduce the results. Sadly, training does not converge with the config FCSGG-Res50-BiFPN-P2P5-MultiscaleHead-MS.yaml on two A6000 GPUs. The results are shown as follows.
Did you freeze any parameters? The box AP indicates that your parameters are essentially random, not trained.
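A quick way to check for accidentally frozen or unloaded weights is to inspect the `requires_grad` flags and rough weight statistics; a generic PyTorch sketch (assuming `model` is the built fcsgg model):

```python
import torch

def sanity_check(model: torch.nn.Module):
    """Print the frozen-parameter count and rough weight statistics.

    Freshly initialized layers tend to show near-zero means and uniformly
    small stds; a properly loaded checkpoint usually looks more varied.
    """
    frozen = [n for n, p in model.named_parameters() if not p.requires_grad]
    print(f"{len(frozen)} frozen parameter tensors", frozen[:3])
    for name, p in list(model.named_parameters())[:5]:
        print(f"{name}: mean={p.mean().item():.4f} std={p.std().item():.4f}")
```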
Of course I didn't freeze any parameters, and I didn't change any settings in the code. I also asked my classmate to reimplement it independently, and he got the same result as mine.
I tried to reproduce the result of HRNet-W32 with the config FCSGG_HRNet_W32_2xDownRAF_512x512_MS.yaml. However, the detection result is 17.6, far from the 21.6 reported in the paper. Is there any trick in training? And this is the result for relation detection:
SGG eval: R @ 20: 0.0362; R @ 50: 0.0528; R @ 100: 0.0650; for mode=sgdet, type=Recall(Main).
SGG eval: ng-R @ 20: 0.0502; ng-R @ 50: 0.0869; ng-R @ 100: 0.1241; for mode=sgdet, type=No Graph Constraint Recall(Main).
SGG eval: zR @ 20: 0.0071; zR @ 50: 0.0096; zR @ 100: 0.0163; for mode=sgdet, type=Zero Shot Recall.
SGG eval: ng-zR @ 20: 0.0075; ng-zR @ 50: 0.0142; ng-zR @ 100: 0.0235; for mode=sgdet, type=No Graph Constraint Zero Shot Recall.
SGG eval: mR @ 20: 0.0176; mR @ 50: 0.0272; mR @ 100: 0.0343; for mode=sgdet, type=Mean Recall.