Test code - GitHub Issues

Dobby114 commented 2 years ago

Thank you for sharing the code. I would like to know the exact test command for task 1 on the LSMDC dataset. After training finished, I tested with the command below, but my results are quite different from those in the paper.

Test command:

python cli.py test_path=TAPM-0/tapm-master/data/LSMDC/task1_2016/LSMDC16_anno_test_someone.csv ckpt_name=feature_names_video_images_fix_gpt_epoch_5_1og_tag_tag_model_no_gt_sos_model_name_no_gt_sos_sample_CIDEr_0.1281_epoch_24.pickle

Test results: [image: 12] https://user-images.githubusercontent.com/66785750/149152720-c3bfb673-83a2-4ee2-97f2-781a70d82b35.png

The only changes I made were setting the batch size for training, validation, and testing to 4 and, for testing, changing "dataloader = dataloaders[key]" in the evaluate_sample function to "dataloader = dataloaders['target'][key]". Without that change the code raises an error.

I suspect that something is wrong with my test command and that this is causing the wrong results. I would be very grateful if you could help!
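For illustration, the dataloader change described above amounts to looking the split up one level deeper in a nested dictionary. The sketch below is not taken from the tapm repository: the helper name pick_dataloader, its group parameter, and the toy dictionaries are hypothetical, and it only assumes that the loaders are stored either as a flat dict or nested under a 'target' key, as the error in this comment suggests.

```python
def pick_dataloader(dataloaders, key, group="target"):
    """Return the loader for `key` from a flat or a nested dict.

    Hypothetical helper for illustration only; `pick_dataloader` and
    `group` are not names from the tapm code base.
    """
    # Flat layout: dataloaders[key]
    if key in dataloaders:
        return dataloaders[key]
    # Nested layout: dataloaders[group][key]
    nested = dataloaders.get(group)
    if isinstance(nested, dict) and key in nested:
        return nested[key]
    raise KeyError(
        f"no dataloader for {key!r}; top-level keys: {list(dataloaders)}"
    )


# Toy stand-ins for real DataLoader objects.
flat = {"test": ["batch0"]}
nested = {"target": {"test": ["batch0", "batch1"]}}

print(pick_dataloader(flat, "test"))
print(pick_dataloader(nested, "test"))
```

A lookup like this fails loudly with the available keys instead of silently picking the wrong split, which makes it easier to confirm that evaluation really runs on the test loader.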

JiwanChung commented 2 years ago

Hi!

Looking at your results, the METEOR score does not seem right. Our test script is in code/script/test_task1.py; please check whether it yields reasonable results. (Note that a smaller batch size can lead to worse results, especially on small datasets such as LSMDC.)

Dobby114 commented 2 years ago

Thank you for your patient answer. I found that the results above were sentence-level scores. I changed the number of training epochs back to the original 8, retrained, and evaluated again. The results do not look right: they are far lower than those in your paper. I followed the instructions to train, and the evaluation command I used was:

python cli.py evaluate with test_path=data/LSMDC/task1/LSMDC16_annos_test_someone.csv ckpt_name=feature_names__video_images__fix_gpt_epoch_5_log_tag_tag_model_no_gt_sos_model_name_no_gt_sos_sample/CIDEr_0.1258_epoch_24.pickle

I don't know where the problem is.
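To show why sentence-level and corpus-level scores can differ on the same captions, here is a minimal sketch using a toy metric (clipped unigram precision). It is not METEOR or CIDEr and it is not the evaluation code from this repository; the function names and the two caption pairs are made up for illustration.

```python
from collections import Counter


def unigram_counts(hyp, ref):
    """Return (clipped unigram matches, hypothesis length) for one pair."""
    hyp_tokens, ref_tokens = hyp.split(), ref.split()
    ref_count = Counter(ref_tokens)
    matches = sum(
        min(count, ref_count[token])
        for token, count in Counter(hyp_tokens).items()
    )
    return matches, len(hyp_tokens)


def sentence_level(pairs):
    """Score each caption separately, then average the scores."""
    scores = []
    for hyp, ref in pairs:
        matches, length = unigram_counts(hyp, ref)
        scores.append(matches / length if length else 0.0)
    return sum(scores) / len(scores)


def corpus_level(pairs):
    """Pool the counts over all captions, then compute one score."""
    total_matches = total_length = 0
    for hyp, ref in pairs:
        matches, length = unigram_counts(hyp, ref)
        total_matches += matches
        total_length += length
    return total_matches / total_length if total_length else 0.0


# Made-up (hypothesis, reference) pairs.
pairs = [
    ("someone walks", "someone walks away"),
    ("someone looks at the old car outside", "someone opens the door"),
]

print(round(sentence_level(pairs), 4))
print(round(corpus_level(pairs), 4))
```

Averaging per-caption scores weights every caption equally, while pooling counts weights longer captions more, so the two numbers are not directly comparable with each other or with a paper that reports only one of them.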

JiwanChung commented 2 years ago

Hi!

Since the CIDEr score should reach ~14 even without TAPM, your result does seem a bit off. I will look into the issue to find possible debugging directions.

kfoekoijfi commented 2 years ago

Yes, I just pulled the code from GitHub, then trained and tested it as instructed without changing any special parameters. This model is a little complicated for me, and I don't know how to debug it to get the correct results.

JiwanChung commented 2 years ago

We previously used our own ResNeXt features for the challenge results. We assumed that the official LSMDC features would work equally well with the code. However, upon inspection they turned out to cause caption degeneration. Hence we are sharing our features, which were extracted with ResNeXt. I will add this information to the README too. (The shared file also contains new I3D RGB features, but the ResNeXt features alone should be sufficient for getting the correct scores.)

https://drive.google.com/uc?id=1dqfpX76QxVnOOThgY8-cuSG5m-0LVZZk

Dobby114 commented 2 years ago

Thank you very much for your patience and generous sharing.

JiwanChung / tapm

Test code #15