hulianyuyy / Temporal-Lift-Pooling

Temporal Lift Pooling for Continuous Sign Language Recognition (ECCV2022)

The performance of Phoenix14 on dev and test set #1

Closed sakura2233565548 closed 2 years ago

sakura2233565548 commented 2 years ago

Hi, thanks for your contribution to this community. I tested the performance with your provided code and .pth file, but the model parameters could not be loaded into the model. I fixed the temporal_pool block to match your .pth file and finally got 21.0 WER on the dev set and 22.3 on the test set. Could you check whether the .pth file is correct? Thanks a lot!

hulianyuyy commented 2 years ago

I have checked the implementation of lift pooling and found that I wrongly deleted some lines while removing unused code. I have updated line 12, line 19, lines 29-32, and line 37 in modules/tconv.py; you can redownload it for testing. However, I found another confusing problem: the accuracy of TLP seems to vary slightly across test runs (perhaps due to randomness in the training algorithm?). The dev WER ranges from 19.7-20.0 on the PHOENIX14 dataset and 18.9-19.4 on the PHOENIX14-T dataset. I will focus on fixing this later.
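For readers following along, the pooling discussed here is based on the lifting scheme (split, predict, update). A minimal, framework-free sketch of that scheme is below; the actual modules/tconv.py uses learned CNN predictor and updater networks, whereas here P and U are fixed one-tap filters chosen purely for illustration.

```python
def lift_pool_1d(x):
    """Downsample a 1-D sequence by 2 via the lifting scheme.

    Split -> Predict -> Update:
      even, odd = x[::2], x[1::2]
      d = odd - P(even)   # detail (high-frequency) component
      s = even + U(d)     # smoothed (low-frequency) component, the pooled output
    """
    even, odd = x[::2], x[1::2]
    # Predictor P: estimate each odd sample from its even neighbour.
    d = [o - e for o, e in zip(odd, even)]
    # Updater U: fold half of the detail back so `s` preserves the local mean.
    s = [e + 0.5 * di for e, di in zip(even, d)]
    return s, d
```

In Temporal Lift Pooling, the smoothed branch `s` serves as the temporally downsampled feature, while the detail branch `d` is available as a high-frequency side signal.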

hulianyuyy commented 2 years ago

I have also corrected configs/baseline.yaml, changing the number of training epochs to 80 with steps of [40, 60].

sakura2233565548 commented 2 years ago

Thanks for the corrections! I will test the new code.

sakura2233565548 commented 2 years ago

I tested on the PHOENIX14 dataset and got a WER of 21.0 on the dev set and 22.3 on the test set, even with the fixed code. I am also confused about line 37 in modules/tconv.py: it looks like dead code, and I cannot understand its purpose.

hulianyuyy commented 2 years ago

I redownloaded and tested the code from GitHub and got 19.7, 19.9, and 20.2 WER on the PHOENIX14 dataset. I also tested the weights with my original code and got 19.7, 19.8, and 19.9 WER. I'm confused about this difference. You could test with my original code and check your results.

hulianyuyy commented 2 years ago

You may also find that line 37 in modules/tconv.py works together with the code that was deleted, which is why it looks unused on its own.

sakura2233565548 commented 2 years ago

I checked the code following your suggestion, but I still get 21.0 and 22.3 WER on the dev and test sets of the PHOENIX14 dataset. I also found a bug in line 61 of seq_scripts.py: model.train() should be model.eval(), and I think the final results need to be reported under that setting. I have tested with this setting and got results worse than 21.0. Could you fix this bug and release the results for each dataset?

sakura2233565548 commented 2 years ago

Also, could you provide the temp.stm and temp2.ctm files used for evaluation? I would like to check whether my sclite setup differs from yours.
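For context, the .stm and .ctm files mentioned here are the NIST SCTK formats that sclite compares (reference vs. hypothesis). The sketch below illustrates the field layout of each line as documented for SCTK; the file names and values are made up for illustration and are not taken from this repository.

```python
def stm_line(wavefile, channel, speaker, start, end, transcript):
    # STM reference line: <file> <channel> <speaker> <begin> <end> <transcript...>
    return f"{wavefile} {channel} {speaker} {start:.2f} {end:.2f} {transcript}"

def ctm_line(wavefile, channel, start, dur, word):
    # CTM hypothesis line: <file> <channel> <begin> <duration> <word>
    return f"{wavefile} {channel} {start:.2f} {dur:.2f} {word}"
```

Comparing the two files line by line (or diffing the .stm/.ctm pairs from both setups) is a quick way to rule out an evaluation-tool mismatch before suspecting the model.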

hulianyuyy commented 2 years ago

The model.train() call in line 61 of seq_scripts.py was previously modified to test the performance on the CSL dataset, following the suggestions in this issue. That change was made after I finished the experiments on the PHOENIX14 and PHOENIX14-T datasets, so you don't need to worry about it; normally, it should be model.train(). After restoring model.train() in the code, I consistently get 19.7 WER on the PHOENIX14 dataset and 19.4 WER on the PHOENIX14-T dataset over three different testing iterations with the code downloaded from GitHub. I now wonder whether your results might be attributable to the evaluation tool or the dataset files. You can download the preprocessed files here, place them under ./preprocess/phoenix2014, and check the results. The tmp.stm and tmp2.ctm files are provided.
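The reason the train()/eval() flag changes the score at all is that layers such as BatchNorm normalise with per-batch statistics in train mode but with frozen running statistics in eval mode. A toy, framework-free sketch of that mechanism (not the repo's actual seq_scripts.py or PyTorch internals):

```python
class ToyBatchNorm:
    """Minimal stand-in for a BatchNorm-like layer's train/eval behaviour."""

    def __init__(self):
        self.running_mean = 0.0
        self.training = True

    def train(self):
        self.training = True

    def eval(self):
        self.training = False

    def forward(self, batch):
        if self.training:
            # Train mode: normalise with this batch's own mean and
            # update the running estimate (so results depend on the batch).
            mean = sum(batch) / len(batch)
            self.running_mean = 0.9 * self.running_mean + 0.1 * mean
        else:
            # Eval mode: normalise with the frozen running mean
            # (deterministic, batch-independent).
            mean = self.running_mean
        return [x - mean for x in batch]
```

This is why the same checkpoint can yield slightly different WER depending on whether the evaluation loop runs under model.train() or model.eval(), matching the discrepancy discussed in this thread.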

sakura2233565548 commented 2 years ago

Thanks for your help! I will report back once I have checked the results of my code. If model.train() is active, I need the batch size and the number of devices to better reproduce the evaluation result, because the performance may jitter more when the model is in training mode.

hulianyuyy commented 2 years ago

The batch size is set to 2 by default (in baseline.yaml). I only use one GPU during training and testing.


sakura2233565548 commented 2 years ago

I have solved my problem, and the result is correct! Thanks for your help. I will close this issue~