mpc001 / auto_avsr

Auto-AVSR: Lip-Reading Sentences Project
Apache License 2.0

Re-implementation error #23

Open jeonhuhuhu opened 8 months ago

jeonhuhuhu commented 8 months ago

Does the problem I raised earlier in another issue (https://github.com/mpc001/auto_avsr/issues/20) affect performance?

I'm re-training with the newly updated code.

One question: I'm training on 4 A100 GPUs, so I'm wondering whether it is valid to train with 8 times fewer GPUs than the 32 A100 GPUs you used.
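To make the 4-versus-32 GPU gap concrete, here is a minimal sketch of the arithmetic, assuming the per-GPU batch is the same in both setups and that the difference is made up with gradient accumulation. The helper names `accumulation_steps` and `effective_batch` are hypothetical and are not part of auto_avsr; the thread does not say how the repository handles this.

```python
import math


def accumulation_steps(reference_gpus: int, available_gpus: int) -> int:
    """Gradient-accumulation steps needed so that available_gpus * steps
    covers at least reference_gpus (same per-GPU batch assumed)."""
    if reference_gpus <= 0 or available_gpus <= 0:
        raise ValueError("GPU counts must be positive")
    return math.ceil(reference_gpus / available_gpus)


def effective_batch(per_gpu_batch: int, gpus: int, accum_steps: int = 1) -> int:
    """Samples (or frames) contributing to one optimizer step."""
    return per_gpu_batch * gpus * accum_steps


# 32 reference GPUs vs. 4 available GPUs, as in the question above.
steps = accumulation_steps(reference_gpus=32, available_gpus=4)
print(steps)

# With that many accumulation steps, both setups see the same amount of
# data per optimizer step (per-GPU batch of 1 used as a placeholder unit).
print(effective_batch(1, 4, steps) == effective_batch(1, 32))
```

If the training loop is built on PyTorch Lightning, the Trainer's accumulate_grad_batches option is one way to apply such a factor. Whether matching the effective batch size alone closes the gap is not established in this thread.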

Also, we have trained your code many times without modifying it, but we never reach the 96.6% of [vsr_trlrs3_23h_base.pth]; we only get 99.4%. I would appreciate some advice.

mpc001 commented 8 months ago

Hi @jeonhuhuhu, #20 is about training and evaluating an audio-visual model. As mentioned in #20, an error will be raised if the bug is not fixed.

Can you try fine-tuning on the full LRS3 set, starting from the model you have? You can pass either [vsr_trlrs3_23h_base.pth] or your own model to the pretrained_model_path argument for fine-tuning.