Closed by oscardoudou 1 year ago
@oscardoudou The repeated dialogues are caused by the Traditional Chinese model detecting different characters in subsequent frames, so lines that differ enough from each other get output separately.
Outside of improving the accuracy of the Traditional Chinese model, which I imagine would be pretty challenging and time consuming, you could try lowering the sim_threshold parameter to relax how similar lines need to be in order for them to get merged together.
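To illustrate why a lower threshold helps, here is a minimal sketch of similarity-based merging. It is not this project's actual implementation: `similarity`, `merge_similar_lines`, the 0-100 score scale, the use of `difflib`, and the sample frames are all assumptions made up for the example.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0-100 similarity score (assumed scale, for illustration)."""
    return SequenceMatcher(None, a, b).ratio() * 100


def merge_similar_lines(frames, sim_threshold):
    """Merge consecutive OCR lines that are at least sim_threshold similar.

    frames: list of (timestamp_seconds, text) tuples in time order.
    Returns a list of (start, end, text) tuples.
    """
    merged = []
    for ts, text in frames:
        if merged and similarity(merged[-1][2], text) >= sim_threshold:
            # Similar enough: extend the previous line instead of adding one.
            start, _, kept_text = merged[-1]
            merged[-1] = (start, ts, kept_text)
        else:
            # Too different: start a new subtitle line.
            merged.append((ts, ts, text))
    return merged


# Hypothetical per-frame OCR output: the same dialogue, with the last
# character misread in the middle frame.
frames = [
    (0.0, "今天有練習比賽"),
    (0.5, "今天有練習比寒"),
    (1.0, "今天有練習比賽"),
]

strict = merge_similar_lines(frames, sim_threshold=90)
relaxed = merge_similar_lines(frames, sim_threshold=80)
print(len(strict), len(relaxed))
```

In this toy data the one misread character puts the score between neighbouring frames at about 86, so a threshold of 90 splits the dialogue into three lines while 80 keeps it as one. The trade-off is that a threshold set too low may also merge short lines that are genuinely different.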
[first result] vs [second result]
For the former I used the language code ch, but some characters were wrongly detected, so I figured I should switch to the accurate subtitle language code. The latter is the same video with the correct language code, chinese_cht, but the timeline is messed up: I get repeated dialogues that are supposed to be one single continuous dialogue. Some characters are now detected correctly, though, e.g. 在智比赛 is now correctly detected as 練習比賽. Any idea which parameter I should tweak, or does the Traditional Chinese model have some issue? Thanks! Appreciate your work.