frank613 / CTC-based-GOP

This repo is related to the paper "A Framework for Phoneme-Level Pronunciation Assessment Using CTC" for INTERSPEECH 2024

About the My_Wav2Vec2Processor and the other related customized module #1

Open a2d8a4v opened 2 months ago

a2d8a4v commented 2 months ago

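Hi, thank you for publishing such great work! I just want to confirm whether the customized My_WavVec2CTCTokenizer is a phoneme-level tokenizer whose vocabulary contains only the phoneme inventory. See the file CTC-based-GOP/ctc-ASR-training/train_ctc.py at line 237: https://github.com/frank613/CTC-based-GOP/blob/f17b2d28a4453eb1868768c01d02d78e87f9d9f4/ctc-ASR-training/train_ctc.py#L237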

Sincerely,

frank613 commented 2 months ago

Hi, thanks for asking. Yes, it is a phoneme tokenizer. I will update the repo with the missing Python module later today.
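For readers following along, here is a minimal illustrative sketch of what a phoneme-only CTC vocabulary can look like. It is not the repo's My_WavVec2CTCTokenizer. The phoneme inventory, the special tokens, and the helper names `build_vocab`, `encode`, and `decode` are all assumptions made for the example.

```python
# Hypothetical sketch of a phoneme-level CTC vocabulary.
# This is NOT the repo's My_WavVec2CTCTokenizer; every name here is illustrative.
import json

# Assumed toy inventory (ARPAbet-style symbols); the real inventory may differ.
PHONEMES = ["AA", "AE", "AH", "B", "D", "K", "T"]
# Assumption: <pad> doubles as the CTC blank token.
SPECIAL_TOKENS = ["<pad>", "<unk>"]


def build_vocab(phonemes, special_tokens=SPECIAL_TOKENS):
    """Map each special token and phoneme to a unique integer id."""
    tokens = list(special_tokens) + sorted(set(phonemes))
    return {tok: idx for idx, tok in enumerate(tokens)}


def encode(phoneme_string, vocab):
    """Convert a space-separated phoneme string to ids; unknown symbols map to <unk>."""
    unk_id = vocab["<unk>"]
    return [vocab.get(p, unk_id) for p in phoneme_string.split()]


def decode(ids, vocab):
    """Map ids back to phoneme symbols, skipping the pad/blank token."""
    inverse = {idx: tok for tok, idx in vocab.items()}
    return " ".join(inverse[i] for i in ids if inverse[i] != "<pad>")


vocab = build_vocab(PHONEMES)
ids = encode("K AE T", vocab)
print(json.dumps(vocab))
print(ids, "->", decode(ids, vocab))
```

A dictionary like this, saved as JSON, resembles the kind of vocabulary file that a CTC tokenizer such as Hugging Face's Wav2Vec2CTCTokenizer loads. Whether the repo's customized tokenizer follows the same layout is not confirmed in this thread.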

Regards, Xinwei

On Monday, September 2, 2024 at 15:12, ヤンヤン @.***> wrote:

Hi, thank you for publishing such great work! I want to confirm that the customized My_WavVec2CTCTokenizer is a phoneme-level tokenizer whose vocabulary contains only the phoneme inventory. See the file CTC-based-GOP/ctc-ASR-training/train_ctc.py at line 237: https://github.com/frank613/CTC-based-GOP/blob/f17b2d28a4453eb1868768c01d02d78e87f9d9f4/ctc-ASR-training/train_ctc.py#L237

Sincerely,


a2d8a4v commented 2 months ago

Thank you very much! By the way, have you tried fine-tuning the CTC-based model on TIMIT or other corpora at the phoneme level? I would like to know how the choice of corpus affects the results, given differences in design or speaking style (free speech versus read-aloud).

frank613 commented 2 months ago

Hello. We have only experimented with fine-tuning wav2vec 2.0 on the train-clean-100 subset of LibriSpeech.