Open AlexandrosEmvoliadis opened 7 months ago
No, we do not include speech data in training CLAP.
On Monday, April 1, 2024, at 08:55, Alexandros Emvoliadis wrote:
Does the speech pre-training consider a speech-to-text task? Or is the model trained for speaker verification?
Thank you for the excellent work! As you mentioned, I’m a bit confused about the speech_music weights. Could you clarify what exactly “speech” refers to in this context? I would really appreciate your response!
As far as I remember, the speech data mostly had descriptive captions of the speech (speaker, accent, etc.); some captions also included the content of the speech (the actual text transcription). However, I don't think the model we trained can discriminate the content of the speech.
This is extremely helpful, thank you so much!