FunAudioLLM / CosyVoice

A multilingual large voice generation model with full-stack support for inference, training, and deployment.
https://funaudiollm.github.io/
Apache License 2.0

phoneme timestamp #269

Open CuiRobert opened 1 month ago

CuiRobert commented 1 month ago

Is there a way to return word-level timestamps for a sentence?

Example:

Input sentence: "Hello readers,welcome!"

Output:

    [
      { "word": "Hello", "start_time": 0.02, "end_time": 0.36 },
      { "word": "readers", "start_time": 0.36, "end_time": 0.855 },
      { "word": ",", "start_time": 0.855, "end_time": 1.155, "type": "mark" },
      { "word": "welcome", "start_time": 1.155, "end_time": 1.665 },
      { "word": "!", "start_time": 1.665, "end_time": 1.955 }
    ]
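For reference, the requested output shape could be modeled roughly as follows. This is only a sketch of the data structure described in the example: `WordTimestamp` and `is_contiguous` are hypothetical names chosen for illustration and are not part of CosyVoice. Times are assumed to be in seconds.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WordTimestamp:
    """One entry of the requested output (hypothetical type, not a CosyVoice API)."""
    word: str
    start_time: float            # assumed to be seconds from the start of the audio
    end_time: float              # assumed to be seconds from the start of the audio
    type: Optional[str] = None   # e.g. "mark" for punctuation, as in the example


def is_contiguous(items: List[WordTimestamp], tol: float = 1e-6) -> bool:
    """Check that each entry starts where the previous one ended and has start <= end."""
    for prev, cur in zip(items, items[1:]):
        if abs(cur.start_time - prev.end_time) > tol:
            return False
    return all(item.start_time <= item.end_time for item in items)


# The values from the example in this issue.
example = [
    WordTimestamp("Hello", 0.02, 0.36),
    WordTimestamp("readers", 0.36, 0.855),
    WordTimestamp(",", 0.855, 1.155, type="mark"),
    WordTimestamp("welcome", 1.155, 1.665),
    WordTimestamp("!", 1.665, 1.955),
]
```

In this shape, punctuation gets its own entry, and each entry begins exactly where the previous one ends, which `is_contiguous(example)` can confirm.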

aluminumbox commented 1 month ago

There are no word-level timestamps.
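Since the model does not return timestamps, one possible workaround outside CosyVoice is to estimate them after synthesis. The sketch below is a crude heuristic, not a CosyVoice feature: it assumes you already know the total duration of the generated audio and splits that duration across tokens in proportion to their character counts. The function name `estimate_word_timestamps` is hypothetical. Pauses and speaking-rate variation are not modeled, so a forced aligner would likely be needed for accurate results.

```python
import re
from typing import Dict, List


def estimate_word_timestamps(text: str, duration: float) -> List[Dict]:
    """Roughly split `duration` (seconds) across tokens by character count.

    Heuristic only: assumes every character takes the same amount of time.
    """
    # Words become one token each; every punctuation character is its own token.
    tokens = re.findall(r"\w+|[^\w\s]", text)
    total_chars = sum(len(tok) for tok in tokens)
    if total_chars == 0:
        return []

    result = []
    cursor = 0.0
    for tok in tokens:
        span = duration * len(tok) / total_chars
        result.append({
            "word": tok,
            "start_time": round(cursor, 3),
            "end_time": round(cursor + span, 3),
        })
        cursor += span
    return result


# Example: a 2.1-second clip of the sentence from this issue.
estimates = estimate_word_timestamps("Hello readers,welcome!", 2.1)
```

The output uses the same keys as the format requested above, so it can stand in as a placeholder until real alignment is available.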

github-actions[bot] commented 2 weeks ago

This issue is stale because it has been open for 30 days with no activity.