open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

https://openhlt.github.io/amphion/

MIT License

7.71k stars 586 forks

[Help]: Sample code for TTA #336

Open cpken opened 2 weeks ago

cpken commented 2 weeks ago

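Sample code for TTA

I saw many TTA examples at https://audit-demo.github.io/, but this project only provides a text-to-audio example and none for the other tasks, such as:

Adding: add another sound event to the input audio.
Dropping: remove one or more sound events from the input audio.
Replacement: replace one sound event in the input audio with another sound event.
Inpainting: complete a masked segment of the audio based on the context or a provided text description.
Super-resolution: the audio super-resolution task can be viewed as completing the high-frequency information of a low-sampled input audio (converting low-sampled input audio into high-sampled output audio).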
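To make the super-resolution description concrete, here is a minimal sketch of which frequency band such a model would have to fill in. It relies only on the Nyquist limit (a signal sampled at rate R can represent frequencies up to R/2). The function names and the sample rates are hypothetical, chosen for illustration; this is not Amphion or AUDIT code.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency a signal sampled at sample_rate_hz can represent."""
    return sample_rate_hz / 2


def missing_band_hz(input_rate_hz, output_rate_hz):
    """Frequency band (low, high) that super-resolution has to fill in."""
    if output_rate_hz <= input_rate_hz:
        raise ValueError("output rate must be higher than input rate")
    return nyquist_hz(input_rate_hz), nyquist_hz(output_rate_hz)


# Hypothetical rates, chosen only for illustration.
low, high = missing_band_hz(8000, 32000)
print(f"content to generate: {low:.0f} Hz to {high:.0f} Hz")
```

With these example rates, everything between 4000 Hz and 16000 Hz is absent from the input and has to be generated.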

I hope more examples can be added. Thanks.

feifei788 commented 2 weeks ago

Please explain the format of the valid.json file in the AudioCaps folder. Thanks!

cpken commented 2 weeks ago

I could not find a valid.json under an AudioCaps folder in the project.

feifei788 commented 2 weeks ago

The source code does not document the format of valid.json or give an example of it. Running it raises: FileNotFoundError: [Errno 2] No such file or directory: 'data\AudioCaps\valid.json'. Could you post an example of a valid.json file?
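A small pre-flight check can confirm which file is missing before launching training. The sketch below only tests for the path named in the error message; the helper name `missing_manifests` and its defaults are hypothetical, and it makes no assumption about what valid.json should contain, since the thread does not say.

```python
from pathlib import Path, PureWindowsPath


def missing_manifests(data_root, dataset="AudioCaps", names=("valid.json",)):
    """Return the expected manifest paths that do not exist on disk."""
    dataset_dir = Path(data_root) / dataset
    return [dataset_dir / name for name in names
            if not (dataset_dir / name).is_file()]


# The path in the error message uses Windows separators; PureWindowsPath
# splits it into its components on any operating system.
parts = PureWindowsPath(r"data\AudioCaps\valid.json").parts
print(parts)

# "data" is the root directory taken from the error message.
for path in missing_manifests("data"):
    print(f"missing: {path}")
```

If the file is reported missing, the dataset has most likely not been prepared yet under that data directory.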

cpken commented 2 weeks ago

Sorry, buddy, I may not be able to help you. I'm not sure which example you are testing.

feifei788 commented 2 weeks ago

$ sh egs/tta/autoencoderkl/run_train.sh
I am testing the run_train.sh file under tta/autoencoderkl.

cpken commented 2 weeks ago

I haven't tried running training yet. The training data seems to be available from here: https://github.com/open-mmlab/Amphion/blob/main/egs/datasets/README.md

yuantuo666 commented 2 weeks ago

Hi, we prefer English issues so that more people worldwide can participate, and it is also easier to search for issues.

@HeCheng0625 Could you help with this?