open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
https://openhlt.github.io/amphion/
MIT License
4.5k stars 386 forks

Added data padding to 'forward', 'inference', and 'get_prosody_featur… #189

Open norabai opened 5 months ago

norabai commented 5 months ago

Fix Issue #188

✨ Description

This PR adds data padding to the forward, inference, and get_prosody_feature methods of the model class. The _pad_data function pads the input audio tensor so that its last dimension is a multiple of the hop length. This is common in audio processing, where frame-based computations assume the signal divides evenly into hop-sized steps.
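
The padding arithmetic described above can be sketched as follows. This is a minimal illustration and not the PR's actual _pad_data code, which is not shown here. The names pad_amount and pad_to_hop_multiple are hypothetical, and right-padding with zeros is an assumption. It uses a plain Python list so it runs without dependencies.

```python
from typing import List, Sequence


def pad_amount(num_samples: int, hop_length: int) -> int:
    """Return how many samples to append so that num_samples
    becomes a multiple of hop_length (0 if it already is)."""
    if hop_length <= 0:
        raise ValueError("hop_length must be positive")
    # Python's modulo of a negative number is non-negative,
    # so this gives the distance up to the next multiple.
    return (-num_samples) % hop_length


def pad_to_hop_multiple(
    samples: Sequence[float], hop_length: int, value: float = 0.0
) -> List[float]:
    """Right-pad a 1-D sequence of samples with `value` (assumed: zeros)
    so that its length is a multiple of hop_length."""
    return list(samples) + [value] * pad_amount(len(samples), hop_length)


# With a PyTorch tensor, the same amount could be passed to
# torch.nn.functional.pad(audio, (0, pad_amount(audio.shape[-1], hop_length)))
# to pad only the end of the last dimension.
audio = [0.1] * 1000
padded = pad_to_hop_multiple(audio, hop_length=256)
print(len(padded), len(padded) % 256)
```

Padding only at the end keeps the original samples at their original indices, so features for existing frames are not shifted.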

To test this PR, run any process that calls forward, inference, or get_prosody_feature, then check whether the processed audio now aligns with the hop length and whether any new issues appear.

🚧 Related Issues

👨‍💻 Changes Proposed

🧑‍🤝‍🧑 Who Can Review?

🛠 TODO

✅ Checklist