IAHispano / Applio

A simple, high-quality voice conversion tool focused on ease of use and performance.
https://applio.org
MIT License

[Enhancement] Audio tab on Tensorboard #248

Closed ItsMe-TJ closed 6 months ago

ItsMe-TJ commented 8 months ago

In other SVC tools such as Diff-SVC, you can preview what the latest model checkpoint sounds like while it is training, using the audio tab in TensorBoard. Would it be possible to implement this in Applio? It would be incredibly useful.

Vidalnt commented 8 months ago

Could you share the repository, along with a picture of the feature you mention?

ItsMe-TJ commented 8 months ago

DiffSinger uses the audio tab as well, though slightly differently. Here's that repo: https://github.com/openvpi/DiffSinger

Here's a video as well that shows the audio tab, what it looks like, and its purpose: https://youtu.be/Sxt11TAflV0?si=WBwbZUIWpK8M0iEQ&t=1854

Again, I know DiffSinger/Diff-SVC is different from RVC, so it might need to be adapted, but the video shows how it works. You get the latest checkpoint (and all previous checkpoints, to compare if you wish) alongside the ground truth, which is just the raw audio file from the dataset. It gives a very good overview of how your model is progressing in its training.

Vidalnt commented 8 months ago

Do you think a PC could handle running inference alongside training every minute?

ItsMe-TJ commented 8 months ago

No, not every minute. I believe DiffSinger updates the audio tab every 1000 steps or so, though this can be customized, IIRC.

litsa-the-dancer commented 7 months ago

Well, realistically speaking, the demo that Diff-SVC provided wasn't the best. Regardless, you can always stop training, or infer some audio while training. Having RVC infer a demo for each checkpoint would reduce performance a little, I would assume. In short, it isn't really necessary, considering that Applio has other features to fix. It definitely isn't a bad idea, though.

ItsMe-TJ commented 7 months ago

I respectfully disagree, and while I understand it's not necessary, neither is the overtraining detection, etc. It's a quality-of-life thing. I totally get that they have other things to focus on, and I'm not expecting this to be implemented anytime soon.

However, as I clarified in my earlier response, it doesn't infer every checkpoint; it's every 1000 steps or so. If it were to infer every checkpoint, yeah, it would definitely impact performance. But inferring a single file from the dataset, usually 5-10 seconds long, every 1000 steps or so has a pretty minimal impact on performance. Sure, you can always pause training to check progress, but having it done automatically and being able to check it in TensorBoard on your phone, etc., just makes things so much more convenient.
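
For illustration, the idea described above (convert one short dataset clip every N steps and push it to the audio tab) could be sketched roughly as follows. This is not Applio's or DiffSinger's code. The helper names `log_ground_truth` and `maybe_log_preview`, the `infer_fn` callable, and the tag names are all hypothetical. The only real API assumed is the `add_audio(tag, snd_tensor, global_step, sample_rate)` method of PyTorch's `torch.utils.tensorboard.SummaryWriter`, and the writer is passed in so the sketch itself needs no extra dependencies.

```python
from typing import Any, Callable

# Steps between previews; "every 1000 steps or so" as described in the thread.
PREVIEW_INTERVAL = 1000


def log_ground_truth(writer: Any, reference_audio: Any, sample_rate: int) -> None:
    """Log the raw dataset clip once, as the comparison target."""
    writer.add_audio(
        "preview/ground_truth", reference_audio, global_step=0, sample_rate=sample_rate
    )


def maybe_log_preview(
    writer: Any,
    step: int,
    infer_fn: Callable[[Any], Any],  # hypothetical: runs the current model on one clip
    reference_audio: Any,
    sample_rate: int,
    interval: int = PREVIEW_INTERVAL,
) -> bool:
    """Convert the reference clip and log it, but only on every `interval`-th step.

    Returns True if a preview was written on this step, False otherwise.
    """
    if step <= 0 or step % interval != 0:
        return False  # skip inference on all other steps to keep the overhead small
    converted = infer_fn(reference_audio)
    writer.add_audio(
        "preview/converted", converted, global_step=step, sample_rate=sample_rate
    )
    return True
```

In a real training loop, `writer` would be a `SummaryWriter` instance and `infer_fn` would presumably switch the model to evaluation mode and disable gradient tracking before converting the clip. Because each preview is logged under the same tag with a different `global_step`, TensorBoard keeps earlier previews available for comparison against the ground truth.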

Castro15 commented 7 months ago

YES! I want this, please. It's the one thing I loved about Diff-SVC! Being able to check the progress on the audio tab is SO USEFUL.

aitronz commented 6 months ago

We're currently encountering numerous conflicts with the audio tab in the version of TensorBoard we're using. As of now, there are no plans to address this issue. Thank you for your understanding.