Open ShiromiyaG opened 8 months ago
Are you sure that the input_file had 44100 sampling rate? The current code doesn’t resample automatically.
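A quick way to double-check this is sketched below. It is only an illustration: it uses Python's standard `wave` module, so it works for PCM WAV files only (MP3 or FLAC from Deezer or YT would need decoding first), and `check_sample_rate` is a hypothetical helper, not part of this project. The 44100 default comes from the comment above.

```python
import wave

EXPECTED_SR = 44100  # rate the code is said to assume (no automatic resampling)


def check_sample_rate(path, expected=EXPECTED_SR):
    """Return (actual_rate, stretch) for a PCM WAV file.

    stretch > 1 means the audio would sound slower and last longer if its
    samples were interpreted at `expected` Hz without resampling.
    """
    with wave.open(path, "rb") as wav:
        actual = wav.getframerate()
    return actual, actual / expected
```

For example, a 48000 Hz file read as if it were 44100 Hz would come out roughly 8.8% longer, which would sound like slowed-down singing.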
@MohannadEhabBarakat Yes, I'm sure. I don't think I've ever used Hi-Res audio for separation; all the audio I use comes from Deezer.
@MohannadEhabBarakat Also, I used the same audio on both Windows and Colab and got different results, which I found strange. Maybe it has something to do with package versions. Here is the requirements file that I used in Colab: requirements.txt. I'm going to test it today with VR models using the same package versions and report the results.
I just tested two models, a VR (karokee_4band_v2_sn) and an MDX (Reverb HQ), and both gave normal results. I remembered that in my last tests I used videos from YT, not audio from Deezer, but I don't think that is the problem, since the normal VR and MDX results also came from a YT video.
I also tested HQ4, and it has the same problem on both Windows and Linux. It looks like the semitone_shift is wrong. Also, this message appears:
C:\Users\Guilherme\anaconda3\lib\site-packages\uvr\models_dir\mdx\mdx_interface.py:270: RuntimeWarning: invalid value encountered in divide
tar_waves = result / divider
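For context, NumPy raises "invalid value encountered in divide" when an element-wise division produces NaN, typically 0/0. The sketch below shows one common way to guard such a division. It is not the project's fix and does not explain why `divider` contains zeros. `safe_divide` is a hypothetical helper, and it assumes `result` and `divider` have the same shape.

```python
import numpy as np


def safe_divide(result, divider):
    """Divide element-wise, writing 0 wherever divider is 0.

    Assumes both inputs have the same shape. Positions with a zero divider
    are skipped by the division and keep their initial value of 0.
    """
    result = np.asarray(result, dtype=np.float64)
    divider = np.asarray(divider, dtype=np.float64)
    out = np.zeros_like(result)
    np.divide(result, divider, out=out, where=divider != 0)
    return out
```

This only silences the symptom. If the zeros in `divider` come from windows that never overlapped any audio, the underlying chunking or padding would still need a look.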
@MohannadEhabBarakat Also, I used the same audio on both Windows and Colab and got different results, which I found strange. Maybe it has something to do with package versions. Here is the requirements file that I used in Colab: requirements.txt. I'm going to test it today with VR models using the same package versions and report the results.
I think that might be caused by package versions or resampling algorithms. I noticed that the UVR GUI uses different resampling depending on the OS. I'm not sure why they did it, but I followed them to replicate the same results. As for package versions, unfortunately even using the same versions might not solve the issue, as some libraries have different implementations on different OSs (even at the same version). The workaround that worked for me in the past was to wrap everything in a Docker container, which basically unifies the OS.
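To compare the two environments, something like the sketch below could be run on both Windows and Colab and the outputs diffed. It relies only on the standard library. The package list is a hypothetical placeholder (replace it with the entries from requirements.txt), and `environment_report` and `diff_reports` are illustrative names, not part of this project.

```python
import platform
import sys
from importlib import metadata

# Hypothetical list; replace with the packages named in requirements.txt.
PACKAGES = ["numpy", "librosa", "soundfile", "torch"]


def environment_report(packages=PACKAGES):
    """Collect the Python version, OS string, and installed package versions."""
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None  # not installed in this environment
    return report


def diff_reports(a, b):
    """Return {key: (value_in_a, value_in_b)} for every key that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}
```

Identical versions on both sides would not rule out OS-specific behavior, as noted above, but a mismatch would be a cheap first thing to eliminate.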
As I'm back now I'll be working on:
- Fixing the bugs you found
- Adding new docs
- Adding new weights (at least the ones you tested)
So if you can send me an email with your findings and the current bugs, it will help me a lot 🤗. Mohannad.Barakat@fau.de
I can try to help, but I don't know how much help I would be, since I don't use most models and end up using only specific ones. In fact, I tested a model that is not available in the UVR repository but that works both in UVR and in your code. If you want to take a look at it, I uploaded it to the link below: https://github.com/ShiromiyaG/RVC-AI-Cover-Maker/releases (it's the karokee model)
I was testing MDX23C-8KFFT-InstVoc_HQ on Google Colab and was surprised when I heard the result: the audio was slowed down, the singer was singing slowly, and the track was longer than the original. I tested the same song on Windows with the same settings, and the results were normal. Here is the code I used, both in Colab and on Windows:
Here is the link to the songs: https://drive.google.com/drive/folders/11aete_dd56XqR68P2cr_BMRlPhvHb7W0?usp=drive_link
And also an Audacity screenshot of the songs: