Closed by Schwarzi9 9 months ago
Please read the "Whisper languages" section on the start page (Readme.md). It describes how to use models for other languages from Hugging Face. Also take a look at this discussion, https://github.com/rakuri255/UltraSinger/discussions/63, and consider posting the best-working language model there to help others.
Hey, thanks for your answer. Since I have very little knowledge of Python, I had to go through the instructions a few more times. Maybe you could edit the Romanian example in the instructions. I copy-pasted it as written there, but you forgot to put "whisper" in front of "align_model", which is why it didn't work for me.
-i XYZ --align_model "gigant/romanian-wav2vec2" didn't work for me
-i XYZ --whisper_align_model did work for me.
Thanks anyway, it's working now!!!!
Thanks for the info! I fixed the readme.
Hello, I have a problem creating UltraStar songs in Slovenian. It works for English and German.
I am not good with Python. Even though I managed to install UltraSinger, I just don't know how to make it work for songs in Slovenian.
If it is impossible to do this for Slovenian songs, can I force it to use another language, even if the whole lyrics will then need editing?
This is written at the end of the log:
There is no default alignment model set for this language (sl). Please find a wav2vec2.0 model finetuned on this language in https://huggingface.co/models, then pass the model name in --align_model [MODEL_NAME]
No default align-model for language: sl
[UltraSinger] Error: Unknown language. Try add it with --align_model [huggingface].
How do I do this? Thank you!
This is the whole log:
(.venv) C:\Users>py Ultrasinger.py -i https://www.youtube.com/watch?v=6y9S9RipcUY --language sl
The torchaudio backend is switched to 'soundfile'. Note that 'sox_io' is not supported on Windows.
The torchaudio backend is switched to 'soundfile'. Note that 'sox_io' is not supported on Windows.
[UltraSinger]
[UltraSinger] UltraSinger Version: 0.0.9
[UltraSinger]
[UltraSinger] Checking GPU support for tensorflow and pytorch.
[UltraSinger] tensorflow - using cuda gpu.
[UltraSinger] pytorch - using cuda gpu.
[UltraSinger] full automatic mode
[youtube] Extracting URL: https://www.youtube.com/watch?v=6y9S9RipcUY
[youtube] 6y9S9RipcUY: Downloading webpage
[youtube] 6y9S9RipcUY: Downloading ios player API JSON
[youtube] 6y9S9RipcUY: Downloading android player API JSON
[youtube] 6y9S9RipcUY: Downloading m3u8 information
[UltraSinger] Searching song in musicbrainz
[UltraSinger] cant find title carpe diem in joker out official video esc 2023
[UltraSinger] No match found
[UltraSinger] Creating output folder. -> C:\Users\output\Joker Out - Carpe Diem (Official Video ESC 2023) (2)
[UltraSinger] Downloading Audio
[youtube] Extracting URL: https://www.youtube.com/watch?v=6y9S9RipcUY
[youtube] 6y9S9RipcUY: Downloading webpage
[youtube] 6y9S9RipcUY: Downloading ios player API JSON
[youtube] 6y9S9RipcUY: Downloading android player API JSON
[youtube] 6y9S9RipcUY: Downloading m3u8 information
[info] 6y9S9RipcUY: Downloading 1 format(s): 251
[download] Destination: C:\Users\output\Joker Out - Carpe Diem (Official Video ESC 2023) (2)\Joker Out - Carpe Diem (Official Video ESC 2023)
[download] 100% of 2.75MiB in 00:00:00 at 7.40MiB/s
[ExtractAudio] Destination: C:\Users\output\Joker Out - Carpe Diem (Official Video ESC 2023) (2)\Joker Out - Carpe Diem (Official Video ESC 2023).mp3
Deleting original file C:\Users\output\Joker Out - Carpe Diem (Official Video ESC 2023) (2)\Joker Out - Carpe Diem (Official Video ESC 2023) (pass -k to keep)
[UltraSinger] Downloading Video
[youtube] Extracting URL: https://www.youtube.com/watch?v=6y9S9RipcUY
[youtube] 6y9S9RipcUY: Downloading webpage
[youtube] 6y9S9RipcUY: Downloading ios player API JSON
[youtube] 6y9S9RipcUY: Downloading android player API JSON
[youtube] 6y9S9RipcUY: Downloading m3u8 information
[info] 6y9S9RipcUY: Downloading 1 format(s): 625+140
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 39
[download] Destination: C:\Users\output\Joker Out - Carpe Diem (Official Video ESC 2023) (2)\Joker Out - Carpe Diem (Official Video ESC 2023).f625.mp4
[download] 100% of 273.45MiB in 00:00:34 at 8.04MiB/s
[download] Destination: C:\Users\output\Joker Out - Carpe Diem (Official Video ESC 2023) (2)\Joker Out - Carpe Diem (Official Video ESC 2023).f140.m4a
[download] 100% of 2.73MiB in 00:00:00 at 7.75MiB/s
[Merger] Merging formats into "C:\Users\output\Joker Out - Carpe Diem (Official Video ESC 2023) (2)\Joker Out - Carpe Diem (Official Video ESC 2023).mp4"
Deleting original file C:\Users\output\Joker Out - Carpe Diem (Official Video ESC 2023) (2)\Joker Out - Carpe Diem (Official Video ESC 2023).f625.mp4 (pass -k to keep)
Deleting original file C:\Users\output\Joker Out - Carpe Diem (Official Video ESC 2023) (2)\Joker Out - Carpe Diem (Official Video ESC 2023).f140.m4a (pass -k to keep)
[UltraSinger] Downloading thumbnail
[youtube] Extracting URL: https://www.youtube.com/watch?v=6y9S9RipcUY
[youtube] 6y9S9RipcUY: Downloading webpage
[youtube] 6y9S9RipcUY: Downloading ios player API JSON
[youtube] 6y9S9RipcUY: Downloading android player API JSON
[youtube] 6y9S9RipcUY: Downloading m3u8 information
[UltraSinger] Creating output folder. -> C:\Users\output\Joker Out - Carpe Diem (Official Video ESC 2023) (2)\cache
[UltraSinger] Separating vocals from audio with demucs and cuda as worker.
Important: the default model was recently changed to htdemucs the latest Hybrid Transformer Demucs model. In some cases, this model can actually perform worse than previous models. To get back the old default model use -n mdx_extra_q.
Selected model is a bag of 1 models. You will see that many progress bars per track.
Separated tracks will be stored in C:\Users\separated\htdemucs
Separating track C:\Users\output\Joker Out - Carpe Diem (Official Video ESC 2023) (2)\Joker Out - Carpe Diem (Official Video ESC 2023).mp3
100%|██████████████████████████████████████████████████████████████████████| 181.35/181.35 [00:06<00:00, 27.86seconds/s]
[UltraSinger] Converting wav to mp3
[UltraSinger] Reduce noise from vocal audio with ffmpeg.
[UltraSinger] Converting audio for AI
[UltraSinger] Mute audio parts with no singing
[UltraSinger] Loading whisper with model large-v2 and cuda as worker
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.2.0. To apply the upgrade to your files permanently, run python -m pytorch_lightning.utilities.upgrade_checkpoint C:\Users\.cache\torch\whisperx-vad-segmentation.bin
Model was trained with pyannote.audio 0.0.1, yours is 3.1.1. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.10.0+cu102, yours is 2.0.1+cu117. Bad things might happen unless you revert torch to 1.x.
[UltraSinger] Transcribing C:\Users\output\Joker Out - Carpe Diem (Official Video ESC 2023) (2)\cache\Joker Out - Carpe Diem (Official Video ESC 2023)_mute.wav
There is no default alignment model set for this language (sl). Please find a wav2vec2.0 model finetuned on this language in https://huggingface.co/models, then pass the model name in --align_model [MODEL_NAME]
No default align-model for language: sl
[UltraSinger] Error: Unknown language. Try add it with --align_model [huggingface].
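Going by the Romanian case earlier in this thread, the error at the end of the log is probably resolved by passing a Hugging Face wav2vec2 model with --whisper_align_model (not plain --align_model, despite what the error text says). Below is a minimal sketch that only assembles such a command. The helper name build_ultrasinger_args and the Slovenian model name are placeholders invented for illustration; a real Slovenian model would still have to be picked from https://huggingface.co/models, and the script path may differ on your machine.

```python
import sys


def build_ultrasinger_args(source, language, align_model=None, script="UltraSinger.py"):
    """Assemble an argument list for running UltraSinger as a subprocess.

    Hypothetical helper for illustration; it is not part of UltraSinger.
    """
    args = [sys.executable, script, "-i", source, "--language", language]
    if align_model:
        # Flag name as reported working in this thread.
        args += ["--whisper_align_model", align_model]
    return args


# Placeholder only: replace with a real wav2vec2 model fine-tuned on Slovenian.
SLOVENIAN_MODEL = "<huggingface-user>/<slovenian-wav2vec2-model>"

args = build_ultrasinger_args(
    "https://www.youtube.com/watch?v=6y9S9RipcUY", "sl", SLOVENIAN_MODEL
)
print(" ".join(args))

# To actually launch it from the UltraSinger source folder:
#   import subprocess
#   subprocess.run(args, check=True)
```

Typing the same flags directly in the console works just as well; the list form only avoids quoting problems when the command is started from a script.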