AI-based tool that converts vocals (lyrics and pitch) from music to autogenerate Ultrastar Deluxe, MIDI and notes files. It automates tapping, adding text and pitching vocals, and creates karaoke files.
I've noticed that crepe only uses the CPU as its worker rather than CUDA, while whisper and htdemucs both use CUDA.
Also, the pitch result from crepe in this setup is pretty weak and needs a lot of tinkering in YASS to get the pitching right.
I've installed the CUDA driver linked in the README, and all requirements seem to have installed properly.
Am I missing something?
[UltraSinger] Loading whisper with model large-v2 and cuda as worker
[UltraSinger] using alignment model infinitejoy/wav2vec2-large-xls-r-300m-romanian
No language specified, language will be first be detected for each audio file (increases inference time).
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.0.3. To apply the upgrade to your files permanently, run python -m pytorch_lightning.utilities.upgrade_checkpoint --file C:\Users\Admin\.cache\torch\whisperx-vad-segmentation.bin
Model was trained with pyannote.audio 0.0.1, yours is 2.1.1. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.10.0+cu102, yours is 2.0.1+cu117. Bad things might happen unless you revert torch to 1.x.
[UltraSinger] Transcribing F:\UltraSinger\output\Ce Seara Minunata\cache\Ce Seara Minunata_denoised.wav
Detected language: ro (0.85) in first 30s of audio...
[UltraSinger] Removing silent start and ending, from transcription data
[UltraSinger] Hyphenate using language code: ro_RO
168it [00:00, 83926.04it/s]
[UltraSinger] Pitching with crepe and model full and cpu as worker
651/651 [==============================] - 130s 199ms/step
[UltraSinger] Creating midi notes from pitched data
[UltraSinger] Creating Ultrastar notes from midi data
[UltraSinger] BPM is 170.45
[UltraSinger] Creating F:\UltraSinger\output\Ce Seara Minunata\Ce Seara Minunata.txt from transcription.
[UltraSinger] Converting wav to mp3
[UltraSinger] Creating F:\UltraSinger\output\Ce Seara Minunata\Ce Seara Minunata [Karaoke].txt from transcription.
[UltraSinger] Parse ultrastar txt -> F:\UltraSinger\output\Ce Seara Minunata\Ce Seara Minunata.txt
[UltraSinger] Calculating Ultrastar Points
[UltraSinger] Simple (octave high ignored) points
[UltraSinger] Total: 5298, notes: 5194, line bonus: 104, golden notes: 0
[UltraSinger] Accurate (octave high matches) points:
[UltraSinger] Total: 5282, notes: 5178, line bonus: 104, golden notes: 0
[UltraSinger] Creating Midi with pretty_midi
[UltraSinger] Creating midi instrument from Ultrastar txt
[UltraSinger] Creating midi file -> F:\UltraSinger\output\Ce Seara Minunata\Ce Seara Minunata.mid
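
One way to narrow this down is to ask each framework directly whether it can see the GPU. The sketch below is only a diagnostic and makes one assumption that the log does not confirm outright: that crepe runs on TensorFlow (the Keras-style `651/651 [=====]` progress bar suggests so), while whisper and htdemucs run on PyTorch. The function name `gpu_report` is made up for illustration and is not part of UltraSinger.

```python
import importlib.util


def gpu_report():
    """Return whether each framework can see a CUDA GPU.

    Values are True/False, or None if the framework is not importable.
    """
    report = {"torch": None, "tensorflow": None}

    # PyTorch side (assumed to back whisper and htdemucs).
    if importlib.util.find_spec("torch") is not None:
        try:
            import torch
            report["torch"] = bool(torch.cuda.is_available())
        except ImportError:
            pass  # installed but broken; leave as None

    # TensorFlow side (assumed to back crepe).
    if importlib.util.find_spec("tensorflow") is not None:
        try:
            import tensorflow as tf
            gpus = tf.config.list_physical_devices("GPU")
            report["tensorflow"] = len(gpus) > 0
        except ImportError:
            pass  # installed but broken; leave as None

    return report


if __name__ == "__main__":
    for name, visible in gpu_report().items():
        if visible is None:
            status = "not installed"
        else:
            status = "GPU visible" if visible else "CPU only"
        print(f"{name}: {status}")
```

If this reports a GPU for torch but not for tensorflow, that would match the log above, where only the crepe step falls back to the CPU. One possible cause on native Windows is the TensorFlow version: 2.10 was the last release with GPU support there, so a newer TensorFlow would only see the CPU even with a working CUDA driver.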