rakuri255 / UltraSinger

AI-based tool that extracts vocals, lyrics, and pitch from music to autogenerate UltraStar Deluxe, MIDI, and notes files. It automates tapping, adding text, and pitching vocals, and creates karaoke files.
MIT License

Ghosting in text #46

Open rakuri255 opened 1 year ago

rakuri255 commented 1 year ago
Not complete, but it is much better now.

If no text comes through, it has a funny feature in the output:

2420
: 2652 13 6 I
: 2668 19 24 don't
2687
: 2753 14 5 know
2767
: 2923 4 11 what
: 2928 2 12 to
: 2932 3 11 do
2935

Originally posted by @McMuffin88 in https://github.com/rakuri255/UltraSinger/issues/19#issuecomment-1567526717

achimmihca commented 1 year ago

I think this is about missing words from the speech recognition or pitch detection?

If yes, then I suggest using configurable fallback values, e.g. an underscore for the text, a pitch of 0, and a note length of 1.

Whatever fallback values are used, the output should be a valid note line of the form : startbeat length pitch word
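
The fallback idea could look roughly like the sketch below. The function name `format_note` and the `FALLBACK_*` constants are hypothetical and not taken from the UltraSinger code base; the only assumption about the format is that a normal UltraStar note line reads `: startbeat length pitch text`.

```python
from typing import Optional

# Hypothetical defaults matching the suggestion above; a real tool
# would read these from its configuration.
FALLBACK_TEXT = "_"
FALLBACK_PITCH = 0
FALLBACK_LENGTH = 1


def format_note(
    start_beat: int,
    length: Optional[int] = None,
    pitch: Optional[int] = None,
    text: Optional[str] = None,
) -> str:
    """Build one note line, substituting fallbacks for missing parts."""
    if length is None or length < 1:
        length = FALLBACK_LENGTH
    if pitch is None:
        pitch = FALLBACK_PITCH
    if text is None or not text.strip():
        text = FALLBACK_TEXT
    # Every field is always written, so the line stays parseable.
    return f": {start_beat} {length} {pitch} {text}"


print(format_note(2652, 13, 6, "I"))  # complete note
print(format_note(2420))              # only the start beat is known
```

With these defaults, a note that has only a start beat is still written with all four fields, so a parser never sees a bare number on its own line.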

rakuri255 commented 1 year ago

@achimmihca Yes, that would be a good idea.

I see 2 things here:

  1. When the audio has some noise, Whisper hallucinates and adds random words.
  2. Whisper sometimes adds YouTube subtitles. It is only noticeable in places where there is additional information in the subtitle and no voice.
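
One possible mitigation for both cases is to drop segments that the model itself marks as unlikely to be speech. The sketch below is not UltraSinger's implementation. It assumes segment dictionaries shaped like the `segments` list returned by openai-whisper's `transcribe()`, which carry `no_speech_prob` and `avg_logprob`; other Whisper wrappers may expose different fields. The thresholds and the function name `drop_suspect_segments` are hypothetical.

```python
from typing import Any, Dict, List

# Hypothetical thresholds; they would need tuning on real recordings.
MAX_NO_SPEECH_PROB = 0.6
MIN_AVG_LOGPROB = -1.0


def drop_suspect_segments(segments: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only segments that look like real, confidently decoded speech."""
    kept = []
    for seg in segments:
        # Missing keys are treated as "not suspicious".
        likely_silence = seg.get("no_speech_prob", 0.0) > MAX_NO_SPEECH_PROB
        low_confidence = seg.get("avg_logprob", 0.0) < MIN_AVG_LOGPROB
        if likely_silence or low_confidence:
            continue
        kept.append(seg)
    return kept
```

This filter is deliberately stricter than a combined check, because either condition alone removes a segment. That may also discard quiet but genuine vocals, so the removed segments are worth logging while tuning.
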
achimmihca commented 1 year ago

> Whisper sometimes adds YouTube subtitles

That is unexpected, but it may actually produce better results than speech recognition alone.