This submission is too late to be considered valid, but I spent a long time on it, so I thought I would share my progress. I forgot to mention that you can also enter a link to any Spotify song, and everything still works like in the demo video.
How It Works
The user either enters a link to a Spotify song or uploads an audio file of the song. The server takes the song and extracts metadata such as the artist, album and song name. The server then sends the song to an MDX-Net model hosted on Hugging Face Gradio Spaces. The model separates the vocal track from the background track and sends the result back to the server. The server then obtains the lyrics by first querying https://lrclib.net/ with the song's metadata. If https://lrclib.net/ doesn't have the song, the server uses the Azure speech transcription service to transcribe the vocal track. The server then sends the lyrics, along with links to the vocal and background tracks, back to the client, which starts playing the background track while displaying the lyrics for the user to sing along. The user can also toggle the vocal track while the background track is playing, if they want to "peek" at the audio at that point.
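The lyrics step above (try lrclib first, fall back to transcription) can be sketched roughly as follows. This is only an illustration, not the code from the repo: `get_lyrics`, `lookup` and `transcribe` are hypothetical names, and the two callables stand in for the real lrclib query and the Azure transcription call.

```python
from typing import Callable, Optional, Tuple


def get_lyrics(
    meta: dict,
    vocal_track: bytes,
    lookup: Callable[[dict], Optional[str]],
    transcribe: Callable[[bytes], str],
) -> Tuple[str, str]:
    """Return (lyrics, source), preferring the lyrics database.

    `lookup` is a hypothetical wrapper around the lrclib query and returns
    None (or an empty string) when the song is not found.
    `transcribe` is a hypothetical wrapper around speech transcription and
    is only called when the lookup comes back empty.
    """
    lyrics = lookup(meta)
    if lyrics:
        return lyrics, "lrclib"
    # Fall back to transcribing the separated vocal track.
    return transcribe(vocal_track), "transcription"
```

Passing the two services in as callables keeps the fallback logic easy to try out with stubs, without network access or API keys.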
While the server is processing the song, the client uses server-sent events to get updates on the current processing log. To identify which song to request updates for, the client has to use the song's SHA-1 hash. This decision came back to bite me: I discovered that if I wanted to put this on a website, it had to be an HTTPS site, since the Web Crypto API only works in secure contexts. Since my server is just a bare-bones IP address, I couldn't use the Web Crypto API when hosting the webapp.
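As a rough illustration of the two pieces mentioned above, here is a minimal sketch of a SHA-1 song identifier and of a single server-sent events frame. It assumes the hash is taken over the raw audio bytes and that log lines are sent as JSON under an event named `log`; both are guesses for the example, and `song_id` and `sse_frame` are hypothetical names, not functions from the repo.

```python
import hashlib
import json


def song_id(audio_bytes: bytes) -> str:
    """Return the SHA-1 hex digest of the audio (assumed: hash of raw bytes)."""
    return hashlib.sha1(audio_bytes).hexdigest()


def sse_frame(message: str, event: str = "log") -> str:
    """Format one server-sent events frame carrying a processing-log line.

    An SSE frame is a set of "field: value" lines ended by a blank line.
    The "log" event name and the JSON payload shape are assumptions.
    """
    payload = json.dumps({"message": message})
    return f"event: {event}\ndata: {payload}\n\n"
```

The client would have to compute the same digest from the same bytes to ask for the matching update stream, which is where the secure-context restriction gets in the way in the browser.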
To make this easy to try, vjeux, I have shared my .env file with your Gmail account. Just follow the steps in the repo's README file and run it locally on your machine.
link to env file shared with vjeux
Known Bugs
If you keep trying out different songs in the same browser session, mind-bending bugs start occurring. I advise opening a new tab after trying about 2 songs.
Some audio may fail to process now, but succeed later and vice versa.
The vocal track and lyrics feel like they lag a few milliseconds behind the background track during playback.
repository video