bhj / KaraokeEternal

Open karaoke party system
https://www.karaoke-eternal.com
ISC License

Big suggestion: record and stream live audio to server, and do audio mixing and de-reverberation on the server #20

Closed xuancong84 closed 3 years ago

xuancong84 commented 4 years ago

HTML5 now supports streaming live audio and video. You can use the technique in https://www.html5rocks.com/en/tutorials/getusermedia/intro/ to let each user turn on their phone's microphone and stream live audio to the server. On the server side, you can then apply some signal processing to mix the MTV audio with each user's singing voice, with de-reverberation, and stream the mixed audio to the player for playback. Then we truly have a home karaoke system -:)
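
To make the server-side mixing step concrete, here is a minimal sketch of summing one frame of backing-track audio with one frame of a singer's voice. It assumes both frames are mono, already at the same sample rate, and represented as floats in [-1.0, 1.0]. The function name `mix_frames` and the gain defaults are hypothetical, and de-reverberation is not attempted here.

```python
def mix_frames(backing, voice, backing_gain=0.7, voice_gain=1.0):
    """Mix two mono frames of float samples in [-1.0, 1.0].

    Assumption: both frames share one sample rate and are time-aligned.
    The shorter frame is padded with silence.
    """
    length = max(len(backing), len(voice))
    mixed = []
    for i in range(length):
        b = backing[i] if i < len(backing) else 0.0
        v = voice[i] if i < len(voice) else 0.0
        sample = backing_gain * b + voice_gain * v
        # Hard-clip so the sum stays inside the valid sample range.
        mixed.append(max(-1.0, min(1.0, sample)))
    return mixed


if __name__ == "__main__":
    # Two tiny made-up frames, purely for illustration.
    print(mix_frames([0.5, 0.5], [0.5], backing_gain=1.0, voice_gain=1.0))
```

Hard clipping is the simplest way to keep the sum in range. A real mixer would more likely use a limiter, and would first have to align the voice stream with the backing track in time.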

xuancong84 commented 4 years ago

FYI, if you do that, I'd be glad to help with the signal processing part ^_^

bhj commented 4 years ago

It's definitely an interesting idea! For now, the scope of the app is purposefully limited to in-person parties and making that experience as great as possible. We do already have users successfully using it over the web along with streaming apps like Zoom, though. This may ultimately be the better solution as different users will have different streaming goals (do they want to be able to see each other via webcams, etc.)

One big issue that comes to mind is latency. Even in-person, the latency with most USB audio interfaces is on the edge of what I consider acceptable (Thunderbolt is much better). This is something streaming apps have already heavily optimized and I'm not sure could be matched with in-browser APIs, but I agree we are closer than ever to this being possible!

Appreciate the interest, and I'll leave this open for now to collect further ideas.

xuancong84 commented 4 years ago

Yup, you are right that latency is a big issue. With a good router and local area network, the latency of data transfer can be reduced significantly. However, compared to an analog system, which sends the audio signal immediately, a digital system has to record the audio into frames first (for example, one frame every 0.5s) and then transmit each 0.5s buffer. That causes at least 0.5s of latency. We would need to increase the frame rate, running a continuous loop so that recorded audio is sent as soon as possible, but that will heat up the phone a lot. So no good solution so far, haha!
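
The trade-off between frame size and latency can be put into numbers with a small sketch. The 48 kHz sample rate and the 128-sample frame are assumptions chosen for illustration, and the helper names are hypothetical. The figures cover only the time spent filling one frame, not network or playback delay.

```python
def frame_latency_ms(frame_samples, sample_rate_hz):
    """Time needed to fill one frame before it can be sent, in milliseconds."""
    return 1000.0 * frame_samples / sample_rate_hz


def frames_per_second(frame_samples, sample_rate_hz):
    """How many frames must be captured and sent each second."""
    return sample_rate_hz / frame_samples


if __name__ == "__main__":
    sample_rate_hz = 48000  # assumed capture rate
    for frame_samples in (24000, 128):  # 0.5 s frame vs. a small frame
        latency = frame_latency_ms(frame_samples, sample_rate_hz)
        rate = frames_per_second(frame_samples, sample_rate_hz)
        print(f"{frame_samples} samples: {latency:.2f} ms, {rate:.0f} frames/s")
```

Under these assumptions a 0.5 s frame needs 2 sends per second, while a 128-sample frame cuts the buffering delay to under 3 ms but needs 375 sends per second. That is the "continuous loop" cost described above.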

Thanks, and glad to hear that you are interested in this direction as well -:)

bhj commented 4 years ago

I'm not sure a browser will ever provide that level of granularity since audio buffer size is very dependent on the particular device and driver, but IMO half the fun of karaoke is hearing oneself amplified through speakers, with effects, and that needs to be done in <3ms for it not to feel weird, in my experience. Totally doable in-person with wires, but otherwise all bets are off. I'm all for making remote "karaoke" as good as can be, but I hesitate to call it karaoke at that point. Still, PRs accepted! ;)

bhj commented 3 years ago

Closing this for now, but I definitely think it's worth revisiting if/when we start handling audio input. Thanks!