Example using received audio data?

asg0451 commented 3 years ago

I'm hacking on this example (https://github.com/serenity-rs/songbird/blob/current/examples/serenity/voice_receive/src/main.rs), and I can receive and buffer voice audio, but I can't figure out what format it's in or how to use it.

I've got some of these https://serenity-rs.github.io/songbird/current/songbird/events/context_data/struct.VoiceData.html#structfield.audio but what is this? How can I turn this into, say, a .wav file?

I've been banging my head on this one for a couple hours so any help would be greatly appreciated!

FelixMcFelix commented 3 years ago

This should be documented somewhere, you're right!

Each event contains up to 20ms of mono 16-bit PCM audio from a single user at 48 kHz -- each i16 is a sample.

If you want to make a wave file per user:

- You need to emit a suitable WAV header.
  - This is documented here, or you can look into a Rust library for this, e.g. wav.
- The i16 data should just go into the data subchunk without any issues, just be careful around endianness. However, you need to pad silent regions with zeroes yourself, because clients don't send packets when they aren't talking. Mostly. 🙂
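
For illustration, here is a minimal sketch of the per-user steps above using only the standard library. `wav_bytes` and `pad_silence` are hypothetical helpers, not part of songbird, and the 960-sample frame size assumes the mono, 48 kHz, 20ms packets described above (a stereo stream would carry twice as many interleaved values). The header is written by hand as a little-endian "RIFF" file; a WAV crate would normally do this for you.

```rust
/// Samples in one 20ms frame at 48 kHz, assuming mono audio.
const SAMPLES_PER_FRAME: usize = 960;

/// Append `missed_frames` frames of silence (zeroes) to a user's buffer,
/// covering a gap in which no packets arrived.
fn pad_silence(buf: &mut Vec<i16>, missed_frames: usize) {
    buf.extend(std::iter::repeat(0i16).take(missed_frames * SAMPLES_PER_FRAME));
}

/// Serialise 16-bit PCM samples into an in-memory little-endian WAV file:
/// a 44-byte header followed by the raw sample data.
fn wav_bytes(samples: &[i16], sample_rate: u32, channels: u16) -> Vec<u8> {
    let bits_per_sample: u16 = 16;
    let block_align = channels * (bits_per_sample / 8); // bytes per sample frame
    let byte_rate = sample_rate * u32::from(block_align);
    let data_len = (samples.len() * 2) as u32;

    let mut out = Vec::with_capacity(44 + samples.len() * 2);
    out.extend_from_slice(b"RIFF");
    out.extend_from_slice(&(36 + data_len).to_le_bytes()); // size of the rest of the file
    out.extend_from_slice(b"WAVE");
    out.extend_from_slice(b"fmt ");
    out.extend_from_slice(&16u32.to_le_bytes()); // size of the fmt subchunk
    out.extend_from_slice(&1u16.to_le_bytes()); // format tag 1 = integer PCM
    out.extend_from_slice(&channels.to_le_bytes());
    out.extend_from_slice(&sample_rate.to_le_bytes());
    out.extend_from_slice(&byte_rate.to_le_bytes());
    out.extend_from_slice(&block_align.to_le_bytes());
    out.extend_from_slice(&bits_per_sample.to_le_bytes());
    out.extend_from_slice(b"data");
    out.extend_from_slice(&data_len.to_le_bytes());
    for s in samples {
        // Always write little-endian, whatever the host byte order is.
        out.extend_from_slice(&s.to_le_bytes());
    }
    out
}
```

The resulting bytes could then be saved with something like `std::fs::write("user.wav", wav_bytes(&pcm, 48_000, 1))`, where `pcm` is the buffer you accumulated for that user.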

If you want to combine them all:

As above, except you need to add together all packets which arrive in the same 20ms window.
- Mixing two audio packets is simply summing the corresponding samples of each input.
- Take care not to overflow/clip the additions -- you might need to do volume reduction.
Some other users have tackled this for e.g., bridging Discord<->TS3.
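
A rough sketch of that mixing step, assuming each frame is a slice of i16 samples covering the same 20ms window. `mix_frames` is a hypothetical helper, not a songbird API. It sums in i32 so the additions cannot overflow, then clamps back into the i16 range; clamping is just one way to handle loud peaks, and scaling the volume down before converting is another.

```rust
/// Mix several frames from the same 20ms window into one frame.
/// Frames shorter than the longest one are treated as trailing silence.
fn mix_frames(frames: &[&[i16]]) -> Vec<i16> {
    let len = frames.iter().map(|f| f.len()).max().unwrap_or(0);

    // Accumulate in i32 so that summing many i16 samples cannot overflow.
    let mut acc = vec![0i32; len];
    for frame in frames {
        for (a, &s) in acc.iter_mut().zip(frame.iter()) {
            *a += i32::from(s);
        }
    }

    // Clamp each sum into the representable i16 range (hard clipping).
    acc.into_iter()
        .map(|v| v.clamp(i32::from(i16::MIN), i32::from(i16::MAX)) as i16)
        .collect()
}
```

For example, mixing `[100, -200, 30_000]`, `[50, -100]` and `[0, 0, 10_000]` gives `[150, -300, 32_767]`: the last sum (40,000) is clipped to `i16::MAX`.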

asg0451 commented 3 years ago

wow, thanks! this is incredibly helpful. re your second bullet, can you elaborate on being careful around endianness? This PCM data is big-endian, right? i think i saw that implied somewhere in the discord docs, such as they are.

FelixMcFelix commented 3 years ago

The PCM data you receive is in the endianness of your machine; you just need to be careful when saving it to disk (i.e., "RIFF" vs "RIFX" determines this in the WAV header, and I think the wav library just handles this for you).
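
If you do serialise the samples by hand, a small sketch of making the byte order explicit might look like this. Both function names are hypothetical; they rely only on the standard `i16::to_le_bytes` and `i16::to_be_bytes` methods. A "RIFF" file expects little-endian sample data, while "RIFX" is the big-endian variant.

```rust
/// Convert in-memory (native-endian) samples to little-endian bytes,
/// as expected in the data subchunk of a "RIFF" WAV file.
fn samples_to_le_bytes(samples: &[i16]) -> Vec<u8> {
    samples.iter().flat_map(|s| s.to_le_bytes()).collect()
}

/// Convert in-memory samples to big-endian bytes, as a "RIFX" file would store them.
fn samples_to_be_bytes(samples: &[i16]) -> Vec<u8> {
    samples.iter().flat_map(|s| s.to_be_bytes()).collect()
}
```

Because the conversion is explicit, the output is the same on any host, regardless of the machine's own byte order.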


asg0451 commented 3 years ago

Ah gotcha. Thanks again, you have saved me hours of struggling with opus and discord documentation. If you want, I could do a quick PR to add basically what you said to the bit of documentation I linked initially.

FelixMcFelix commented 3 years ago

Thanks; that would be very helpful if you could do so.

serenity-rs / songbird

Example using received audio data? #100