WaveBeans / wavebeans

Audio Processing. On scale.
https://wavebeans.io
Apache License 2.0

End to end examples #128

Closed gad2103 closed 2 years ago

gad2103 commented 2 years ago

Thanks for working on this library. I've been reading the source and issues, and would love it if there were some examples of simple operations.

To illustrate: I would like to take two byte arrays from WAV files and create a new byte array that is essentially the two "tracks" mixed together. More generally, how would I take byte arrays representing tracks and produce a single byte array that is a multi-track output?

If you can direct me to an answer I'm happy to make a PR with an example in the docs. Thanks again

asubb commented 2 years ago

@gad2103 what you essentially need is the merge operation. It performs a "combine" operation on corresponding samples of the streams:

sampleStream1
  .merge(sampleStream2) { a, b -> a + b }
  .merge(sampleStream3) { a, b -> a + b }
  .merge(sampleStream4) { a, b -> a + b }

You need to define the + operation yourself (the code in the lambda can be anything; summing is just an example).

In your case it sounds like you need something that contains a "multi-channel sample" of some sort. So it would be convenient to define a new class, e.g. MultiSample, that encapsulates the sum operations Sample + Sample -> MultiSample and MultiSample + Sample -> MultiSample. At the end you will have a Stream<MultiSample>, where you can use, let's say, the map operation to perform further tweaking. And eventually, as you have a custom data type in the stream, define the output as a function.
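The MultiSample idea could be sketched in plain Kotlin like this. Note that MultiSample and downMix are hypothetical names from this discussion, not library types, and Sample is simplified here to a bare Double in [-1, 1]:

```kotlin
// Sketch only: "Sample" stands in for WaveBeans' Sample (a Double in [-1, 1]).
typealias Sample = Double

// Carries each track's sample separately until you decide how to down-mix.
data class MultiSample(val channels: List<Sample>) {
    // MultiSample + Sample -> MultiSample: append another track's sample
    operator fun plus(s: Sample) = MultiSample(channels + s)

    // Down-mix by averaging, which keeps the result within [-1, 1]
    fun downMix(): Sample = channels.sum() / channels.size
}

// Sample + Sample -> MultiSample, usable inside the first merge lambda
fun combine(a: Sample, b: Sample) = MultiSample(listOf(a, b))
```

Averaging in downMix is just one choice; any mixing strategy that keeps the result within [-1, 1] works.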

So your code would look like this:

sampleStream1 // Stream<Sample>
  .merge(sampleStream2) { a, b -> a + b } // Stream<Sample> + Stream<Sample> -> Stream<MultiSample>
  .merge(sampleStream3) { a, b -> a + b } // Stream<MultiSample> + Stream<Sample> -> Stream<MultiSample>
  .merge(sampleStream4) { a, b -> a + b } // Stream<MultiSample> + Stream<Sample> -> Stream<MultiSample>
  .map { multiSample -> multiSample.downMix() } // Stream<MultiSample> -> Stream<ReadyToSaveMultiSample>
  .out { ... } // Stream<ReadyToSaveMultiSample> -> Unit

NOTE: that only defines the stream; you still need to execute it.

One note here: I'm not 100% sure why you need a multi-sample here at all. If you just need to mix the multiple inputs together, you can do it right away in the merge operation by summing up Samples. Just don't forget to normalize the stream to make sure it fits within the [-1, 1] interval to avoid clipping. So the code would look even simpler, assuming the plus operation is defined over Stream<Sample>:

(sampleStream1 + sampleStream2 + sampleStream3 + sampleStream4)
   .normalize() // a custom function whose implementation is omitted here: fun Stream<Sample>.normalize(): Stream<Sample> = .....
   .out { ... }
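The normalize function is left undefined above. As a rough sketch of what it might do, here is peak normalization over a finite buffer of plain Double samples (a true streaming version can't know the global peak in advance, so in practice you'd normalize a window, do two passes, or pre-scale by the number of inputs):

```kotlin
import kotlin.math.abs

// Peak-normalization sketch over a finite buffer of samples in [-1, 1] space.
fun normalize(samples: List<Double>): List<Double> {
    val peak = samples.maxOfOrNull { abs(it) } ?: return samples
    if (peak <= 1.0) return samples  // already fits within [-1, 1]
    return samples.map { it / peak } // scale down so the peak lands at +/-1
}
```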

Let me know if that explains it, or if you need additional information.

> If you can direct me to an answer I'm happy to make a PR with an example in the docs. Thanks again

That would be awesome. Thank you!

You may consider contributing to the blog section if you feel like writing a short article. It is also published to the wavebeans.io website.

gad2103 commented 2 years ago

@asubb thanks for the thorough response! I think you are right in the second half of your reply. To be more specific, I'm getting raw WAV data (headerless) from some remote service calls. It comes back as byte arrays. Then I want to mix them into one byte array.

I will look into the definition of the stream type, but it sounds like what you're saying is that I can treat each byte array as a stream and then sum and normalize. But then, when I run the execution, can I get back the resulting mixed byte array?

asubb commented 2 years ago

@gad2103 What is the bit depth of the incoming signal? You may need to decode multiple bytes into one sample (a single byte is just an 8-bit signal, which is quite poor quality if we're speaking about usual music-like streams), and the order of the bytes matters (little endian vs. big endian). Here you can take a look at how the library reads all those bytes: https://github.com/WaveBeans/wavebeans/blob/develop/lib/src/main/kotlin/io/wavebeans/lib/io/WavInput.kt#L116 or, more specifically, https://github.com/WaveBeans/wavebeans/blob/develop/lib/src/main/kotlin/io/wavebeans/lib/io/ByteArrayLittleEndianDecoder.kt

Or even try to use that ByteArrayInput
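Conceptually, what that decoder does for a 16-bit little-endian signed PCM stream looks like this standalone sketch (this is not the library code, just an illustration of the byte-to-sample conversion):

```kotlin
// Sketch: decode 16-bit little-endian signed PCM bytes
// into Double samples in [-1, 1].
fun decode16BitLe(bytes: ByteArray): DoubleArray {
    val out = DoubleArray(bytes.size / 2)
    for (i in out.indices) {
        val lo = bytes[2 * i].toInt() and 0xFF // low byte first (little endian)
        val hi = bytes[2 * i + 1].toInt()      // high byte keeps its sign
        val value = (hi shl 8) or lo           // reassembled signed 16-bit value
        out[i] = value / 32768.0               // scale to [-1, 1)
    }
    return out
}
```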

As for the execution: if you want to get byte arrays back, it might be easier to execute the stream as a sequence:

(sampleStream1 + sampleStream2 + sampleStream3 + sampleStream4)
   .normalize()
   .asSequence(sampleRate) // that'll be Kotlin Sequence<Sample>

and then serialize each Sample to the desired medium, i.e. a byte array, keeping in mind how you need to convert it.
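That serialization step is just the inverse of the decoding. For example, for 16-bit little-endian signed PCM it could look like this sketch (plain Doubles standing in for Samples):

```kotlin
// Sketch: serialize Double samples in [-1, 1] back to
// 16-bit little-endian signed PCM bytes.
fun encode16BitLe(samples: DoubleArray): ByteArray {
    val out = ByteArray(samples.size * 2)
    for (i in samples.indices) {
        // clamp, then scale to the signed 16-bit range
        val value = (samples[i].coerceIn(-1.0, 1.0) * 32767).toInt()
        out[2 * i] = (value and 0xFF).toByte()            // low byte first
        out[2 * i + 1] = ((value shr 8) and 0xFF).toByte() // then high byte
    }
    return out
}
```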

If you tell me what you need to do with the output, I may provide further guidance.

gad2103 commented 2 years ago

Hi @asubb, the bit depth can vary. I sometimes need to mix 8-bit with 16-bit, depending on the source.

The use case is that I'm getting generated audio from a text-to-speech engine (usually mu-law, 8 kHz, 8-bit depth) and mixing other tracks with it, like mp3s or pcm_s16le, depending again on the source.

The output goes back to the telephone line; we use Twilio, and the format they accept is mu-law 8 kHz. Sorry if I sound like an idiot about this stuff. I'm new to all this audio codec stuff, so I'm just getting my bearings.

Does that help clarify what I'm trying to do?
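For the final mu-law step mentioned above, a sketch of the standard G.711 mu-law encoding of one 16-bit PCM sample could look like this (based on the classic algorithm with bias 0x84; verify against a reference implementation or a G.711 library before relying on it):

```kotlin
// Sketch of G.711 mu-law encoding: one signed 16-bit PCM sample -> one byte.
fun linearToMuLaw(pcm16: Int): Int {
    val bias = 0x84   // standard mu-law bias (132)
    val clip = 32635  // clip before biasing so the biased value fits 15 bits
    var pcm = pcm16
    val sign = if (pcm < 0) { pcm = -pcm; 0x80 } else 0
    if (pcm > clip) pcm = clip
    pcm += bias
    // exponent = how far the highest set bit sits above bit 7
    var exponent = 7
    var mask = 0x4000
    while (exponent > 0 && pcm and mask == 0) { exponent--; mask = mask shr 1 }
    val mantissa = (pcm shr (exponent + 3)) and 0x0F
    // mu-law bytes are stored inverted
    return (sign or (exponent shl 4) or mantissa).inv() and 0xFF
}
```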

gad2103 commented 2 years ago

Also, I don't want to read from files since I'm getting the data straight from a service. I'm also running in restricted environments where I don't have reliable file system access, so I'd like to take the byte arrays, convert them directly to streams, and then add them like you're demonstrating. Thanks again for being so responsive!

asubb commented 2 years ago

It clarifies a lot, thank you for this!

If you convert your input signal into Samples you won't need to think about the bit depth anymore; just create them appropriately. Here is more information: https://wavebeans.io/docs/api/#sample

To keep everything under the same umbrella, I would suggest using the input and output as functions and executing on a SingleThreadedOverseer, so your main application can perform further actions. Or, if you're going to have multiple outputs, it's better to run with the parallel strategy so it won't read the inputs multiple times. Though you would need to define them as classes, not as lambda functions, as you may have parameters and state; it'll also be easier to unit test them.
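Roughly, execution via an overseer looks like the sketch below. This is based on my reading of the execution docs, not tested code; `output` is assumed to be an already-defined stream output, and the exact signatures should be checked against the current documentation:

```kotlin
// Sketch only, per the WaveBeans execution docs; `output` is an
// assumed, previously defined stream output.
val overseer = SingleThreadedOverseer(listOf(output))
// eval() launches the topology at the given sample rate and returns
// futures to wait on; finished == true means it ran to completion.
val success = overseer.eval(44100.0f).all { it.get().finished }
overseer.close()
```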

Also, it sounds like your streams are at the very same sample rate, so it won't be a problem, but keep in mind there is a function for resampling, so you could merge streams with different sample rates, or resample the output: https://wavebeans.io/docs/api/operations/resample-operation.html
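For intuition only (the library's resample operation is the real tool), converting between rates boils down to re-reading the signal at new time positions. A naive linear-interpolation sketch over plain Doubles:

```kotlin
// Naive linear-interpolation resampler sketch, e.g. 16 kHz down to 8 kHz.
fun resampleLinear(input: DoubleArray, fromRate: Double, toRate: Double): DoubleArray {
    val out = DoubleArray((input.size * toRate / fromRate).toInt())
    for (i in out.indices) {
        val pos = i * fromRate / toRate               // position in the source
        val i0 = pos.toInt().coerceAtMost(input.size - 1)
        val i1 = (i0 + 1).coerceAtMost(input.size - 1)
        val frac = pos - i0
        out[i] = input[i0] * (1 - frac) + input[i1] * frac // interpolate
    }
    return out
}
```

Real resamplers low-pass filter first to avoid aliasing when downsampling; this sketch skips that entirely.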

The mp3 format is not supported at the moment, but I'm pretty sure there are a lot of decoding libraries out there.