Recording/Getting samples: interaction with CODAL for sound recording

jaustin commented 3 years ago

Transferred from the former private repo, from before V2 was announced. @jaustin commented on Mon Sep 21 2020

We'd like a few Python code examples of how the user could record samples from the microphone and then play them back in the style of a parrot (eg listen until the buffer is full, or there's a silence)

@dpgeorge was going to have a go at putting forward a few different sample models and @microbit-giles and @microbit-carlos could we please also contribute here things from our API prototyping?

@finneyj commented on Mon Sep 21 2020

Yep - sounds like a good probe.

@dpgeorge commented on Thu Sep 24 2020

Example 1: simple wait, record, stop, play.

from microbit import sleep, microphone, audio

sample_buf = bytearray(16000)
record_rate = 8000  # will record max 2 seconds
play_rate = 10000  # play a bit faster to warp the voice a bit

while True:
    # wait until sound is detected (we may miss recording the first bit...)
    while microphone.current_sound() != microphone.LOUD:
        sleep(1)

    # record into the given buffer, in the background
    microphone.record(sample_buf, rate=record_rate, wait=False)

    # record at least 100ms of sound
    sleep(100)

    # wait until it's quiet (recording will stop if it runs out of buffer space)
    while microphone.current_sound() != microphone.QUIET:
        sleep(1)

    # stop recording if it's still ongoing, and get total number of samples recorded
    num_samples = microphone.stop()

    # play samples (in the foreground, ie blocking until it's all played)
    audio.play(sample_buf, rate=play_rate, samples=num_samples)
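
As a rough check on the numbers in Example 1, here is a sketch in plain Python (it does not touch the microphone or the proposed API). It assumes one byte per sample, which is what a `bytearray(16000)` holding 2 seconds at 8000 samples per second implies. The helper name `playback_seconds` is made up for illustration.

```python
def playback_seconds(num_samples, play_rate):
    """Time needed to play num_samples at play_rate samples per second."""
    return num_samples / play_rate


buffer_size = 16000   # bytes, assuming one byte per sample
record_rate = 8000    # samples per second while recording
play_rate = 10000     # samples per second while playing

record_time = buffer_size / record_rate               # longest possible recording
play_time = playback_seconds(buffer_size, play_rate)  # time to play a full buffer
speed_factor = play_rate / record_rate                # how much faster playback is

print(record_time, play_time, speed_factor)
```

So a full 2-second recording would play back in 1.6 seconds, which is where the "warped" voice comes from.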

Example 2: recording via the audio object instead of microphone (to support an external mic?), and the record function has the ability to stop on a given event.

from microbit import sleep, microphone, audio

sample_buf = bytearray(16000)
record_rate = 8000  # will record max 2 seconds
play_rate = 10000  # play a bit faster to warp the voice a bit

while True:
    # wait until sound is detected (we may miss recording the first bit...)
    while microphone.current_sound() != microphone.LOUD:
        sleep(1)

    # record the sound until the given event, or the buffer is full
    num_samples = audio.record_until(sample_buf, microphone.QUIET, rate=record_rate)

    # play back the samples
    audio.play(sample_buf, rate=play_rate, samples=num_samples)

Example 3: attempt to keep some samples before the loud event so the first part of the recording is not cut off.

from microbit import sleep, microphone, audio

sample_buf = bytearray(16000)
record_rate = 8000  # will record max 2 seconds
play_rate = 10000  # play a bit faster to warp the voice a bit

while True:
    # start pre-recording into the buffer in a loop, in the background
    microphone.prerecord(sample_buf, rate=record_rate)

    # wait until sound is detected (while we are recording)
    while microphone.current_sound() != microphone.LOUD:
        sleep(1)

    # switch from pre-recording to normal recording, but keeping the first 100ms of samples
    microphone.record(keep=int(record_rate * 0.1))

    # record at least 100ms of extra sound
    sleep(100)

    # wait until it's quiet (recording will stop if it runs out of buffer space)
    while microphone.current_sound() != microphone.QUIET:
        sleep(1)

    # stop recording if it's still ongoing, and get total number of samples recorded
    num_samples = microphone.stop()

    # play samples (in the foreground, ie blocking until it's all played)
    audio.play(sample_buf, rate=play_rate, samples=num_samples)
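
One way to picture the pre-recording idea in Example 3 is a circular buffer that always holds the most recent samples, so the start of the sound is already captured when the loud event fires. The following is only a sketch in plain Python with simulated samples; `PreRecorder`, `push` and `last` are hypothetical names, not part of any proposed API.

```python
class PreRecorder:
    """Keeps the most recent `capacity` samples in a circular buffer."""

    def __init__(self, capacity):
        self.buf = bytearray(capacity)
        self.capacity = capacity
        self.write_pos = 0   # next index to overwrite
        self.count = 0       # number of valid samples, at most capacity

    def push(self, sample):
        """Store one sample, overwriting the oldest one when full."""
        self.buf[self.write_pos] = sample
        self.write_pos = (self.write_pos + 1) % self.capacity
        self.count = min(self.count + 1, self.capacity)

    def last(self, n):
        """Return the newest n samples in chronological order."""
        n = min(n, self.count)
        start = (self.write_pos - n) % self.capacity
        return bytes(self.buf[(start + i) % self.capacity] for i in range(n))


record_rate = 8000
keep = record_rate // 10      # 100 ms worth of samples, as an integer
pre = PreRecorder(keep)

# Simulate 2000 incoming samples arriving before the "loud" event
for i in range(2000):
    pre.push(i % 256)

head = pre.last(keep)         # samples that would be kept when recording starts
print(len(head))
```

Only the last 100 ms survive, whatever the length of the wait for a loud sound, so the memory cost of pre-recording stays fixed.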

@dpgeorge commented on Thu Sep 24 2020

Example 4: Removing the concept of sample rate and samples.

from microbit import sleep, microphone, audio

sample_buf = audio.RecordingBuffer(2.0)

while True:
    # wait until sound is detected
    while microphone.current_sound() != microphone.LOUD:
        sleep(1)

    # record into the given buffer, in the background, keeping 100ms of past recording
    microphone.record_into(sample_buf, past_recording=0.1, wait=False)

    # record at least 100ms of sound
    sleep(100)

    # wait until it's quiet (recording will stop if it runs out of buffer space)
    while microphone.current_sound() != microphone.QUIET:
        sleep(1)

    # stop recording if it's still ongoing, and zero out any remaining sample bytes
    microphone.stop()

    # play samples (in the foreground, ie blocking until it's all played)
    audio.play(sample_buf, rate_mult=1.2)

Example 5: allocating the recording buffer automatically.

from microbit import sleep, microphone, audio

while True:
    # wait until sound is detected
    while microphone.current_sound() != microphone.LOUD:
        sleep(1)

    # allocate a 2-second buffer and record into it in the background, keeping 100ms of past recording
    sample_buf = microphone.record_for(2.0, past_recording=0.1)

    # wait until it's quiet before playing
    while microphone.current_sound() != microphone.QUIET:
        sleep(1)

    audio.apply_effect(sample_buf, audio.EchoEffect(delay=0.1, volume=0.8))

    # play samples (in the foreground, ie blocking until it's all played)
    audio.play(sample_buf, effect=audio.EchoEffect(delay=0.1, volume=0.8))

AllegroGavin commented 3 years ago

These are awesome!

microbit-giles commented 3 years ago

Lots to digest here! My preference is always to keep things as simple as possible, but my audio background is pulling me to want to keep the concept of sample rate, if not the samples.

Do we know if altering the sample rate will produce clearly audible differences that are still useful / intelligible in some way? The ability to change sample rate is only useful if you can hear a difference and learn that you can make a trade-off between recording length and quality.
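
To make the length/quality trade-off concrete, here is a small sketch in plain Python. It assumes a fixed 16000-byte buffer with one byte per sample, as in the examples above, and uses the Nyquist limit (half the sample rate) as a rough proxy for quality. The rates listed are illustrative only, not rates known to be supported by the hardware.

```python
BUFFER_BYTES = 16000  # fixed buffer size, one byte per sample (assumption)


def max_seconds(rate, buffer_bytes=BUFFER_BYTES):
    """Longest recording that fits in the buffer at the given sample rate."""
    return buffer_bytes / rate


def highest_frequency(rate):
    """Nyquist limit: the highest frequency a given sample rate can represent."""
    return rate / 2


for rate in (4000, 8000, 16000):
    print(rate, max_seconds(rate), highest_frequency(rate))
```

Halving the rate doubles the recording time but also halves the highest frequency that can be captured, which is the difference a listener would need to be able to hear.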

dpgeorge commented 3 years ago

keep the concept of sample rate, if not the samples

I agree, it's probably going to be necessary to adjust the recording sample rate in certain cases. Building on example 4 from above, this could be done via optional arguments to construct a RecordingBuffer:

sample_buf = audio.RecordingBuffer(duration=2.0, rate=8000)

microbit-carlos commented 1 year ago

Based on the multiple options discussed above I think there are a couple of things I'd like to highlight:

1. I'd probably avoid adding new arguments to existing APIs that are only applicable to playback and recording
   - i.e. adding arguments to audio.play(...) that would not work with other sound types, like MicrobitSound, MicrobitSoundEffect or AudioFrame.
   - This is to slightly reduce the complexity of internally checking what arguments are passed to a function, but most importantly to avoid having to either fail silently or throw an exception
     - So, it's better to reduce the ways a user can make mistakes
2. If 1. is applied, then we would have to either:
   - i. Create a new API to play the recording, so that these additional arguments are always valid
   - ii. Pack the data with the buffer, so that audio.play() can access things like sampling rate from the buffer class instead of additional arguments
     - I prefer option ii. as it continues to unify the usage of audio.play()
3. To begin with, I wouldn't worry too much about pre-caching some of the audio buffer before the recording "officially" starts. We can work out the right API first and then add that later if testing shows it's necessary

First proposal: A new buffer type

With that in mind, I think I would propose to create a new buffer class. We can call it AudioBuffer for now, the name is open for suggestion, but I don't think it should be tied to playback and recording specifically, as it could also be filled as an array/list and contain additional parameters, like sampling rate.

my_buffer = audio.AudioBuffer(samples=16000, rate=8000)
microphone.record(my_buffer)
audio.play(my_buffer)

To modify the playback sampling rate, we could modify its attribute:

# Play at half the speed
my_buffer.rate = my_buffer.rate // 2
audio.play(my_buffer)
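
A minimal sketch of how packing the rate with the buffer could behave, written in plain Python so it runs off-device. The `AudioBuffer` class and `play()` function here are stand-ins for the proposed `audio.AudioBuffer` and `audio.play()`, and the one-byte-per-sample layout and the `duration_ms()` helper are assumptions made for illustration.

```python
class AudioBuffer:
    """Hypothetical stand-in for the proposed audio.AudioBuffer."""

    def __init__(self, samples, rate):
        self.data = bytearray(samples)  # one byte per sample (assumption)
        self.rate = rate                # samples per second, user-modifiable

    def __len__(self):
        return len(self.data)

    def duration_ms(self):
        """Playback time at the buffer's current rate, in milliseconds."""
        return len(self.data) * 1000 // self.rate


def play(buf):
    """Stand-in for audio.play(): reports how long playback would take."""
    return buf.duration_ms()


my_buffer = AudioBuffer(samples=16000, rate=8000)
normal = play(my_buffer)

# Play at half the speed by changing the attribute, not the play() arguments
my_buffer.rate = my_buffer.rate // 2
slow = play(my_buffer)

print(normal, slow)
```

Because the rate travels with the buffer, `play()` needs no recording-specific arguments, which is the point of option ii. above.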

A couple of questions:

Should we always require the user to create a buffer first, or should we also allow microphone.record() to return a newly created buffer?
```
my_buffer = audio.AudioBuffer(samples=16000, rate=8000)
microphone.record(my_buffer)
# Vs
my_buffer = microphone.record(samples=16000, rate=8000)
```
- We could support both options, but it's a bit odd having the duplication of arguments for both the AudioBuffer and microphone.record()

Should we use "time" instead of "samples"?

my_buffer = audio.AudioBuffer(ms=2000, rate=8000)
other_buffer = microphone.record(duration=2000, rate=8000)

Would that be better as duration? ms? something else?

Second proposal: microphone.record() to be able to both create or use an AudioBuffer

Setting a duration value for a recording makes sense, but it's probably a less common unit type for a buffer (especially if the data can be read or written like an array). One option could be to let the AudioBuffer deal only with the number of samples and the rate, while having the option to record for a specific amount of time via microphone.record().

# Returns a new buffer for 2 seconds at 8k sampling rate, so 16k samples
my_buffer = microphone.record(duration=2000, rate=8000)
# By having a reasonable default sampling rate this two-liner works well
my_buffer = microphone.record(duration=2000)
audio.play(my_buffer)

# The user can still create their buffer first and record until the buffer is full
other_buffer = audio.AudioBuffer(samples=16000, rate=8000)
same_buffer = microphone.record(other_buffer)
# record() should return the reference to other_buffer, which is unnecessary, but that way it matches the return signature from the previous example 

# An advantage of creating your own buffer is that you can also record only a portion of the buffer
other_buffer = audio.AudioBuffer(samples=16000, rate=8000)
microphone.record(other_buffer, duration=500)
# Or if the argument would be called `ms` instead of `duration`
microphone.record(other_buffer, ms=500)

# And changing the sampling rate at the point where we record can still be valid
microphone.record(my_buffer, duration=8000, rate=2000)
# We don't even need to know how much time would fit, we could record until full with a lower sampling rate
microphone.record(my_buffer, rate=my_buffer.rate // 2)
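
The argument handling this proposal implies could be sketched roughly as follows. This is plain Python with placeholder samples instead of a microphone; `DEFAULT_RATE`, the `recorded` attribute and the error raised when neither a buffer nor a duration is given are all assumptions, since the thread does not settle them.

```python
DEFAULT_RATE = 8000  # assumed default; the thread does not fix a value


class AudioBuffer:
    """Hypothetical buffer: sample storage plus a sampling rate."""

    def __init__(self, samples, rate=DEFAULT_RATE):
        self.data = bytearray(samples)
        self.rate = rate
        self.recorded = 0  # samples filled by the last recording (assumption)


def record(buffer=None, duration=None, rate=None):
    """Sketch of microphone.record(); duration is in milliseconds."""
    if buffer is None:
        # No buffer given: create one sized from duration and rate
        if duration is None:
            raise ValueError("need a buffer or a duration")
        rate = rate or DEFAULT_RATE
        buffer = AudioBuffer(duration * rate // 1000, rate)
    elif rate is not None:
        # Existing buffer: recording at a new rate updates the buffer's rate
        buffer.rate = rate
    # Fill the whole buffer unless duration asks for less
    wanted = len(buffer.data)
    if duration is not None:
        wanted = min(wanted, duration * buffer.rate // 1000)
    for i in range(wanted):
        buffer.data[i] = 128  # placeholder for a real microphone sample
    buffer.recorded = wanted
    return buffer


new_buf = record(duration=2000, rate=8000)
own_buf = AudioBuffer(samples=16000, rate=8000)
same_buf = record(own_buf, duration=500)
print(len(new_buf.data), same_buf is own_buf, own_buf.recorded)
```

Returning the buffer in both cases keeps the return type the same whether or not the caller supplied one.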

And being able to record only a portion of the buffer brings me to the third proposal.

Third proposal: microphone.record() to be able to record at a buffer offset

This proposal is about having an offset parameter in the microphone.record() method so that a single buffer could be filled via multiple recordings.

This is not necessarily tied to using the duration/ms argument type of microphone.record(), but for continuity the snippets here follow the second proposal.

For this to work effectively we need a way to keep track of where in the buffer the last recording stopped. Normally this could be the return value from microphone.record(), but if we'd like this method to be able to create and return a new buffer, then we'd have to either return a tuple, or find a different way.

The AudioBuffer class could track its index (or audio.play() could update this value), so that the next recording could continue where the previous one left off. The default value for offset should be zero, as this is an advanced feature and simpler programmes that constantly overwrite a single buffer (for example, press A to record, and B to play back) should work as expected with the default values. Not 100% sure what the "continue where you left off" value should be. offset=None would be the simplest option, but it's not intuitive for this use case. Maybe offset="auto"? Very open to suggestions.

my_buffer = microphone.record(samples=16000, rate=8000)

# We could start a new recording where we left off, or select the position explicitly
microphone.record(my_buffer, duration=500)
microphone.record(my_buffer, duration=500)                 # Overwrites the first 500ms again
microphone.record(my_buffer, duration=500, offset="auto")  # Continues where the previous left off
microphone.record(my_buffer, duration=1000, offset=8000)   # The exact sample could be inputted directly
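
The offset bookkeeping could work along these lines. Again this is a plain-Python sketch with placeholder samples; the `index` attribute on the buffer and the `"auto"` sentinel are the hypothetical choices floated above, not a settled API.

```python
class AudioBuffer:
    """Hypothetical buffer that remembers where the last recording stopped."""

    def __init__(self, samples, rate):
        self.data = bytearray(samples)
        self.rate = rate
        self.index = 0  # sample position where the last recording stopped


def record(buffer, duration, offset=0):
    """Record `duration` ms into buffer, starting at `offset` (in samples).

    offset="auto" continues from where the previous recording stopped.
    """
    start = buffer.index if offset == "auto" else offset
    wanted = duration * buffer.rate // 1000
    end = min(start + wanted, len(buffer.data))  # never run past the buffer
    for i in range(start, end):
        buffer.data[i] = 128  # placeholder for a real microphone sample
    buffer.index = end
    return buffer


buf = AudioBuffer(samples=16000, rate=8000)
record(buf, duration=500)                  # fills the first 500 ms
first_stop = buf.index
record(buf, duration=500)                  # overwrites the first 500 ms again
record(buf, duration=500, offset="auto")   # continues after the first 500 ms
second_stop = buf.index
record(buf, duration=1000, offset=8000)    # starts at an explicit sample
print(first_stop, second_stop, buf.index)
```

With the index stored on the buffer, `record()` can keep returning the buffer itself rather than a tuple.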

microbit-carlos commented 1 year ago

A few quick points discussed in a call that I will expand next week:

If it's possible to change the playback sampling rate on the fly for everything played by audio.play(), we can then add that as a function parameter, along with an audio.set_rate() function
We can drop the microphone.record() offset argument, as being able to create multiple AudioBuffers fulfils the same purpose
Having microphone.record() return a newly created buffer is a good idea, but instead of having the same function either take a buffer or create one, we can also have a microphone.record_into(my_buffer) function
If we keep the AudioBuffer type, then having matching arguments for the buffer and the microphone.record() keeps things simple and symmetrical. We can also add a duration parameter to microphone.record().
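
A sketch of how that split might look, with `record()` built on top of `record_into()` and taking the same arguments as the buffer constructor. This is plain Python with placeholder samples; the constructor signature (`duration` in milliseconds plus `rate`) is an assumption, as the notes above do not fix it.

```python
class AudioBuffer:
    """Hypothetical buffer sized from a duration (ms) and a sampling rate."""

    def __init__(self, duration, rate):
        self.rate = rate
        self.data = bytearray(duration * rate // 1000)


def record_into(buffer):
    """Fill an existing buffer; placeholder samples stand in for the mic."""
    for i in range(len(buffer.data)):
        buffer.data[i] = 128
    return buffer


def record(duration, rate):
    """Create a buffer with the same arguments AudioBuffer takes, then fill it."""
    return record_into(AudioBuffer(duration, rate))


fresh = record(duration=2000, rate=8000)        # new buffer each call
reused = AudioBuffer(duration=500, rate=8000)   # allocated once by the user
record_into(reused)
print(len(fresh.data), len(reused.data))
```

Each function then has a single job, so neither has to check which combination of arguments it was given.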

microbit-carlos commented 1 month ago

We can close this issue, as specific discussions about the implementation have been carried out in other issues/PRs.

microbit-foundation / micropython-microbit-v2

Recording/Getting samples: interaction with CODAL for sound recording #49