pybricks / support

Pybricks support and general discussion
MIT License
107 stars 6 forks source link

[Feature] Some Speaker methods are not implemented #575

Open lobodpav opened 2 years ago

lobodpav commented 2 years ago

Describe the bug The following speaker functions do not work on Mindstorms Inventor:

  1. play_file
  2. say
  3. set_speech_options
  4. set_volume

Once called, an error is thrown:

AttributeError: 'Speaker' object has no attribute 'set_volume'

To reproduce Run the following program:

from pybricks.hubs import InventorHub

speaker = InventorHub().speaker
speaker.play_file("test")
speaker.say("Hello")
speaker.set_speech_options()
speaker.set_volume(10)

Expected behaviour Not sure how the say function should work. However, I would expect the following to work:

  1. Speaker volume should be adjustable.
  2. In-built sounds should be playable.

Question I assume custom sounds are not yet supported. Am I correct?

laurensvalk commented 2 years ago

We only support the documented methods, so beep and play_notes.

The other methods are not (yet) implemented.

Since our main editor does not have auto complete yet, we haven't prioritized getting the stubs/typing API to be 100% correct, so this is why some of the methods from EV3 (where they are implemented already) are shining through.

lobodpav commented 2 years ago

Could I implement at least the set_volume? Would you guide me on where to start?

dlech commented 2 years ago

In the firmware, the set_volume method needs to be added to pb_type_speaker.c. Compared to the EV3 version, we will leave out the which parameter since the new hubs don't have a true beep mode like EV3.

The volume value should be coerced to a value between 0-100 (pb_obj_get_pct()) and can just be saved to a new field in pb_type_Speaker_obj_t. Then this volume needs to be applied when creating the beep waveform. We also need to take into account the perceived volume vs. amplitude of the waveform which is roughly logarithmic.

So a simple and efficient way to do this might be to quantize the volume level to 12 levels and then bit shift the data to the right this number of bits (the DAC is 12-bits).

lobodpav commented 2 years ago

Hmm, that sounds like a 6 months' project to me 😄 Mainly due to my lack of expertise in the sound-processing area.

dlech commented 2 years ago

It's really not that bad. But I have doubts to how well this method of volume control would work anyway since it essentially lowers the sound quality as it lowers the volume. So it might be best if we just don't implement it.

As for the API for code completion. I would suggest copying the EV3 version of the class to a new _ev3dev module so that it can be used with the EV3 brick class and modify the Speaker class in the _common module to reflect what is actually implemented on Powered Up devices.

lobodpav commented 2 years ago

The Lego Mindstorms App implements volume changing blocks as well as Python code and it makes sense to me - sometimes I want the notes/beeps louder and sometimes quieter to give the Robots some emotions.

I suppose you meant the sound quality lowering for the in-built sounds, right? When it comes to notes/beeps, this is just a simple square wave in the implementation whose quality should not get affected by the amplitude coercion.

Since the Inventor Hub supports beeps/notes only, I would give it a try. Hope I'll manage to flash the amended FW onto my hub 😊

As for the DOCs, that makes perfect sense.

lobodpav commented 2 years ago

Implemented a first attempt to set the volume level in the pb_type_speaker.c 😊

Wondering, is there a way to build the FW on macOS?

dlech commented 2 years ago

From Homebrew you will need to install dfu-util or libusb to make flashing the firmware work. And brew install --cask gcc-arm-bembedded to install the cross-compiler.

And I assume poetry is already installed from your work on pybricksdev. Use it to install the Python packages required for development as in https://github.com/pybricks/pybricks-micropython/blob/master/CONTRIBUTING.md#set-up-the-python-environment. Then be sure to run all of the rest of the commands in the poetry shell.

make mpy-cross -j8
make -C bricks/primehub -j8
# here, turn off the hub, remove the USB cable then press and hold the Bluetooth
# button and insert the USB cable while holding the button. don't let go until the
# Bluetooth light starts flashing red/green/blue
make -C bricks/primehub -j8 deploy-dfu
lobodpav commented 2 years ago

@dlech The firmware build went well, thanks! A few more questions though:

  1. How is the Python API created for the C code? Is it auto-generated?
  2. Is there a way to test my new methods prior to deploying the firmware? Turning the device to DFU and flashing is a bit tedious when done several times in a row.
  3. With a USB cable connected, the Hub can't be turned off. The light gives 2x orange, 1x blue flashes. What does that mean?
  4. I can see the Hub in the macOS USB System Information and also the https://dfu.pybricks.com is able to connect to it. However, the deploy-dfu target failes to connect to the Hub because the bricks/primehub/Makefile refers to DFU_PID = 0x0008. When Listing my USB devices, the hub has idProduct: 0x0011. When changed to DFU_PID = 0x0011, the deployment works.
  5. I'm a bit lost in the C code structure when it comes to code specific to certain devices. Using the speaker as an example, there are two types: pybricks/common/pb_type_speaker.c and bricks/ev3dev/pb_type_ev3dev_speaker.c.
    1. Is the former a common type and the latter is EV3 specific?
    2. The other Lego hubs don't have speakers, do they? I would expect the former Speaker to be in a Prime/inventor specific directory or have the device name in the type name the same way ev3 has (i.e. being called pybricks/prime/pb_type_prime_speaker.c).
    3. The EV3 speaker is in bricks directory whereas the Prime one is in pybricks. Does that have any specific reason?
  6. The waveform_data in te pb_type_speaker.c are of uint16_t type and square wave is generated using 65,535 value. However, you've mentioned that the DAC is 12-bit which limits the range to 0..4095 if I'm not mistaken.
    1. If I got the DAC PCM encoding right, the waveform contains the amplitude of the signal. For volumes between 0..100, we could just set the waveform amplitude of the square wave to 4095 * volume / 100.
    2. You've mentioned shifting the bits to the right using the 12-bit volume scale. Was there any reason for not multiplying the waveform values by volume / 100? This way we would achieve precise volume control.
lobodpav commented 2 years ago

As per the documentation, a first attempt is here: https://github.com/pybricks/pybricks-api/pull/91

dlech commented 2 years ago
  1. How is the Python API created for the C code? Is it auto-generated?

It actually works much like regular Python. A type (class) is basically a lookup table of names (of methods, attributes, etc.). It is defined here. The (non-special method) members of the type are defined here. A MicroPython object wrapper is created for each C function that gets exposed as a method like this. In Pybricks, we pretty much always allow keyword arguments for MicroPython functions, so we have a special macro to define the Python arguments like this.

2. Is there a way to test my new methods prior to deploying the firmware?

If you want to test that a function gives the correct output, you can write a unit test for it (see lib/pbio/test/). But since the speaker requires actual hardware on the device, unit tests can only get you so far.

3. With a USB cable connected, the Hub can't be turned off. The light gives 2x orange, 1x blue flashes. What does that mean?

Flashing orange means low battery. Flashing blue means Bluetooth is advertising waiting for a connection. So both of these are happening at the same time.

4. When Listing my USB devices, the hub has idProduct: 0x0011. When changed to DFU_PID = 0x0011, the deployment works.

I forgot about that part. Instead of modifying the file, you can just add DFU_PID=0x0011 after deploy-dfu in the make command. I was never able to figure out a way to make this happen automatically since we have two different devices that use exactly the same firmware.

  1. Is the former a common type and the latter is EV3 specific?

Correct.

2. The other Lego hubs don't have speakers, do they?

They do not. (Although NXT brick does.)

2. Speaker to be in a Prime/inventor specific directory or have the device name in the type name the same way ev3 has

The EV3 is a special case since it runs Linux instead of running Pybricks on bare metal. The "common" Speaker will be shared with NXT and any other future hubs that have a speaker.

3. The EV3 speaker is in bricks directory whereas the Prime one is in pybricks. Does that have any specific reason?

Because it is a special case.

6. The waveform_data in te pb_type_speaker.c are of uint16_t type and square wave is generated using 65,535 value. However, you've mentioned that the DAC is 12-bit which limits the range to 0..4095 if I'm not mistaken.

The hardware only looks at the 12 most significant bits. The last 4 bits are ignored.

  1. If I got the DAC PCM encoding right, the waveform contains the amplitude of the signal. For volumes between 0..100, we could just set the waveform amplitude of the square wave to 4095 * volume / 100

We could, but as I mentioned, what our ears perceive as loudness is not linearly proportional to amplitude, it is logarithmic. You can test this and see what you think for yourself, but I have a feeling it won't give satisfactory results. But I'm OK with being wrong about this :wink:.

2. You've mentioned shifting the bits to the right using the 12-bit volume scale. Was there any reason for not multiplying the waveform values by volume / 100? This way we would achieve precise volume control.

Since we have to apply this operation to every single sample during playback of an arbitrary sound file, I would prefer to use >> instead of / if we can since division is a computationally expensive operation. For example, we could scale the 0-100 volume level to 0-127 one time and then >> 7 instead of / 127 for each sample.

lobodpav commented 2 years ago

I was never able to figure out a way to make this happen automatically since we have two different devices that use exactly the same firmware.

Wondering, why do we even need to specify the DFU_VID and DFU_PID in the prime/Makefile? The other Hubs do not specify these (except for essentilalhub) and the pydfu.py has a filter for devices in DFU mode anyway. Once I removed the DFU_PID from the Makefile, the deployment worked nicely.

laurensvalk commented 2 years ago

There's no need to specify it if you run:

make -C bricks/primehub -j8 deploy

This will call Pybricksdev, which takes care of all that automatically. It also checks that you have attached the right hub.

In fact, this even works if the hub is not in DFU mode and running the stock firmware.

dlech commented 2 years ago

The hubs without those don't have USB. If we remove the check, then it could be possible to install the wrong firmware on a device.

lobodpav commented 2 years ago

The hubs without those don't have USB. If we remove the check, then it could be possible to install the wrong firmware on a device.

Do you mean a situation where someone would have a cityhub connected over Bluetooth for example, and he would run make -C bricks/primehub -j8 deploy?

lobodpav commented 2 years ago

@dlech Would you mind a little advice? I tried to use sin function from <math.h> in the pb_type_speaker.c and added LDFLAGS += -lm into the Makefile.

The build fails though: arm-none-eabi-ld: cannot find -lm.

dlech commented 2 years ago

We are using the MicroPython implementation of the math library so no -lm is needed.

But, in general, we would like to avoid floating point math if possible.

lobodpav commented 2 years ago

We are using the MicroPython implementation of the math library so no -lm is needed.

But, in general, we would like to avoid floating point math if possible.

Tried to generate sine wave using <math.h>. But without -lm the build fails on unknown sin function 😞

laurensvalk commented 2 years ago

What about sinf?

There's also fix16_sin from the libfixmath library included in our project.

laurensvalk commented 2 years ago

Also, pybricks.common.IMU uses floating point trigonometry. It uses atan2f, for example, and it compiles if using sinf too.

So perhaps you can check/copy what the includes should be to make this work.

laurensvalk commented 2 years ago

If you can, it's probably best to leave comments in place instead of deleting them. Makes the discussion a bit easier to follow :)

You can always edit them if you want to make a correction.

lobodpav commented 2 years ago

Sorry, will update the comments next time rather than deleting them.

I made that comment, and then realised it was my mistake because I had math.h import missing, so I deleted the comment just before you replied 😞.

laurensvalk commented 2 years ago

No worries. We're glad that people like you are building the source code and even making contributions, so we're happy to help :)

lobodpav commented 2 years ago

@dlech Would you mind having a look? The PR is in Draft until I polish it a bit, but so far the testing went great.

I've implemented exponential amplitude adjustment for the square wave (sine wave is commented out), and it works nicely. I.e. the volume setting seems linear from my ears' perspective.

What's interesting is that for Sine wave, linear amplitude adjustments seems much better.

lobodpav commented 2 years ago

The get/set volume is sorted now, only the play_file remains to get implemented. Wondering how to incorporate the sounds official Lego FW has in-built. And if that would be legal.

dlech commented 2 years ago

First, we need proper file system support.

Using sounds from LEGO would most likely be in violation of their EULA.

dlech commented 2 years ago

Getting ahead here, but I just did a quick test to see how expensive it is in terms of CPU time to apply signed to unsigned conversion and volume adjustment to every single sample in the audio data.

int16_t data_in[8000];
uint16_t data_out[8000];

...

        uint32_t start = mp_hal_ticks_us();
        uint16_t att = self->sample_attenuator;
        for (int i = 0; i < 8000; i++) {
            data_out[i] = (data_in[i] - INT16_MIN) * att >> 16;
        }
        uint32_t end = mp_hal_ticks_us();

        return MP_OBJ_NEW_SMALL_INT(end - start);

If our microsecond implementation can be trusted, this takes 918 microseconds. Most audio files should have a sample rate of 48000 Hz or less. So this might take 7ms of CPU time for every 1s of audio, which doesn't seem bad at all.

However, this is over 90% of our 1ms tick time, so when we go to implement this for real, we should probably process the audio data in 1 or 2K chunks.

This means we could just use standard .wav files and not have to have any special file conversion utility. If we allow stereo audio files, there will be slightly more CPU overhead (data_in[i * n_channels] instead of data_in[i]).

lobodpav commented 2 years ago

However, this is over 90% of our 1ms tick time, so when we go to implement this for real, we should probably process the audio data in 1 or 2K chunks.

Did I get it right that by the 1 to 2k chunks you want to decrease the usage per cent of the 1ms tick time? I.e. instead of the 90% utilisation for the 8k chunks in the example, you would achieve 11.25% utilisation for 1k chunks?

If we allow stereo audio files

Does any hub feature stereo speakers? If not, we could play the left channel only, for example.

data_out[i] = (data_in[i] - INT16_MIN) * att >> 16;

A nice analysis, and an effective algorithm, by the way! Took me some while to get it 😊 (I'm not used to low-level number operations on a daily basis)

dlech commented 2 years ago

Did I get it right that by the 1 to 2k chunks you want to decrease the usage per cent of the 1ms tick time?

Yes, we also need CPU cycles to service other time-sensitive things like Bluetooth, motors, etc.

Does any hub feature stereo speakers?

No.

dmusican commented 1 month ago

Just curious: what's the status of implementing the ability to play an actual sound file? It's the #1 FAQ I get from students that has an answer of "pybricks can't do it, sorry!"