andrewrk / libgroove

streaming audio processing library
http://andrewrk.github.io/libgroove/
MIT License

Use of Libgroove in Volumio? #88

Closed andrewrk closed 9 years ago

ning-yu commented 9 years ago

Hi Andrew, My name is Steven Ning, I'm one of the developers for the Volumio project, a high fidelity audio system for Raspberry Pi and other tiny boxes (such as UDOO, BBB, etc).

Volumio currently uses a custom compiled version of MPD for playback of local audio files. There are a number of idiosyncrasies with MPD, however, and for the next version of Volumio, we're considering other player options such as Libgroove.

I'm a fan of Groove Player, and am currently evaluating Libgroove as a replacement for MPD in Volumio 2.0. Is Libgroove able to perform bit perfect playback of high resolution audio files?

Thanks!

andrewrk commented 9 years ago

Hi Steven,

The use case you have outlined is one that libgroove wants to solve. It might take a couple of patches to meet your "bit perfect" requirement, but I'm willing to make that happen.

Let me understand the bit perfect requirement in detail with a few questions:

To explain that second one a bit more, let's say that the hardware playback device supports sample rates 48,000 and 96,000 and sample formats 24-bit integer and 32-bit integer. And now we have these songs queued up:

In the ideal scenario, libgroove would open a playback stream to the hardware with 24-bit integer, 48,000 Hz, and play song 1. Then it closes the playback stream and opens a new one with 32-bit integer, 96,000 Hz, and does DSP to resample the audio to fit the playback rate of the sound card.

Is that right?

Now, what happens when song 1 and song 2 are from the same album and they are supposed to have gapless playback? Closing the hardware playback stream and opening it again with different parameters would cause an audible glitch.

Would it be acceptable for libgroove to open the hardware playback device with sufficiently high quality parameters such as 32-bit integer and 96,000 Hz sample rate, and then resample all audio to fit that format?

ning-yu commented 9 years ago

Andrew, Awesome, I'd like to help out wherever I can! I don't know as much about audio quality as the other Volumio devs, but am learning as I go. Feel free to enlighten me if something doesn't sound right!

The general goal is to render as precisely as possible the PCM data that an audio file encodes. If this is not possible, due to hardware limitation or user action (such as use of software mixer, replay gain, specified downsampling), Volumio would ideally notify the user. I think that the typical user does not use replaygain, and performs volume control at the hardware level to avoid the quality loss associated with PCM rescaling.

If two songs with different sample rates or bitrates were played back to back, it would be most straightforward to open playback for each track at the format it specifies. For gapless albums, most people rip the tracks in the same format. If two tracks come from different sources and have different formats, however, I'm sure the user would understand if there was a bit of a gap between them.

As far as resampling goes, would it be through SoX or other library, or do you envision a custom algorithm?

Thanks, Steven

andrewrk commented 9 years ago

> The general goal is to render as precisely as possible the PCM data that an audio file encodes. If this is not possible, due to hardware limitation or user action (such as use of software mixer, replay gain, specified downsampling), Volumio would ideally notify the user.

This should be no problem. libgroove already exposes the target playback format and the actual playback format.

> I think that the typical user does not use replaygain, and performs volume control at the hardware level to avoid the quality loss associated with PCM rescaling.

Let me propose what I think is a better strategy:

The music player applies a "pregain" to the output to make it quieter. This gives us headroom to turn up the volume without affecting quality. Next, we pre-scan all songs to figure out how loud they are. Finally, the music player adjusts this "pregain" volume up or down depending on how loud the song is, so that the user does not find themselves constantly adjusting the volume knob. If a song is very quiet and needs to be turned up past the pregain volume, there are 2 ways to resolve this: 1. use a dynamic compression filter (a big no-no if you're aiming for bit perfect playback) or 2. let the song play at full volume with no distortion, which the user might think is too quiet.

See this video for more details: Loudness Zen
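To make the pregain idea concrete, here is a minimal sketch of the arithmetic. The function names and the example decibel values are hypothetical, not part of libgroove; it only illustrates option 2 above, where a quiet track is never amplified past unity gain.

```javascript
// Convert a gain in decibels to a linear amplitude multiplier.
function dbToLinear(db) {
  return Math.pow(10, db / 20);
}

// Hypothetical helper: choose the playback gain (in dB) for one track.
//   pregainDb: headroom applied to everything, e.g. -6 (assumed value)
//   trackDb:   loudness of the track as measured by a pre-scan
//   targetDb:  loudness every track should ideally end up at
// The result is clamped to 0 dB, so a very quiet track plays at full,
// unmodified volume instead of being pushed into clipping or compression.
function playbackGainDb(pregainDb, trackDb, targetDb) {
  const wanted = pregainDb + (targetDb - trackDb);
  return Math.min(wanted, 0);
}
```

With a pregain of -6 dB, a track already at the target loudness plays at -6 dB, a louder track is turned down further, and a track that would need more than +6 dB of correction simply plays at 0 dB.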

With PCM rescaling, especially at a sample format like 32-bit integer, quality loss is not something to be concerned about. However, I suppose if the feature is "bit perfect playback" then it's easier to just have bit-for-bit playback than to convince people that actually that feature is kind of silly.
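A rough sketch of the arithmetic behind the rescaling claim, assuming a 16-bit source sample carried in a 32-bit integer container (the helper names are made up for illustration). Rounding after the gain is applied costs at most half of one 32-bit step, which is a tiny fraction of one 16-bit step.

```javascript
// Place a signed 16-bit sample in a 32-bit container (shift left by 16 bits).
function widen16To32(sample16) {
  return sample16 * 65536;
}

// Apply a linear gain in the 32-bit domain, rounding to the nearest integer.
function scale32(sample32, gain) {
  return Math.round(sample32 * gain);
}

// Rounding error of the scaled value, expressed in units of one 16-bit step.
function errorIn16BitSteps(sample16, gain) {
  const exact = widen16To32(sample16) * gain;
  const rounded = scale32(widen16To32(sample16), gain);
  return Math.abs(rounded - exact) / 65536;
}
```

Whether that extra precision survives to the analog output depends on the DAC actually resolving more than 16 bits, which this sketch does not model.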

Anyway, that was just kind of a rant, because libgroove does not impose volume scaling on you; it is a feature that you may choose to use, or not, and if you choose not to use it, then the bits of audio are not affected and no extra CPU processing is used.

> For gapless albums, most people rip the tracks in the same format.

Fair point.

> As far as resampling goes, would it be through SoX or other library, or do you envision a custom algorithm?

Currently, libgroove depends on libavfilter for resampling.

> If two songs with different sample rates or bitrates were played back to back, it would be most straightforward to open playback for each track at the format it specifies.

Alright, so here's the scoop with libgroove:

It's divided into 4 libraries, 2 of which are relevant to your use case.

I'm confident that you will want to use libgroove with volumio and that it will suit your needs.

libgroove core provides a playlist interface where you put items on it and receive decoded audio buffers. You can specify that you want unmodified, untouched, un-volume-scaled, buffers.

The question here is whether you want to use libgrooveplayer or not. As is, libgrooveplayer is hard coded to open the hardware playback device with 44,100 Hz, 16-bit integer, stereo. Obviously this needs to change for bit perfect playback. In addition, libgrooveplayer only opens the hardware playback stream once and leaves it open during playback. So this needs to be modified to close and open the hardware playback device with the exact correct playback parameters.

The only thing that makes this not completely straightforward is dealing with conditions where the hardware device cannot handle the playback parameters exactly. In this situation, we would have presumably been using the feature where audio buffers are not resampled at all in the audio graph, and so missed their chance to get resampled, and we find out too late that we do in fact need to resample because the audio device supports up to 96,000 and the audio file is 192,000.

This could potentially be solved by adding a core feature where, instead of specifying only a single playback format or specifying that you don't care what playback format it has, you could specify a set of rules to apply. So you encode your hardware device's capabilities into a set of rules, and the core libgroove audio graph uses that to figure out whether it needs to do any resampling or not (hopefully not, except in the case where the hardware device can't handle it).
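One way such capability rules might look, as a sketch only: the object shapes, format names, and function names below are assumptions for illustration and are not libgroove's API. The rule is "use the source format if the device accepts it, otherwise the next higher supported value, otherwise the highest the device has".

```javascript
// Bit depth of each (hypothetical) sample format name.
const FORMAT_BITS = { s16: 16, s24: 24, s32: 32 };

// Pick the exact rate if supported, else the next higher one, else the maximum.
function chooseRate(sourceRate, rates) {
  const sorted = [...rates].sort((a, b) => a - b);
  const atLeast = sorted.find((r) => r >= sourceRate);
  return atLeast !== undefined ? atLeast : sorted[sorted.length - 1];
}

// Same rule applied to sample formats, ordered by bit depth.
function chooseFormat(sourceFormat, formats) {
  const sorted = [...formats].sort((a, b) => FORMAT_BITS[a] - FORMAT_BITS[b]);
  const atLeast = sorted.find((f) => FORMAT_BITS[f] >= FORMAT_BITS[sourceFormat]);
  return atLeast !== undefined ? atLeast : sorted[sorted.length - 1];
}

// Decide the output format and whether any conversion is needed at all.
function chooseOutputFormat(source, caps) {
  const sampleRate = chooseRate(source.sampleRate, caps.sampleRates);
  const sampleFormat = chooseFormat(source.sampleFormat, caps.sampleFormats);
  const needsConversion =
    sampleRate !== source.sampleRate || sampleFormat !== source.sampleFormat;
  return { sampleRate, sampleFormat, needsConversion };
}
```

For a 192,000 Hz file on a device that tops out at 96,000 Hz, this returns 96,000 Hz with `needsConversion` set, which is the case where the audio graph would have to resample before the buffers reach the player.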

ning-yu commented 9 years ago

> The music player applies a "pregain" to the output to make it quieter. This gives us headroom to turn up the volume without affecting quality. Next, we pre-scan all songs to figure out how loud they are. Finally, the music player adjusts this "pregain" volume up or down depending on how loud the song is, so that the user does not find themselves constantly adjusting the volume knob. If a song is very quiet and needs to be turned up past the pregain volume, there are 2 ways to resolve this: 1. use a dynamic compression filter (a big no-no if you're aiming for bit perfect playback) or 2. let the song play at full volume with no distortion, which the user might think is too quiet.

Yes, I'm not opposed to the principle of automatic volume adjustment. However, in Volumio, there are two things we would have to fight:

I think it is fair to offer the option for replaygain scanning on local files or libgroove-rendered streams, but for the above reasons we would probably ship with it turned off by default.

> The question here is whether you want to use libgrooveplayer or not. As is, libgrooveplayer is hard coded to open the hardware playback device with 44,100 Hz, 16-bit integer, stereo. Obviously this needs to change for bit perfect playback. In addition, libgrooveplayer only opens the hardware playback stream once and leaves it open during playback. So this needs to be modified to close and open the hardware playback device with the exact correct playback parameters.
>
> The only thing that makes this not completely straightforward is dealing with conditions where the hardware device cannot handle the playback parameters exactly. In this situation, we would have presumably been using the feature where audio buffers are not resampled at all in the audio graph, and so missed their chance to get resampled, and we find out too late that we do in fact need to resample because the audio device supports up to 96,000 and the audio file is 192,000.
>
> This could potentially be solved by adding a core feature where, instead of specifying only a single playback format or specifying that you don't care what playback format it has, you could specify a set of rules to apply. So you encode your hardware device's capabilities into a set of rules, and the core libgroove audio graph uses that to figure out whether it needs to do any resampling or not (hopefully not, except in the case where the hardware device can't handle it).

We're coding the new Volumio as a Node.js application, and would probably hook to libgroove through node-groove. When I call groove.createPlayer(), does this create an instance of libgrooveplayer?

If so, it would be great to allow for adjustable output format. I'm not sure yet what that entails, but plan on familiarizing myself with the libgroove code this week. We could definitely put something in Volumio to enumerate the capabilities of the sound card the user has selected to play through.

andrewrk commented 9 years ago

> We're coding the new Volumio as a Node.js application, and would probably hook to libgroove through node-groove. When I call groove.createPlayer(), does this create an instance of libgrooveplayer?

Yes, that's right.

So I think I understand the use case well enough and I'll see what I can do to get libgroove up to spec on this over the next month.

ning-yu commented 9 years ago

Hey Andrew, we just made Volumio2 repo public. The code currently uses MPD as the local file player, but let me know when we can start testing out Libgroove for high res files!

andrewrk commented 9 years ago

Will do. Thanks for the link.

andrewrk commented 9 years ago

How does volumio want to handle channel layouts? Example:

andrewrk commented 9 years ago

One more question: do you have a way of testing if bit perfect playback is working? Eventually I'd like to replace the SDL dependency on audio playback with my own code, but for now it would be nice to double check that SDL is not doing any resampling, channel mapping, or sample format conversion.

ning-yu commented 9 years ago

> How does volumio want to handle channel layouts?

Hmm, I imagine different users might want to do it differently. Is there a way to make a general syntax for channel mapping, and make it changeable in a config file?

> do you have a way of testing if bit perfect playback is working?

I wish I had a way to do this in hardware. Some of our members report that they have DACs that indicate the format of the audio they are receiving. As far as software determination of audio format, there are a few options. This page reports that you can read the contents of /proc/asound/card0/pcm0p/sub0/hw_params. Volumio 1.X seems to read /proc/asound/card0/stream0 instead. There is also some external software I have not tested, such as alsa-capabilities, which reads the same files, but outputs in a more user-friendly display.
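A minimal sketch of reading that hw_params file from Node.js. It assumes the usual ALSA layout of `key: value` lines (for example `format: S32_LE`, `rate: 96000 (96000/1)`) and the single word `closed` when the stream is idle; the function names are made up for illustration.

```javascript
// Parse the text of an ALSA hw_params file into a plain object.
// Returns null when the substream is not open (file contains "closed").
function parseHwParams(text) {
  const trimmed = text.trim();
  if (trimmed === "closed") return null;
  const params = {};
  for (const line of trimmed.split("\n")) {
    const idx = line.indexOf(":");
    if (idx === -1) continue; // skip anything that is not "key: value"
    params[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return params;
}

// Reduce the parsed parameters to the fields relevant for bit perfect checks.
function describeStream(params) {
  return {
    format: params.format,
    channels: Number(params.channels),
    rate: parseInt(params.rate, 10), // "96000 (96000/1)" -> 96000
  };
}
```

In practice the text would come from something like `fs.readFileSync("/proc/asound/card0/pcm0p/sub0/hw_params", "utf8")` while a track is playing, and the result could be compared against the format of the source file. This only confirms the stream parameters, not that the sample data is unchanged.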

As for checking that the output data content is the same as the input, I don't have a good answer at the moment! The best I can think of is to limit the number of things in the audio pipeline (such as using only ALSA), and make sure it is configured not to resample.
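If the bytes delivered to the device could be captured somehow (for instance through a loopback device, which is an assumption and not something established in this thread), the comparison itself is simple. A sketch with a hypothetical helper name:

```javascript
// Compare two PCM byte buffers.
// Returns -1 if they are identical, otherwise the offset of the first
// differing byte (or the shorter length if one buffer is a prefix of the other).
function firstDifference(expected, actual) {
  const n = Math.min(expected.length, actual.length);
  for (let i = 0; i < n; i++) {
    if (expected[i] !== actual[i]) return i;
  }
  return expected.length === actual.length ? -1 : n;
}
```

Here `expected` would hold the decoded PCM data of the file and `actual` the captured output; any result other than -1 means something in the pipeline altered the samples.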

andrewrk commented 9 years ago

Alright:

Here's how to use the new API. In this commit diff look at example/playlist.js: https://github.com/andrewrk/node-groove/commit/41d39632823b8e002ef469a14d4d1e6ef4f27129#diff-fb02597f5645c7f2a9285e856f76cac9R11

Here's how it works:

The sound device is opened with default parameters. Pristine audio data is streamed from libgroove core. If the audio data does not match the open sound device parameters, then the sound device is closed and then re-opened with the correct parameters. Care is taken to wait for the existing buffered audio data to flush through the device before closing it and re-opening it.
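The control flow described above can be sketched with a fake device that only records what happens to it. Everything here (class names, method names, the buffer shape) is hypothetical and is not the libgroove or node-groove API; it just shows the flush, close, re-open order.

```javascript
// Stand-in for a sound device that logs each operation.
class FakeDevice {
  constructor() { this.log = []; }
  open(f) { this.log.push(`open ${f.sampleRate}/${f.sampleFormat}/${f.channels}`); }
  write(buffer) { this.log.push(`write ${buffer.frames}`); }
  flush() { this.log.push("flush"); } // wait for buffered audio to play out
  close() { this.log.push("close"); }
}

function formatsEqual(a, b) {
  return a.sampleRate === b.sampleRate &&
    a.sampleFormat === b.sampleFormat &&
    a.channels === b.channels;
}

class Player {
  constructor(device, defaultFormat) {
    this.device = device;
    this.format = defaultFormat;
    this.device.open(defaultFormat); // start with default parameters
  }

  // Play one untouched buffer; re-open the device if its format differs.
  play(buffer) {
    if (!formatsEqual(buffer.format, this.format)) {
      this.device.flush(); // let already queued audio drain first
      this.device.close();
      this.device.open(buffer.format);
      this.format = buffer.format;
    }
    this.device.write(buffer);
  }
}
```

Buffers that match the currently open format go straight through, so consecutive tracks in the same format never trigger a close and re-open.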

I believe this issue is solved now. Please open fresh issues if you encounter anything else.