MusicPlayerDaemon / MPD

Music Player Daemon
https://www.musicpd.org/
GNU General Public License v2.0

selective choosing frequency for resampling #220

Open mistepien opened 6 years ago

mistepien commented 6 years ago

It would be great if resampling took the multiples of 44100 and 48000 into account. E.g. the sound card supports 44100, 48000, 88200 and 96000. If the input file's rate is higher than 96000, it will be resampled to 96000, regardless of whether it is 192000 or 352800.

Resampling from 352800 to 88200 is noticeably faster than from 352800 to 96000. This matters on slower machines. Thus I suggest a solution in which e.g. 352800 is resampled to 44100, 88200 or 176400 (depending on what the sound card supports) and 384000 is resampled to 48000, 96000 or 192000.
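The selection rule requested above can be sketched roughly as follows. This is only an illustration, not MPD code: `base_clock` and `pick_output_rate` are hypothetical names, the list of supported rates is assumed to come from somewhere else (for example the output plugin), and the fallbacks for rates outside both families are my own assumption.

```cpp
#include <algorithm>
#include <vector>

// Returns the base clock (44100 or 48000) that `rate` is an integer
// multiple of, or 0 if it belongs to neither family.
constexpr unsigned base_clock(unsigned rate) noexcept
{
    if (rate != 0 && rate % 44100 == 0)
        return 44100;
    if (rate != 0 && rate % 48000 == 0)
        return 48000;
    return 0;
}

// Hypothetical helper: choose an output rate for `source_rate` from the
// rates the sound card supports, preferring the same clock family.
inline unsigned
pick_output_rate(unsigned source_rate, const std::vector<unsigned> &supported)
{
    const unsigned family = base_clock(source_rate);
    unsigned below = 0;   // highest same-family rate <= source_rate
    unsigned above = 0;   // lowest same-family rate > source_rate
    unsigned highest = 0; // highest supported rate of any family

    for (unsigned rate : supported) {
        highest = std::max(highest, rate);
        if (family == 0 || base_clock(rate) != family)
            continue;
        if (rate <= source_rate)
            below = std::max(below, rate);
        else if (above == 0 || rate < above)
            above = rate;
    }

    if (below != 0)
        return below; // stay in the family without upsampling
    if (above != 0)
        return above; // stay in the family, upsampling if necessary
    return highest;   // assumed fallback: no same-family rate available
}
```

With a card supporting 44100, 48000, 88200 and 96000, this picks 88200 for a 352800 source and 96000 for a 384000 source, which is the behaviour described in the request.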

RafaPolit commented 6 years ago

I second the request. I would have hoped to get SoX kind of 'built into' this feature and have the following behaviour:

We currently get (in SoX) 16 bit / *, 24 bit / * and 32 bit / *, which is nice for zero-padding a signal while keeping the native sample rate.

Ideally, upsampling or downsampling is better within same clocks: 44.1 and 48. So, if I am upsampling a 44.1 file to the 3xx kHz, it is better served with a 352.8 frequency. If I am doing the same for a 48 file, 384 makes more sense.

Could we get 'intelligent clock-wise' modes like:

16 bit: 44.1 / 48
24 bit: 176.4 / 192
32 bit: 352.8 / 384

I read somewhere that this was, precisely, the way SoX handled these scenarios, but I have done some testing in MPD and there is no way to achieve this (with SoX). 384 is 384 no matter the source clock.

Can this be worked in? Thanks, Rafa.

alexbo1 commented 6 years ago

I would be interested in this feature too. Ideally it should be possible to configure resampling for each combination of input and output sampling rate/bits, but I would also be very happy with a simple config option to deactivate sampling rate conversion for DSD material.

At the moment everything is converted to the specified sampling rate/bit resolution, if resampling is activated. For PCM this is generally ok, aside from the abovementioned performance issues. The conversion of DSD to PCM (and vice versa) is more problematic and definitely should not be done if a DAC supports DSD natively, such as my RME ADI2 Pro Fs. I would be able to playback DSD data in its native form, but have 44.1/16 and other low resolution PCM data resampled to >= 192/32, which definitely sounds better - maybe due to possible higher filter cutoff frequencies ...

At the moment I have to reconfigure (turn resampling on/off) and restart mpd to achieve the desired behavior for both DSD and PCM data.

A config option to avoid resampling of DSD would be great. Fine grained control (a control matrix) for sample rate conversion even better.

Thanks,

Alex

MaxKellermann commented 6 years ago

@alexbo1 MPD will never ever resample DSD because MPD has no code which would be able to do that. MPD can however convert DSD to PCM if your output is incapable of DSD.

alexbo1 commented 6 years ago

@MaxKellermann You are right, it is not MPD that does the resampling. libresample or soxr do the job. In order to upsample low res files to 384/32 I have

audio_output_format "384000:32:2"

samplerate_converter "soxr very high"

in my mpd.conf. If these lines are present, DSD is also converted/resampled to PCM. If I remove these lines, DSD is played as DSD, but low res PCM stays low res PCM.

There should be an additional option for the sample rate conversion to avoid DSD to PCM conversion, like

samplerate_converter "soxr very high no_DSD_to_PCM"

and/or separate output format specifications

audio_output_format_pcm "384000:32:2"

audio_output_format_dsd "dsd512:2"

MaxKellermann commented 6 years ago

@alexbo1 not correct. I've never heard of libresample, but neither libsamplerate nor soxr can resample DSD. And believe me, MPD will never resample DSD. Not by its own code, and not with any external library. Your point is wrong and off-topic here.

bitkeeper commented 4 years ago

I think supporting this in SoX alone will not help. The audio_output_format in use also needs to change, otherwise the audio will still be played at the original audio_output_format (even though the stream is resampled). Even better, when the audio_output_format is changed, SoX will use that sample rate, so no changes to SoX are needed.

The format is set in DecoderControl::SetReady by applying configured_audio_format to current song:

out_audio_format = audio_format.WithMask(configured_audio_format);

If the sample rate is set in configured_audio_format, it will override audio_format.sample_rate.

Tried a proof-of-concept by changing AudioFormat::ApplyMask (used by WithMask): from:

if (mask.sample_rate != 0)
     sample_rate = mask.sample_rate;

to:

if (mask.sample_rate != 0)
    sample_rate = determine_selective_resample_rate(sample_rate, mask.sample_rate);

The helper function determine_selective_resample_rate is:

 #include <map>

 // If the source is a multiple of 44100 and the configured target has a
 // 44.1k-family counterpart, use that counterpart instead of the target.
 unsigned
 determine_selective_resample_rate(unsigned source_rate, unsigned target_rate) noexcept
 {
     // Maps each 48k-family rate to its 44.1k-family counterpart.
     static const std::map<unsigned, unsigned> lut48to41 = {
         {384000, 352800},
         {192000, 176400},
         { 96000,  88200},
         { 48000,  44100},
     };

     const auto it = lut48to41.find(target_rate);
     if (source_rate % 44100 == 0 && it != lut48to41.end())
         return it->second;

     // A target of 0 means "not configured": keep the source rate.
     return target_rate != 0 ? target_rate : source_rate;
 }

This helper assumes the audio_output_format is always given as a multiple of 48k. If the source is a multiple of 44k1, the matching 44k1-based rate is used. In all other cases it just uses the audio_output_format.
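As a side note, the table in the helper is equivalent to scaling the configured rate by 44100/48000 = 147/160. A minimal sketch of that arithmetic, assuming C++17 and a target rate that is a multiple of 48000 (the function name `to_44k1_family` is made up for illustration):

```cpp
// Scale a 48k-family rate to its 44.1k-family counterpart.
// 44100 / 48000 reduces to 147 / 160. Dividing first keeps the
// intermediate value small; the result is exact only when the input
// is divisible by 160, which every multiple of 48000 is.
constexpr unsigned to_44k1_family(unsigned rate_48k) noexcept
{
    return rate_48k / 160 * 147;
}

// Compile-time checks against the entries of the lookup table.
static_assert(to_44k1_family(384000) == 352800);
static_assert(to_44k1_family(192000) == 176400);
static_assert(to_44k1_family(96000) == 88200);
static_assert(to_44k1_family(48000) == 44100);
```

Unlike the table, this also covers multiples of 48000 that are not listed (768000, for example), which may or may not be what you want.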

This works out nicely, but it is hardcoded and always on. A workable version needs to be a little more flexible. It will require an mpd.conf option like selective_resample. That is the easy part; getting it to DecoderControl::SetReady isn't. Reaching the constructor of DecoderControl takes quite a few nested steps. It is easier to add a member to AudioFormat that indicates whether this instance, when used as the mask, should be applied with selective_samplerate.

I am more than willing to make a PR for it. @MaxKellermann could you give me some direction about:

FabienPochez commented 4 years ago

Hey @bitkeeper, your hardcoded solution looks like a solution to a problem I have. I need to downsample everything to a max of 48k or 44k1, depending on the original sample rate, to avoid drops on the USB output to my DAC, and not upsample anything, as my DAC will do that on its own. I haven't touched code (except for web design) or Linux in like twenty years, so I'm totally lost.

Is your function to be implemented in mpd.conf directly?

bitkeeper commented 4 years ago

@FabienPochez I have just created a PR which includes configuration via mpd.conf.

In your case if your mpd.conf contains something like:

audio_output_format "48:2:1" 
selective_44k_resample "yes"

A song based on a 44k clock will now be downsampled to 44.1kHz, while the 48k based version will be downsampled to 48kHz.

FabienPochez commented 4 years ago

@bitkeeper sounds great! Thanks! So basically I now have to wait for mpd to be updated with your PR? And then update and apply the settings?

bitkeeper commented 4 years ago

@FabienPochez hold your horses ... that is if the PR gets accepted at all. For the moment, if you are able to build MPD yourself, you can download my branch and try it out.

FabienPochez commented 4 years ago

@bitkeeper sorry, got a little overexcited. I'll try to build mpd from your branch (gonna look into it anyway). Thanks!

bitkeeper commented 4 years ago

To get a feeling for the performance difference I did some tests with the sox command line. The measurements compare the CPU resources used going from 44k1 to 96k with going from 44k1 to 88k2.

sox measurement results.txt

Performance measurement environment.

Tests are run on a ramdisk. The default audio from Moode is used as the test file. To avoid measuring decoders/encoders, the test files are WAV files.

Performed tests:

MaxKellermann commented 4 years ago

Note that the code posted by @bitkeeper does not implement this feature request. It implements a hard-coded table of upsampling frequencies (for the sake of an imaginary quality improvement out of nowhere), but does nothing to match with the DAC's capabilities.

bitkeeper commented 4 years ago

If:

  1. You set audio_output_format to a multiple of 48kHz, like 96000:24:2 (of course you have to make sure the requested sample rate is supported by your output and/or DAC).
  2. And selective_44k_resample is set to yes.

Then it will work exactly as requested for multiples of 44.1kHz and 48kHz. As @mistepien already indicated, it is a lot faster. See the results in my previous message above. You can easily reproduce this yourself.

And yes, the implementation could be more flexible and even output based, but it does work.
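For anyone wondering how the two cases differ numerically: within one clock family the conversion ratio reduces to a small integer ratio, across families it does not. The sketch below only shows the arithmetic (C++17, `resample_ratio` is a made-up name). Whether a simple ratio is actually cheaper depends on the resampler, so treat the speed claim as something to measure rather than something this proves.

```cpp
#include <numeric>
#include <utility>

// Reduce the conversion ratio target/source to lowest terms.
// Returns {numerator, denominator}; {0, 0} if both rates are 0.
constexpr std::pair<unsigned, unsigned>
resample_ratio(unsigned source_rate, unsigned target_rate) noexcept
{
    const unsigned g = std::gcd(source_rate, target_rate);
    if (g == 0)
        return {0, 0};
    return {target_rate / g, source_rate / g};
}

// Same family: 352800 -> 88200 is a plain 1/4.
static_assert(resample_ratio(352800, 88200).first == 1);
static_assert(resample_ratio(352800, 88200).second == 4);

// Across families: 352800 -> 96000 reduces only to 40/147.
static_assert(resample_ratio(352800, 96000).first == 40);
static_assert(resample_ratio(352800, 96000).second == 147);
```

The same holds for the upsampling case measured above: 44100 to 88200 is 2/1, while 44100 to 96000 is 320/147.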

Wang-Yue commented 3 years ago

I think this is a useful feature to add.

Implementing the feature itself is not hard at all. I think the difficult part of handling this issue is to have a better syntax for this feature.

One proposal:

In allowed format, make it support a table such as

allowed_format "44100:*:*->176400:32:* 48000:*:*->192000:32:* 88200:32:*->176400:32:* ... "

and so on. This looks silly, but it should be easy to implement and versatile enough to cover different scenarios.
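To give an idea of how little code such a table would need, here is a rough parsing sketch. The syntax is the one proposed above and does not exist in MPD. All type and function names are hypothetical, `*` is stored as 0, the first matching rule wins, and malformed entries (including the literal `...`) are silently skipped, which a real implementation would probably want to report instead.

```cpp
#include <array>
#include <cstddef>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// One side of a rule; 0 stands for "*" (any value / keep as is).
struct FormatPattern {
    unsigned sample_rate = 0;
    unsigned bits = 0;
    unsigned channels = 0;
};

struct FormatRule {
    FormatPattern from, to;
};

// Parse "rate:bits:channels"; returns std::nullopt on malformed input.
// Note: std::stoul throws on values that do not fit in unsigned long.
inline std::optional<FormatPattern> parse_pattern(const std::string &s)
{
    std::array<unsigned, 3> fields{};
    std::istringstream in(s);
    std::string token;
    std::size_t i = 0;
    while (std::getline(in, token, ':')) {
        if (i >= fields.size() || token.empty())
            return std::nullopt;
        if (token != "*") {
            if (token.find_first_not_of("0123456789") != std::string::npos)
                return std::nullopt;
            fields[i] = static_cast<unsigned>(std::stoul(token));
        }
        ++i;
    }
    if (i != fields.size())
        return std::nullopt;
    return FormatPattern{fields[0], fields[1], fields[2]};
}

// Parse whitespace-separated "from->to" rules, skipping malformed ones.
inline std::vector<FormatRule> parse_rules(const std::string &text)
{
    std::vector<FormatRule> rules;
    std::istringstream in(text);
    std::string word;
    while (in >> word) {
        const auto arrow = word.find("->");
        if (arrow == std::string::npos)
            continue;
        const auto from = parse_pattern(word.substr(0, arrow));
        const auto to = parse_pattern(word.substr(arrow + 2));
        if (from && to)
            rules.push_back({*from, *to});
    }
    return rules;
}

inline bool matches(const FormatPattern &p, unsigned rate,
                    unsigned bits, unsigned channels) noexcept
{
    return (p.sample_rate == 0 || p.sample_rate == rate) &&
           (p.bits == 0 || p.bits == bits) &&
           (p.channels == 0 || p.channels == channels);
}

// Output rate for the given input format; the first matching rule wins,
// and the input rate is kept if no rule matches.
inline unsigned target_rate(const std::vector<FormatRule> &rules,
                            unsigned rate, unsigned bits, unsigned channels)
{
    for (const auto &rule : rules)
        if (matches(rule.from, rate, bits, channels))
            return rule.to.sample_rate != 0 ? rule.to.sample_rate : rate;
    return rate;
}
```

Bit depth and channel count on the output side would be handled the same way as the rate; they are left out to keep the sketch short.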

chaseemory commented 3 years ago

I think a feature like this would be very valuable.

Allowing the user to specify the entire table of inputs and outputs such as Wang-Yue suggested is a good, though somewhat tedious solution.

I also think just allowing people to specify integer only scaling factors and a max sample rate would be fine?
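One way to read "integer-only scaling factors and a max sample rate" is sketched below, for the downsampling direction only. This is my interpretation, not an existing MPD option, and `integer_scaled_rate` is a made-up name. Note that plain integer division can land on rates no DAC supports (352800 / 3 = 117600), so the result would still have to be checked against the hardware, or the divisor restricted to powers of two.

```cpp
// Hypothetical helper: keep the source rate if it fits under max_rate,
// otherwise divide it by the smallest integer that makes it fit.
// A max_rate of 0 is treated as "no limit".
constexpr unsigned
integer_scaled_rate(unsigned source_rate, unsigned max_rate) noexcept
{
    if (max_rate == 0 || source_rate <= max_rate)
        return source_rate;

    for (unsigned divisor = 2; divisor <= source_rate; ++divisor)
        if (source_rate % divisor == 0 && source_rate / divisor <= max_rate)
            return source_rate / divisor;

    return source_rate; // not reached: divisor == source_rate yields 1
}
```

With a limit of 96000, a 352800 source becomes 88200 and a 384000 source becomes 96000. With a limit of 48000, 176400 becomes 44100 and 192000 becomes 48000, while 44100 is left alone.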