alsa-project / alsa-plugins

The Advanced Linux Sound Architecture (ALSA) - plugins
GNU Lesser General Public License v2.1
41 stars 28 forks source link

New lavc api #23

Closed jamrial closed 3 years ago

jamrial commented 3 years ago

This is untested (I don't have access to a system with alsa, but the change was straightforward enough that i could write it).

It is unlikely that it will fix the silence issue described in #22, but it nonetheless is a change required in order to support future versions of ffmpeg. The silence is, as far as i could tell, related to the fact that starting with ffmpeg 4.4, the ac3_fixed encoder takes S32 planar as input, versus the S16 planar it used to. This plugin should look at the AVCodec sample_fmts[0] value and feed the encoder the appropriate data.

quequotion commented 3 years ago

Is this still relying on libavresample, which was deprecated from ffmpeg?

manio commented 3 years ago

@quequotion from the lavrate.txt: "The plugin in rate-lavr subdirectory is an external rate converter using libavresample library", I don't see any references to it in the pcm_a52 though.

manio commented 3 years ago

@jamrial Thank you again for your work on this. Unfortunately I don't see much feedback/active development recently on this plugin code, so I am still trying to fix it myself... :( I was able to get "some kind of" sound from the plugin as an AC3 stream so I hope I am on good track :)

I am only thinking that even with not exactly matched input type for the encoder some sound has to be produced... I did my changes on top of yours but my problem is that I don't understand one thing... maybe you can help me with this:

In your new _doencode() before _avcodec_sendframe() (https://github.com/jamrial/alsa-plugins/commit/3aec2b22565f1cf03bb99f93763d0f04d9476cf4#diff-96ce76abb22778de0d96ef94d576d3eb582a74c32dfe4f7ad8f560444b04f20eR120) I need to explicit add this code:

  frame->format = rec->avctx->sample_fmt;
  frame->channel_layout = rec->avctx->channel_layout;
  uint16_t *samples = (uint16_t*)frame->data[0];
  ret = av_frame_get_buffer(frame, 0);
  if (ret < 0) {
        return -EINVAL;
  }
  memcpy(frame->data[0], samples, rec->outbuf_size-8);
  //optional av_frame_free() after an encode call...

otherwise I've got EINVAL from that function (using "original" rec->frame). Do you have a clue why is passing a frame which was previously working in _avcodec_encodeaudio2 has to be allocated again and memcopied before this new API call? I don't see _av_frame_getbuffer() in the whole file, instead I think it is used _av_samplesalloc() in _alloc_inputbuffer(), is it ok for this new API?

ps. I know that above code is ugly and not complete - it is rather a playground to understand some things and at least something is audible from the speakers...

jamrial commented 3 years ago

The reason it fails is probably because the send/receive API attempts to create a new reference to the frame internally, whereas the old API would not and simply pass it as is to the underlying encoder, and creating a new reference requires certain AVFrame fields to be set. In this case, format and channel_layout were missing.

Using av_samples_alloc() is fine even for the new API, it will just result in some extra data copy when creating the new reference.

I set the two relevant fields in alloc_input_buffer() and pushed the changes. See if that helps.

manio commented 3 years ago

Thanks, yeah it helps in this matter that I don't need to set those extra data in _doencode() and my minimal patch to get some distorted AC3 sound shrink to:

diff --git a/a52/pcm_a52.c b/a52/pcm_a52.c
index 8f1a654..1403fbd 100644
--- a/a52/pcm_a52.c
+++ b/a52/pcm_a52.c
@@ -118,6 +118,13 @@ static int do_encode(struct a52_ctx *rec)
        AVPacket *pkt = rec->pkt;
        int ret;

+       uint16_t *samples = (uint16_t*)rec->frame->data[0];
+       ret = av_frame_get_buffer(rec->frame, 0);
+       if (ret < 0) {
+               return -EINVAL;
+       }
+       memcpy(rec->frame->data[0], samples, rec->outbuf_size-8);
+
        ret = avcodec_send_frame(rec->avctx, rec->frame);
        if (ret < 0)
                return -EINVAL;

Without this - I've got EINVAL from encoding, so it still needs more work...

jamrial commented 3 years ago

Maybe now?

quequotion commented 3 years ago

When trying to play audio through an a52 device using pulseaudio, this happens more times per second than I can count:

I: [alsa-sink-] alsa-sink.c: Trying resume...
D: [alsa-sink-] alsa-util.c: Maximum hw buffer size is 177984 ms
D: [alsa-sink-] alsa-util.c: Set buffer size first (to 4608 samples), period size second (to 1536 samples).
D: [alsa-sink-] alsa-sink.c: hwbuf_unused=0
D: [alsa-sink-] alsa-sink.c: setting avail_min=1
I: [alsa-sink-] alsa-sink.c: Disabling rewind_within_thread for device a52:0
I: [alsa-sink-] alsa-sink.c: Resumed successfully...
D: [alsa-sink-] alsa-sink.c: snd_pcm_mmap_commit: Invalid argument
E: [alsa-sink-] alsa-sink.c: snd_pcm_mmap_commit: Invalid argument, trying to restart PCM

speaker-test isn't happy with it either (pulseaudio disabled):

speaker-test -c6 -D a52

speaker-test 1.2.4

Playback device is a52
Stream parameters are 48000Hz, S16_LE, 6 channels
Using 16 octaves of pink noise
Rate set to 48000Hz (requested 48000Hz)
Buffer size range from 3072 to 8543232
Period size range from 1536 to 1536
Using max buffer size 8543232
Periods = 4
was set period_size = 1536
was set buffer_size = 8543232
 0 - Front Left
Write error: -22,Invalid argument
xrun_recovery failed: -22,Invalid argument
Transfer failed: Invalid argument
jamrial commented 3 years ago

Unless avcodec_send_frame() or avcodec_receive_packet() are the ones returning an error code, i can't help you. This PR is only trying to port the plugin to the new encode API to be used with ffmpeg versions 4.4 and newer.

Someone familiar with ALSA should adapt this plugin to pass S32 planar audio to the encoder when it expects it, for starters. Otherwise you're not going to get proper AC3 frames out of it.

manio commented 3 years ago

@jamrial regarding your last commit with extended_data - no change... but of course got your points about it... big thanks anyway for this :)

@quequotion use latest code from this pull request, add my patch and try like this:

wget http://www-mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/Samples/Microsoft/6_Channel_ID.wav
aplay -D a52 -f S16_LE -c6 ./6_Channel_ID.wav
jamrial commented 3 years ago

@manio What function is failing, avcodec_send_frame or avcodec_receive_packet? And what error code does it return? (Not talking about the hardcoded -EINVAL i made do_encode() propagate)

manio commented 3 years ago

@jamrial error is from _avcodec_sendframe() it is returning 0xffffffea

jamrial commented 3 years ago

Can you try again with the latest addition?

manio commented 3 years ago

@jamrial You did it!!! :D Now avcodec_send_frame and avcodec_receive_packet are not returning error and I can hear AC3 audio without any additional code changes. It is still not correct but I know this is another story (S16 instead of S32). It even plays on all channels on my test wave file - on my code only one channel was playing :) Big thanks!

manio commented 3 years ago

@perexg Hi, can you please merge this? Those are necessary API changes to make the a52 plugin compatible with newest ffmpeg 4.4. It is not a complete changes to have a correct sound but a fixes for this should be made on top of this anyway...

quequotion commented 3 years ago

Do we have another issue/feature request for the plugin to use S32 instead of S16 yet?

Edit: Not specific to a52, but this looks like it.

We should get that filed and start working on that too.

Thanks for your hard work @jamrial & @manio!

Edit: SwiftKey is the most evil creation in the history of keyboards; link fixed.

perexg commented 3 years ago

Could you elaborate the S32 / S16 issue ? The input format is selected here: https://github.com/alsa-project/alsa-plugins/blob/2352e4e8df7a31cb8a446ca0a32cdb8c5a56a017/a52/pcm_a52.c#L937

EDIT: Sorry, bad line. It's output format.

EDIT2: The input format constraint: https://github.com/alsa-project/alsa-plugins/blob/2352e4e8df7a31cb8a446ca0a32cdb8c5a56a017/a52/pcm_a52.c#L776

jamrial commented 3 years ago

Could you elaborate the S32 / S16 issue ? The input format is selected here:

https://github.com/alsa-project/alsa-plugins/blob/2352e4e8df7a31cb8a446ca0a32cdb8c5a56a017/a52/pcm_a52.c#L937

EDIT: Sorry, bad line. It's output format.

The ac3_fixed encoder in ffmpeg 4.4 and newer takes S32 planar audio as input, but this plugin exclusively handles S16 audio, as seen in the line you linked. In https://github.com/alsa-project/alsa-plugins/blob/2352e4e8df7a31cb8a446ca0a32cdb8c5a56a017/a52/pcm_a52.c#L984 the plugin sets rec->av_format to what the encoder reports as supported in rec->codec->sample_fmts[0], which used to be AV_SAMPLE_FMT_S16P and became AV_SAMPLE_FMT_S32P in newer ffmpeg releases, but ultimately still feeds the encoder S16 audio.

If you can't get S32 audio from ALSA to pass to the encoder for versions where it expects it, you could use a resampler like libswresample.

perexg commented 3 years ago

If you can't get S32 audio from ALSA to pass to the encoder for versions where it expects it, you could use a resampler like libswresample.

Just set the format constrains to S32 according the codec->sample_fmts[0] in https://github.com/alsa-project/alsa-plugins/blob/2352e4e8df7a31cb8a446ca0a32cdb8c5a56a017/a52/pcm_a52.c#L776 . The application will see that the S32 is only accepted and the plug: automatic ALSA plugin can be used for the internal format conversion.

If the av codec accepts multiple formats, pick the more common S16 by default.

jamrial commented 3 years ago

I'll leave that to you or someone else familiar with ALSA that can also test such changes. This MR is to ensure the plugin can work with the new encode public API, regardless of what the underlying encoder handles.

perexg commented 3 years ago

I'll leave that to you or someone else familiar with ALSA that can also test such changes. This MR is to ensure the plugin can work with the new encode public API, regardless of what the underlying encoder handles.

It seems that also fill_data() function should be updated to handle S32.

So the recent ffmpeg supports only S32P for the AC3 encoder?

jamrial commented 3 years ago

Correct. And for that matter, this plugin also attempts to use the "ac3" encoder as fallback if "ac3_fixed" is not available, which will run into a similar issue on any ffmpeg version given it only supports FLTP audio. I guess nobody noticed or reported it since it's very rare for an ffmpeg build to lack the ac3_fixed encoder, so it's practically never used.

perexg commented 3 years ago

@quequotion : Could you test https://github.com/perexg/alsa-plugins/commit/86b3dcfabadd9c72c3f76c31dd9c66d1b561aa72 on top of @jamrial changes ?

My devel branch is here: https://github.com/perexg/alsa-plugins/tree/a52

quequotion commented 3 years ago

Not much better luck, although I see the "Slave: A52 Output Plugin" has switched to S32_LE. aplay still produces an awful grinding sound in all channels, although it's a slightly different tone than before.

aplay -D plug:a52 -v www_lynnemusic_com_surround_test.ac3
Playing raw data 'www_lynnemusic_com_surround_test.ac3' : Signed 16 bit Little Endian, Rate 8000 Hz, Mono
Plug PCM: Rate conversion PCM (48000, sformat=S16_LE)
Converter: libspeex (external)
Protocol version: 10002
Its setup is:
  stream       : PLAYBACK
  access       : RW_INTERLEAVED
  format       : S16_LE
  subformat    : STD
  channels     : 1
  rate         : 8000
  exact rate   : 8000 (8000/1)
  msbits       : 16
  buffer_size  : 3968
  period_size  : 128
  period_time  : 16000
  tstamp_mode  : NONE
  tstamp_type  : GETTIMEOFDAY
  period_step  : 1
  avail_min    : 128
  period_event : 0
  start_threshold  : 3968
  stop_threshold   : 3968
  silence_threshold: 0
  silence_size : 0
  boundary     : 1116892707587883008
Slave: Route conversion PCM (sformat=S32_LE)
  Transformation table:
    0 <- 0
    1 <- 0
    2 <- 0
    3 <- 0
    4 <- 0
    5 <- 0
Its setup is:
  stream       : PLAYBACK
  access       : MMAP_INTERLEAVED
  format       : S16_LE
  subformat    : STD
  channels     : 1
  rate         : 48000
  exact rate   : 48000 (48000/1)
  msbits       : 16
  buffer_size  : 23808
  period_size  : 768
  period_time  : 16000
  tstamp_mode  : NONE
  tstamp_type  : GETTIMEOFDAY
  period_step  : 1
  avail_min    : 768
  period_event : 0
  start_threshold  : 23808
  stop_threshold   : 23808
  silence_threshold: 0
  silence_size : 0
  boundary     : 6701356245527298048
Slave: A52 Output Plugin
Its setup is:
  stream       : PLAYBACK
  access       : MMAP_NONINTERLEAVED
  format       : S32_LE
  subformat    : STD
  channels     : 6
  rate         : 48000
  exact rate   : 48000 (48000/1)
  msbits       : 32
  buffer_size  : 23808
  period_size  : 768
  period_time  : 16000
  tstamp_mode  : NONE
  tstamp_type  : GETTIMEOFDAY
  period_step  : 1
  avail_min    : 768
  period_event : 0
  start_threshold  : 23808
  stop_threshold   : 23808
  silence_threshold: 0
  silence_size : 0
  boundary     : 6701356245527298048

Also, pulseaudio will not start and fails with this error:

E: [alsa-sink-a52:0] lfe-filter.c: Code should not be reached at ../pulseaudio/src/pulsecore/filter/lfe-filter.c:106, function process_block(). Aborting.
manio commented 3 years ago

@perexg FWIW i did a pure-alsa test with aplay: before it was doing an output, currently it can't:

aplay -D a52encode -c6 ./6_Channel_ID.wav
Playing WAVE './6_Channel_ID.wav' : Signed 16 bit Little Endian, Rate 44100 Hz, Channels 6
aplay: set_params:1343: Sample format non available
Available formats:
- S32_LE

I also did the speaker-test:

speaker-test -c6 -D a52encode

speaker-test 1.2.4

Playback device is a52encode
Stream parameters are 48000Hz, S16_LE, 6 channels
Using 16 octaves of pink noise
Sample format not available for playback: Invalid argument
Setting of hwparams failed: Invalid argument

When I set the format then It prints:

speaker-test -c6 -D a52encode -F S32_LE

speaker-test 1.2.4

Playback device is a52encode
Stream parameters are 48000Hz, S32_LE, 6 channels
Using 16 octaves of pink noise
Rate set to 48000Hz (requested 48000Hz)
Buffer size range from 1536 to 4271616
Period size range from 768 to 768
Using max buffer size 4271616
Periods = 4
was set period_size = 768
was set buffer_size = 4271616
 0 - Front Left

and I had to kill speaker-test with -9

manio commented 3 years ago

btw: if you want to hear in what way the sound is distorted, i recorded it :) original file: http://www-mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/Samples/Microsoft/6_Channel_ID.wav what we hear: https://skyboo.net/temp/a52/Voice%20001_sd.m4a

perexg commented 3 years ago

New version: https://github.com/perexg/alsa-plugins/commit/e2b5642504ae3662c18339d3747963d838d6b19a

And it's correct that the a52 plugin accepts only S32_LE format with the recent ffmpeg now.

manio commented 3 years ago

I've tested your new version and I also finally run it also as you suggested earlier using plug. The results: Using plug it is starting to play with aplay, the sound is better but not ideal... The whole file: https://skyboo.net/temp/a52/Voice%20002_sd.m4a It's hard to hear the problem, so I moved the mic near the speaker and recorded again (notice the front left voice "quality"): https://skyboo.net/temp/a52/Voice%20003_sd.m4a

It was tested with command: aplay -D plug:a52encode -c6 ./6_Channel_ID.wav

Then I moved to speaker-test

$ speaker-test -c6 -D a52encode

speaker-test 1.2.4

Playback device is a52encode
Stream parameters are 48000Hz, S16_LE, 6 channels
Using 16 octaves of pink noise
Sample format not available for playback: Invalid argument
Setting of hwparams failed: Invalid argument

Then I added plug:

$ speaker-test -c6 -D plug:a52encode

speaker-test 1.2.4

Playback device is plug:a52encode
Stream parameters are 48000Hz, S16_LE, 6 channels
Using 16 octaves of pink noise
Rate set to 48000Hz (requested 48000Hz)
Buffer size range from 1536 to 4271616
Period size range from 768 to 768
Using max buffer size 4271616
Periods = 4
was set period_size = 768
was set buffer_size = 4271616
 0 - Front Left
^Czsh: killed

it started with no sound and freeze, so i had to kill it

perexg commented 3 years ago

Another attempt is in https://github.com/perexg/alsa-plugins/commit/9868bfa879ffac9c858a5359c6015a37f548f09a .

manio commented 3 years ago

@perexg tested and I didn't notice any difference (I can still hear the "hoarseness" in the voice) with plug:a52encode

jamrial commented 3 years ago

If i'm reading this right, it looks like S16 is still being hardcoded in rec->format, in SND_PCM_PLUGIN_DEFINE_FUNC(a52), which is then used in convert_data() and passed to snd_pcm_hw_params_set_format().

perexg commented 3 years ago

If i'm reading this right, it looks like S16 is still being hardcoded in rec->format, in SND_PCM_PLUGIN_DEFINE_FUNC(a52), which is then used in convert_data() and passed to snd_pcm_hw_params_set_format().

It's output format (for the packed A52 data), so it should be ok.

jamrial commented 3 years ago

Maybe this memset() https://github.com/perexg/alsa-plugins/blob/9868bfa879ffac9c858a5359c6015a37f548f09a/a52/pcm_a52.c#L227

perexg commented 3 years ago

Maybe this memset() https://github.com/perexg/alsa-plugins/blob/9868bfa879ffac9c858a5359c6015a37f548f09a/a52/pcm_a52.c#L227

Yes, it's bad, but this code is only used for drain (at the end of playback). It's fixed now.

Speaker-test fix: https://github.com/perexg/alsa-plugins/commit/4f5903b0641fe46dc45efc91291d89ea30f53b9b Latest version for the S32 support: https://github.com/perexg/alsa-plugins/commit/9bcd762b7807c8092310ca324123f806126f6884

I found another missing piece in the code (missing rec->filled add for use_planar() code in fill_data()).

manio commented 3 years ago

Thank you guys! I can see a big progress :) I can confirm that indeed the speaker test is now working! I don't need to kill it and it is iterating over all channels in a loop but the sound from speaker-test and from aplay is still not 100% clear let me know if you need a recording for this...

tiwai commented 3 years ago

Sorry for being very late to the game, as I've been too busy for other tasks. And thanks for Jaroslav for taking care of this!

I took a quick glance, checked the code, and it looks mostly good, but there are a few small bugs. I've put my version on top of Jaroslav's latest version in https://github.com/tiwai/alsa-plugins Please check it out. If this works out, feel free to fold into your commit for the final version.

manio commented 3 years ago

@tiwai just checked your version - I have no sound but silence - both in aplay and speaker-test...

tiwai commented 3 years ago

Hm, you can try revert one by one to figure out which one broke; they are only 3 small commits.

manio commented 3 years ago

Morning! sure, @tiwai first try - reverting from the top to get the sound back:

2bee1d6 (HEAD -> master) Revert "a52: Don't swap bytes for the recent libavcodec for LE"
b5c9f5c Revert "a52: Correct data transfer in planar mode"
0b51a84 Revert "a52: Use av_frame_get_buffer() for buffer allocation"
d4266cd (origin/master, origin/HEAD) a52: Use av_frame_get_buffer() for buffer allocation
[...]

Now I have sound back - now I am trying only to revert this single commit... Now my tree is:

662906d (HEAD) Revert "a52: Don't swap bytes for the recent libavcodec for LE"
d4266cd (origin/master, origin/HEAD) a52: Use av_frame_get_buffer() for buffer allocation
[...]

Now the best news ever: the sound is crystal clear!!! :D:D

To sum it all: the commit which is wrong is: "a52: Don't swap bytes for the recent libavcodec for LE", after reverting only this one from @tiwai tree I have a correct sound! :)

@quequotion where are you man, can you also confirm if everything is working also on your hardware?

quequotion commented 3 years ago

@quequotion where are you man, can you also confirm if everything is working also on your hardware?

On the clock! I'll try some test builds when I get home tonight (\~19:00 JST). Most days I'm working 08:30\~23:00, including commutes.

Can't believe the progress we've made these last few weeks! After years of struggling to keep surround sound working with hacks, homebrewed packages, and copy pasta configurations, we're a just handful of commits away from ready out-of-the-box functionality.

Looking forward to it!

tiwai commented 3 years ago

Good to hear! Now I dropped the wrong patch and re-pushed. I'm going to send a PR with it.

perexg commented 3 years ago

Fixed via https://github.com/alsa-project/alsa-plugins/pull/26 . If you have a problem, please, open a new ticket.

manio commented 3 years ago

Thank you all very much! :)

perexg commented 3 years ago

If you like, please, retest with the current master branch. I pushed more cleanups there. The functionality should not be changed, but the more testing would be nice.

quequotion commented 3 years ago

Just building those updates now, was about to post that I had a few problems. Not sure if they are problems with this code though:

First, I can't get aplay to make a six-channel output from the LynnePublishing test file (static is rendered in mono), although it does with @manio's test file. On top of that, the speaker announcements from manio's file come out of the wrong speakers, but that could be because I haven't told aplay what order the speakers are in.

Then, pulseaudio just barely starts (had to comment out a lot of custom configuration) and then crashes the moment something plays audio through the a52 plugin.

Edit: Crashing persists even with a completely default pulseaudio configuration.

At first I thought this was because it seems to insist on reloading the plugin's sink with a different sample format (it creates the sink with s32le, then reloads with float32le), but even with a client (mpv) configured to use s32le, it still crashes as soon as audio is loaded.

Hopefully the cleanups help! I'll be back in a few minutes.

Edit: Bad news, pulseaudio still crashes any time audio is played through the plugin. Here's a log, note that it is 1874 lines long and mostly junk. You should probably work your way up from the bottom.

perexg commented 3 years ago

Don't use .ac3 file for tests. It's already the compressed format (like output from the a52 plugin). You need to convert them back to a PCM format before use (for aplay).

quequotion commented 3 years ago

That makes sense.

Still, either file is crashing pulseaudio where both should work there (edit: because the file encoding is irrelevant to pulseaudio, which should be sending the ouput of its client, mpv in this case, to plug:a52).

I'll see if I can find a specific commit that might be the cause (although audio quality was terrible, pulseaudio wasn't crashing yesterday).

Edit: pulseaudio would last output through the a52 plugin without crashing as of c12a6ae73e10673901ff9fc7da236e6f68f74aef

manio commented 3 years ago

@perexg tested current master (with libswresample converstion commit) and I still have good sound with aplay (using plug:a52encode though) because without it, it is telling me about sample format not available, but I assume this is ok in the light of our discussion. speaker-test is also fine. I am still using pure alsa and did not configure pulseaudio yet - so for now I think that @quequotion is the right tester for this stuff...

@quequotion If you are telling that c12a6ae was the last good one, so maybe you can also do similar thing as I did - trying to revert single commits one by one to get into the first which is starting the PA crashing issue?

quequotion commented 3 years ago

revert single commits one by one to get into the first which is starting the PA crashing issue?

My meaning was that the next commit, f11e7a8994111f971bc881d4069f768472452f4f, which changes the plugin to use s32le, is where it breaks down.

I did try to revert just that commit to test the later ones, but git wanted to do a manual merge and I decided to go to bed instead. Might have time for it tonight.

This troubles me because it seems like it's woking for ALSA, but pulseaudio has a problem with it.

I don't think pulseaudio has any code specific to this plugin, so I wouldn't expect its developers to take on working around changes to it.

On the other hand, it seems to be working well enough for ALSA-only setups, so it might seem like this is an issues that can be passed to the pulseaudio team to deal with.

I think the problem results from the recent commits and needs to be fixed in alsa-plugins, but maybe I should open a new issue for this.