PortAudio / portaudio

PortAudio is a cross-platform, open-source C language library for real-time audio input and output.
Other
1.43k stars 297 forks source link

wasapi: Add passthrough support (AC3, E-AC3, ...) #778

Closed gabys999 closed 1 year ago

gabys999 commented 1 year ago

add AC3 and E-AC3 passthrough support for wasapi. The example is in the test folder.

dmitrykos commented 1 year ago

@gabys999 thank you for PR. Please correct failing tests first and code style according to my previous comments. Note spaces in all your code and align accordingly to make code looking nice and uniform with other PA WASAPI code.

dmitrykos commented 1 year ago

@gabys999 thank you for the changes, although you forgot moving 192000 constant into separate define.

Your implementation looks suboptimal to me and confusing in GetAlternativeSampleFormatExclusive where the use of WAVEFORMATEXTENSIBLE_IEC61937 is not really needed. The same is for IsFormatSupported.

Please see my comments, it needs to be corrected to move further due to many changes to be reverted.

gabys999 commented 1 year ago

Thank you for pointing out these problems. Previously, we usually only used upper level functions, such as Pa StartStream, Pa WriteStream, lack of understanding of wasapi. We modified the GetAlternativeSampleFormatExclusive and IsFormatSupported to worry that if the E-AC3 passthrough needs to call these functions, the unallocated memory will cause program crash. Our purpose of submitting the PR is also to collaborate with you so that portaudio can stably support AC3 and E-AC3 audio passthrough for our project. I have reverted the code you mentioned.

dmitrykos commented 1 year ago

@gabys999 thank you for your corrections!

Our purpose of submitting the PR is also to collaborate with you so that portaudio can stably support AC3 and E-AC3 audio passthrough for our project.

Your kind input is indeed greatly appreciated. The goal of my requests for corrections is to keep PA WASAPI implementation uniform and optimal once your contribution is merged into the repository.

dmitrykos commented 1 year ago

The reason of a crash is that WAVEFORMATEXTENSIBLE_IEC61937 is larger than WAVEFORMATEXTENSIBLE therefore writing anything to the additional members of WAVEFORMATEXTENSIBLE_IEC61937 inside MakeWaveFormatFromParams will corrupt the stack memory because currently stack memory is allocated for WAVEFORMATEXTENSIBLE.

Now when you cleaned up those functions we need to fix implementation and allocate enough stack memory for both structures in all places where WAVEFORMATEXTENSIBLE is used. For that I propose to define such type (after definition of WAVEFORMATEXTENSIBLE_IEC61937):

typedef union _WAVEFORMATEXTENSIBLE_UNION
{
    WAVEFORMATEXTENSIBLE          ext;
    WAVEFORMATEXTENSIBLE_IEC61937 iec61937;
} WAVEFORMATEXTENSIBLE_UNION;

and then replace WAVEFORMATEXTENSIBLE allocation inside functions or as member variable with WAVEFORMATEXTENSIBLE_UNION where needed in PA WASAPI implementation to guarantee absence of memory corruption. I hope you got my idea.

In case you wish to join forces, please give me write access to your repository and I can implement proposed changes. Then we will check and discuss if all works as expected. Let me know what you think.

dmitrykos commented 1 year ago

I quickly went through the code and applied WAVEFORMATEXTENSIBLE_UNION to:

typedef struct PaWasapiSubStream:
 -> change     WAVEFORMATEXTENSIBLE wavex;
 -> to         WAVEFORMATEXTENSIBLE_UNION wavexu;

static PaError MakeWaveFormatFromParams(WAVEFORMATEXTENSIBLE_UNION *wavexu, const PaStreamParameters *params,
    double sampleRate, BOOL packedOnly)

static HRESULT GetAlternativeSampleFormatExclusive(IAudioClient *client, double sampleRate,
    const PaStreamParameters *params, WAVEFORMATEXTENSIBLE_UNION *outWavexU, BOOL packedSampleFormatOnly)

static PaError GetClosestFormat(IAudioClient *client, double sampleRate, const PaStreamParameters *_params,
    AUDCLNT_SHAREMODE shareMode, WAVEFORMATEXTENSIBLE_UNION *outWavexU, BOOL output)

Inside GetClosestFormat I used it like that:

WAVEFORMATEXTENSIBLE *outWavex   = (WAVEFORMATEXTENSIBLE *)outWavexU;

Besides in PaWasapiSubStream the WAVEFORMATEXTENSIBLE_UNION must be used as storage type in GetAlternativeSampleFormatExclusive, IsFormatSupported.

When compiling compiler will throw a lot of warnings about incompatible types used, as well as PaWasapiSubStream::wavex will be absent, so warnings and compile errors need to be corrected accordingly that will result in a working solution.

gabys999 commented 1 year ago

You are right. The code I modified before is mainly for ac3 and eac3. It does not consider supporting more formats uniformly, so it limits the code implementation.

According to the definition of WAVEFORMATEXTENSIBLE_IEC61937:

typedef struct {
    WAVEFORMATEXTENSIBLE  FormatExt;    /* Format of encoded data as it is */
                                        /* intended to be streamed over the link */
    DWORD   dwEncodedSamplesPerSec;     /* Sampling rate of the post-decode audio. */
    DWORD   dwEncodedChannelCount;      /* Channel count of the post-decode audio. */
    DWORD   dwAverageBytesPerSec;       /* Byte rate of the content, can be 0. */
} WAVEFORMATEXTENSIBLE_IEC61937, *PWAVEFORMATEXTENSIBLE_IEC61937;

My opinion is that WAVEFORMATEXTENSIBLE_IEC61937 is compatible with WAVEFORMATEXTENSIBLE, wasapi interface can determine whether the type of the input parameter is WAVEFORMATEXTENSIBLE_IEC61937 or WAVEFORMATEXTENSIBLE through FormatExt.Format.cbsize , the interface can correctly use the allocated stack memory. Is it necessary to define WAVEFORMATEXTENSIBLE_UNION, or use WAVEFORMATEXTENSIBLE_IEC61937 to replace WAVEFORMATEXTENSIBLE completely?

I added you to my repository and look forward to your implementation.

dmitrykos commented 1 year ago

Is it necessary to define WAVEFORMATEXTENSIBLE_UNION, or use WAVEFORMATEXTENSIBLE_IEC61937 to replace WAVEFORMATEXTENSIBLE completely?

My idea is to differentiate them. Even though WAVEFORMATEXTENSIBLE_IEC61937 is compatible with WAVEFORMATEXTENSIBLE it is declared by MS for use with SPDIF (IEC61937) format specifically and therefore it may be a bit confusing for other developers who consider only PCM case.

Therefore, to my opinion, union in this case solves ambiguity by differentiating to ext and iec61937 members which clearly denote target use.

I propose to wait for #773 merging because it overlaps with this PR and then I will apply my described approach in your repository.

gabys999 commented 1 year ago

I propose to wait for #773 merging because it overlaps with this PR and then I will apply my described approach in your repository.

No problem, I agree with you.

RossBencina commented 1 year ago

773 is merged now.

dmitrykos commented 1 year ago

@RossBencina thank you very much. @gabys999, I committed changes and PCM mode works fine for me. Would you please check SPDIF functionality on your side with these new changes.

gabys999 commented 1 year ago

Okay, I'll check it as soon as possible

gabys999 commented 1 year ago

@dmitrykos I ran the test program in AC3 mode and E-AC3 mode on the Dobly CP850, and their output was correct.

AC3 ac3

E-AC3 eac3

dmitrykos commented 1 year ago

That's great, thank you very much for testing! Let's wait for @RossBencina review, from my side final implementation looks fine.

RossBencina commented 1 year ago

@gabys999 Thanks for contributing this, I think it's a great addition. I have added my review comments above.

dmitrykos commented 1 year ago

@RossBencina I reverted PaWasapiStream::isStopped, the rest of your comments are covering @gabys999's implementation, so I will not interfere and let @gabys999 to adjust those parts of code.

gabys999 commented 1 year ago

@RossBencina I created this test file based on paex_wmme_ac3.c, and it seems that there were some omissions. I will modify them immediately.

RossBencina commented 1 year ago

@gabys999 @dmitrykos I think this is good to merge now (subject to a final check with Phil later in the week), but maybe a clarifying comment for the assignment to wasapiStreamInfo.channelMask in the examples would be nice.

gabys999 commented 1 year ago

@gabys999 @dmitrykos I think this is good to merge now (subject to a final check with Phil later in the week), but maybe a clarifying comment for the assignment to wasapiStreamInfo.channelMask in the examples would be nice.

Okay, I'll test it later

gabys999 commented 1 year ago

image @dmitrykos @RossBencina I tested AC3 and EAC3 separately, and each test is assigned five different value to wasapiStreamInfo.channelMask . No matter what wasapiStreamInfo.channelMask is assigned to, it will not affect the result of the passthrough output. It does seem to work regardless of the assignment.

dmitrykos commented 1 year ago

No matter what wasapiStreamInfo.channelMask is assigned to, it will not affect the result of the passthrough output. It does seem to work regardless of the assignment.

I think it is because your receiver or drivers probably has ability to guess channel arrangement in the data stream therefore channel arrangement provided in channelMask and dwEncodedChannelCount are ignored. I propose to stick to our current implementation as it is now and have dwEncodedChannelCount assigned as per MS docs just to be on the safe side. Maybe it is also driver-dependent functionality.

So let's wait for a final review by @philburk as earlier @RossBencina proposed.

RossBencina commented 1 year ago

@gabys999 thanks for testing. I think we have all the information we need for a final review. Hopefully Phil and I will have time to address this tomorrow.

Dmitry wrote:

I propose to stick to our current implementation as it is now and have dwEncodedChannelCount assigned as per MS docs just to be on the safe side. Maybe it is also driver-dependent functionality.

I see. That sounds reasonable to me. Just thinking out loud here, but if the MS docs say dwEncodedChannelCount needs to be set according to the content of the SPDIF stream then obviously the channel count needs to be communicated via Pa_OpenStream somehow, irrespective of how particular drivers behave. Right now you're doing that using wasapiStreamInfo.channelMask and PopCount(channelMask). This seems OK to me. The alternatives are:

  1. use outputParameters.channelCount. Unfortunately if outputParameters.channelCount is used it makes for other confusion, because the PA callback will deal with 2 channel 16 bit frames, even though outputParameters.channelCount is set to e.g. 6.
  2. add a separate wasapiStreamInfo.encodedChannelCount field.

I'm not arguing for either of these alternatives. But I will discuss them with Phil.

dmitrykos commented 1 year ago
  1. add a separate wasapiStreamInfo.encodedChannelCount field.

@RossBencina I am in favor of adding an additional members to PaWasapiStreamInfo and make a direct link to additional parameters of WAVEFORMATEXTENSIBLE_IEC61937 as such:

typedef enum PaWasapiPassthroughFormat
{
    ePassthroughFormatDolbyDigital     = 0x00920000,
    ePassthroughFormatDolbyDigitalPlus = 0x000a0cea,
    // and so on
}
PaWasapiPassthroughFormat;

typedef struct PaWasapiStreamPassthroughInfo
{
    PaWasapiPassthroughFormat formatId;
    unsigned long encodedSamplesPerSec;
    unsigned long encodedChannelCount;
    unsigned long AverageBytesPerSec;
}
PaWasapiStreamPassthroughInfo;

typedef struct PaWasapiStreamInfo
{
    ...

    /** Stream category.
     @see PaWasapiPassthroughFormat
     @version Available as of 19.7.0
    */
    PaWasapiStreamPassthroughInfo passthroughInfo;
}
PaWasapiStreamInfo;

Then we can have just one paWinWasapiPassthrough = (1 << 7) flag which denotes presence of filled by the user PaWasapiStreamPassthroughInfo.

With such approach our implementation becomes more generic and will support all passthrough WASAPI formats specified in MS docs, not just these 2. With this we will avoid PopCount and any speculations about what encodedChannelCount shall be by delegating this task to the user.

What do you think about this proposal? If it makes sense I will update currently achieved implementation to such approach.

RossBencina commented 1 year ago

What do you think about this proposal? If it makes sense I will update currently achieved implementation to such approach.

@dmitrykos sounds good to me. I like the idea of making it more generic.

RossBencina commented 1 year ago

With regard to the data format supplied to PortAudio. I am confused about the relationship between the 24-bit 2 channel AES3/SPDIF payload (e.g. as described here and the 16-bit 2 channel IEC61937 payload required by Microsoft's API (see code example at the bottom of this MSDN page. To me it seems like Microsoft is adding at least some of the IEC61937 framing into those extra 8 bits. Which casts doubt on exactly what the format of the "SPDIF ac3" stream that must be generated by the PA callback. Can you guys clarify please?

gabys999 commented 1 year ago

I read Representing Formats for IEC 61937 Transmissions and think some details may still need to be addressed.

The latest vlc source code is as follows:

case VLC_CODEC_EAC3:
        wf->SubFormat = _KSDATAFORMAT_SUBTYPE_IEC61937_DOLBY_DIGITAL_PLUS;
        wf->Format.nChannels = 2;
        wf->dwChannelMask = KSAUDIO_SPEAKER_5POINT1;
        break;
...
    wf->Format.wFormatTag = WAVE_FORMAT_EXTENSIBLE;
    wf->Format.nSamplesPerSec = 192000;
    wf->Format.wBitsPerSample = 16;
    wf->Format.nBlockAlign = wf->Format.wBitsPerSample / 8 * wf->Format.nChannels;
    wf->Format.nAvgBytesPerSec = wf->Format.nSamplesPerSec * wf->Format.nBlockAlign;
    wf->Format.cbSize = sizeof (*wf_iec61937) - sizeof (wf->Format);

    wf->Samples.wValidBitsPerSample = wf->Format.wBitsPerSample;

    wf_iec61937->dwEncodedSamplesPerSec = audio->i_rate;
    wf_iec61937->dwEncodedChannelCount = audio->i_channels;
    wf_iec61937->dwAverageBytesPerSec = 0;

For all IEC61937-based parameter settings, vlc sets Format.nSamplesPerSec to 192000. So when I referenced its code, I also defined PA_WASAPI_EAC3_SAMPLES_PER_SEC_ (192000) in portaudio.

When Dolby Digital Plus content is transmitted over IEC 60958, the link-sampling rate must be four times the sampling rate of the content. Dolby Digital Plus supports content sample rates of 32 KHz, 44.1 KHz, and 48 KHz. Interfaces such as HDMI do not support 128 KHz (32 KHz x 4), so only 44.1 and 48 KHz content sample rates can be supported.

But this page says that DOLBY_DIGITAL_PLUS supports 44.1k and 48k content sampling rates, which means that Format.nSamplesPerSec should be set to dwEncodedSamplesPerSec * 4. Also, I'm not sure if FormatExt.dwChannelMask = KSAUDIO_SPEAKER_5POINT1 and dwEncodedChannelCount = 6 don't need to be modified, the comments on the example don't add always.

gabys999 commented 1 year ago

The audio part of my new test file is in Dolby Atmos format, I converted it to pure audio format of EAC3 via ffmpeg. When playing EAC3 audio with portaudio_test, the dolby web status shows 5.1 channels, when playing the test file with our application that has been updated to the latest version of portaudio_x64.dll, the dolby web status shows atmos. image

image

dmitrykos commented 1 year ago

@gabys999 I updated implementation and made it generic, so now it will accept all other passthrough formats. I also updated tests.

Would you please check how it works for you.

Note that sampling rate is now set externally by the user, e.g. for Dolby Digital Plus for 5.1 we specify 192000 for the PCM transport stream and passthrough.encodedSamplesPerSec gets 192000 / 4 = 48000. The same is for the channels, now it is passed via passthrough.encodedChannelCount member. As a result no guessing inside PA WASAPI and much cleaner code.

gabys999 commented 1 year ago

I tried to play dolby atmos files in spdif format with different parameters, for example:

    wasapiStreamInfo.flags = paWinWasapiExclusive | paWinWasapiUseChannelMask | paWinWasapiPassthrough;
    wasapiStreamInfo.channelMask = PAWIN_SPEAKER_7POINT1;
    wasapiStreamInfo.passthrough.formatId = ePassthroughFormatDolbyDigitalPlusAtmos;
    wasapiStreamInfo.passthrough.encodedSamplesPerSec = SAMPLE_RATE / 4;
    wasapiStreamInfo.passthrough.encodedChannelCount = 8;

Always get the wrong result WASAPI ERROR HRESULT: 0x88890008 : AUDCLNT_E_UNSUPPORTED_FORMAT

I saw a message in chromium's historical source code:

// Test Dolby Digital+ / Atmos support. For whatever reason Atmos doesn't use // the KSDATAFORMAT_SUBTYPE_IEC61937_DOLBY_DIGITAL_PLUS_ATMOS SubFormat.

I didn't find any Dolby Atmos related implementation in the latest chromium code, maybe DolbyDigitalPlus is a compromise implementation? I changed DolbyDigitalPlusAtmos to DolbyDigitalPlus and the Dolby CP850 device are able to recognize the 7.1 channels atmos data, which is the passthrough parameters currently used by our application.

dmitrykos commented 1 year ago

Because I changed DolbyDigitalPlusAtmos to DolbyDigitalPlus and the Dolby CP850 device are able to recognize the 7.1 channels atmos data, that's the way we are currently using.

It is most likely a driver issue or OS-side WASAPI bug. I double checked PA WASAPI and it seems I did not make mistake when copying value of the KSDATAFORMAT_SUBTYPE_IEC61937_DOLBY_DIGITAL_PLUS_ATMOS GUID - 0000010a-0cea-0010-8000-00aa00389b71 where we use first 16 bits of Data1 part and all 16 bits of Data2 part in the ePassthroughFormatDolbyDigitalPlusAtmos which is - 0x010a0cea.

gabys999 commented 1 year ago

I think so too, I checked the parameters and GUID when the error occurred, and I think they are not wrong. In the latest code, the test program plays ac3 and eac3 passthrough correctly, and also plays atmos with the eac3 parameters.

dmitrykos commented 1 year ago

@gabys999 thank you very much for testing, I am glad that now all works as expected! Let's wait for @RossBencina and @philburk for their final word. I think we reached quite optimal implementation in the long run.

gabys999 commented 1 year ago

I'm glad to be involved in the implementation of the passthrough feature and I think portaudio will work even better in our project. If you need me later, please @ me

dmitrykos commented 1 year ago

@RossBencina to my view we are pushing too many changes in this PR. It could be merged as-is now as it has optimal initial implementation and then it could be improved in the next PR, for example in relation to bufferSampleCount vs bufferFrameCount.

@gabys999's implementation is also working ok although it relies on the fact implicitly that data is aligned to the frame count.

gabys999 commented 1 year ago

@dmitrykos @RossBencina I apologize for only just seeing the notice. For the past while, I was dealing with another difficult issue on the project. For this issue, we changed the output environment and the Dolby device was temporarily moved away in order to solve the new problem. I'm not sure how long the current issue will take to deal with, my plan is to resolve it in one to two weeks. Next, we also need to debug the wasapi, as the testing department reported a bug in the LTC codec after updating to the new version of portaudio, as well as accessing the Dobly device again

dmitrykos commented 1 year ago

I think we really need to merge this PR as is because it was tested and is working correctly. Other things, like improvement to the tests, could be improved in the subsequent iteration in a new PR.

RossBencina commented 1 year ago

@dmitrykos @gabys999 ok, we're going to merge it, but I'm not happy with leaving the issues that I have flagged unresolved. Without clarification I don't think the tests are satisfactory as they stand, for the reasons that I have stated in my latest comments.

dmitrykos commented 1 year ago

@philburk thank you for improving the PR! I hope @gabys999 will apply changes proposed by @RossBencina in the next iteration when there is possibility to test new changes.

gabys999 commented 1 year ago

@philburk thank you for improving the PR! I hope @gabys999 will apply changes proposed by @RossBencina in the next iteration when there is possibility to test new changes.

Okay, we'll restore the Dolby device environment next week, and I'll test @RossBencina suggested code while we test the 5.1 AC3 passthrough output.