StefH / Matroska

An Ebml based serializer to deserialize a Matroska file (.mkv or .webm)
MIT License

Unable to deserialize webm from audio recorded in Chrome #16

Open FossaMalaEnt opened 11 months ago

FossaMalaEnt commented 11 months ago

Hi,

First of all, I want to thank you for the time you're putting into this project!

I'm trying to record audio in Chrome, send the WebM Opus stream to .NET, convert it to another format, and forward the audio to another system. To capture the audio I'm using something like this:

        const stream = await navigator.mediaDevices.getUserMedia({ audio: true });

        this.mediaRecorder = new MediaRecorder(stream, {
            mimeType: 'audio/webm; codecs=opus'
        });

        this.mediaRecorder.start();

        this.isRecording = true;

        console.log("Starting media recorder ", this.mediaRecorder);

        this.mediaRecorder.ondataavailable = (e) => {
            //Do things, convert bytes to base64 and send the audio to the .NET back-end
        }
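
The "convert bytes to base64" step in the handler above is not shown in the post. A minimal sketch of one way to do it follows; `bytesToBase64` and `blobToBase64` are hypothetical helper names, not part of the original code, and the chunk size is an arbitrary choice.

```javascript
// Hypothetical helper: encode raw bytes as base64.
// The binary string is built in chunks so that String.fromCharCode is
// never called with a very large argument list.
function bytesToBase64(bytes) {
    const CHUNK = 0x8000; // arbitrary chunk size (assumption)
    let binary = "";
    for (let i = 0; i < bytes.length; i += CHUNK) {
        binary += String.fromCharCode.apply(null, bytes.subarray(i, i + CHUNK));
    }
    return btoa(binary);
}

// Hypothetical helper: read a recorded Blob and return its base64 form.
async function blobToBase64(blob) {
    const buffer = await blob.arrayBuffer();
    return bytesToBase64(new Uint8Array(buffer));
}
```

Inside `ondataavailable`, this could be called as `const b64 = await blobToBase64(e.data);` (with the handler declared `async`). A WebM payload encoded this way should begin with `GkXfo`, the base64 form of the EBML magic bytes `1A 45 DF A3`, which gives a quick way to check on the .NET side that the data arrived intact.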

Now, following the examples, I'm using this code to extract the Opus audio from the WebM stream:

        string b64Audio = audioPkt.Audio;
        byte[] webmBytes = Convert.FromBase64String(b64Audio);
        MemoryStream webmStream = new MemoryStream(webmBytes);

        var oggStream = new MemoryStream();

        MatroskaDemuxer.ExtractOggOpusAudio(webmStream, oggStream);

I've confirmed that the WebM bytes are identical to those recorded in JS, so the data is unchanged, but the line MatroskaDemuxer.ExtractOggOpusAudio(webmStream, oggStream); throws the following error:

System.ArgumentNullException: 'Value cannot be null. Arg_ParamName_Name'

Stack trace:

at Matroska.Muxer.OggOpus.OggOpusAudioStreamDemuxer.CopyTo(MatroskaDocument doc, Stream outputStream, OggOpusAudioStreamDemuxerSettings settings)
at Matroska.Muxer.MatroskaDemuxer.ExtractOggOpusAudio(MatroskaDocument doc, Stream outputStream, OggOpusAudioStreamDemuxerSettings settings)
at PROJECT_NAME.API.Hubs.LiveDataHub.SendAudio(ReceiveAudioDTO audioPkt) in PATH_TO_PROJECT\PROJECT_NAME.API\Hubs\LiveDataHub.cs:line 70
at Microsoft.Extensions.Internal.ObjectMethodExecutor.<>c__DisplayClass33_0.b0(Object target, Object[] parameters)
at Microsoft.AspNetCore.SignalR.Internal.DefaultHubDispatcher`1.d23.MoveNext()

I can't work out the cause of the problem from the examples and the docs in the repo. Am I missing something? Any idea what the issue could be?

StefH commented 11 months ago

Did you try the same code using a valid example webm file?

FossaMalaEnt commented 11 months ago

No. If you need this test to confirm that your package works on my machine, I'll do it immediately, but my requirement is to extract the Opus audio stream from WebM Opus audio recorded in Chrome. Is your package supposed to work only when reading WebM files (FileStream)?

StefH commented 11 months ago

Can you try this file? Estas Tonne - Internal Flight Experience (Live in Cluj Napoca).webm

FossaMalaEnt commented 11 months ago

I just tried the file you provided, plus another file I converted manually with an online converter, and both work fine. The problem appears only with audio data streamed from JS using the MediaRecorder API.

In case it's useful, this is the MatroskaDocument of your WebM file once deserialized:

{
    "Ebml": {
        "DocType": "webm",
        "DocTypeExtension": null,
        "DocTypeReadVersion": 2,
        "DocTypeVersion": 4,
        "EBMLMaxIDLength": 4,
        "EBMLMaxSizeLength": 8,
        "EBMLReadVersion": 1,
        "EBMLVersion": 1,
        "Void": null,
        "CRC32": null
    },
    "Segment": {
        "Attachments": null,
        "Chapters": null,
        "Clusters": [Too long
        ],
        "Cues": {
            "CuePoints": [Too long
            ],
            "Void": null,
            "CRC32": null
        },
        "Info": {
            "ChapterTranslate": null,
            "DateUTC": null,
            "Duration": 419941.0,
            "MuxingApp": "google/video-file",
            "NextFilename": null,
            "NextUID": null,
            "PrevFilename": null,
            "PrevUID": null,
            "SegmentFamily": null,
            "SegmentFilename": null,
            "SegmentUID": null,
            "TimestampScale": 1000000,
            "Title": null,
            "WritingApp": "google/video-file",
            "Void": null,
            "CRC32": null
        },
        "SeekHead": {
            "Seek": {
                "SeekID": "HFO7aw==",
                "SeekPosition": 218,
                "Void": null,
                "CRC32": null
            },
            "Void": null,
            "CRC32": null
        },
        "Tags": null,
        "Tracks": {
            "TrackEntries": [
                {
                    "AttachmentLink": null,
                    "Audio": {
                        "BitDepth": 16,
                        "ChannelPositions": null,
                        "Channels": 2,
                        "OutputSamplingFrequency": null,
                        "SamplingFrequency": 48000.0,
                        "Void": null,
                        "CRC32": null
                    },
                    "BlockAdditionMapping": null,
                    "CodecDecodeAll": 0,
                    "CodecDelay": 6500000,
                    "CodecDownloadURL": null,
                    "CodecID": "A_OPUS",
                    "CodecInfoURL": null,
                    "CodecName": null,
                    "CodecPrivate": "T3B1c0hlYWQBAjgBgLsAAAAAAA==",
                    "CodecSettings": null,
                    "ContentEncodings": null,
                    "DefaultDecodedFieldDuration": null,
                    "DefaultDuration": null,
                    "FlagDefault": 0,
                    "FlagEnabled": 0,
                    "FlagForced": 0,
                    "FlagLacing": 0,
                    "Language": "eng",
                    "LanguageIETF": null,
                    "MaxBlockAdditionID": 0,
                    "MaxCache": null,
                    "MinCache": 0,
                    "Name": null,
                    "SeekPreRoll": 80000000,
                    "TrackNumber": 1,
                    "TrackOffset": null,
                    "TrackOperation": null,
                    "TrackOverlay": null,
                    "TrackTimestampScale": 0.0,
                    "TrackTranslate": null,
                    "TrackType": 2,
                    "TrackUID": 27231307933070081,
                    "TrickMasterTrackSegmentUID": null,
                    "TrickMasterTrackUID": null,
                    "TrickTrackFlag": null,
                    "TrickTrackSegmentUID": null,
                    "TrickTrackUID": null,
                    "Video": null,
                    "Void": null,
                    "CRC32": null
                }
            ],
            "Void": null,
            "CRC32": null
        },
        "Void": null,
        "CRC32": null
    }
}
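
The CodecPrivate value in the dump above is the base64 form of an Opus identification header ("OpusHead"). As a sanity check on the track metadata, here is a minimal sketch that decodes it, assuming the field layout of RFC 7845 section 5.1; `parseOpusHead` is a hypothetical name and not part of the Matroska package.

```javascript
// Sketch: decode the OpusHead identification header carried in CodecPrivate.
// Field layout per RFC 7845, section 5.1 (all multi-byte fields little-endian).
function parseOpusHead(base64) {
    const binary = atob(base64);
    if (binary.length < 19 || binary.slice(0, 8) !== "OpusHead") {
        throw new Error("not an OpusHead header");
    }
    // atob yields one character per byte, so char codes are the raw bytes.
    const bytes = Uint8Array.from(binary, (c) => c.charCodeAt(0));
    const view = new DataView(bytes.buffer);
    return {
        version: view.getUint8(8),
        channels: view.getUint8(9),
        preSkip: view.getUint16(10, true),         // samples at 48 kHz
        inputSampleRate: view.getUint32(12, true), // Hz
        outputGain: view.getInt16(16, true),       // Q7.8 dB
        mappingFamily: view.getUint8(18),
    };
}

const head = parseOpusHead("T3B1c0hlYWQBAjgBgLsAAAAAAA==");
console.log(head.channels, head.preSkip, head.inputSampleRate);
// prints: 2 312 48000
```

A pre-skip of 312 samples at 48 kHz is 6.5 ms, which agrees with the CodecDelay of 6500000 ns in the same track entry.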

And this is the same object for a WebM audio stream recorded in Chrome:

{
    "Ebml": {
        "DocType": "webm",
        "DocTypeExtension": null,
        "DocTypeReadVersion": 2,
        "DocTypeVersion": 4,
        "EBMLMaxIDLength": 4,
        "EBMLMaxSizeLength": 8,
        "EBMLReadVersion": 1,
        "EBMLVersion": 1,
        "Void": null,
        "CRC32": null
    },
    "Segment": {
        "Attachments": null,
        "Chapters": null,
        "Clusters": null,
        "Cues": null,
        "Info": {
            "ChapterTranslate": null,
            "DateUTC": null,
            "Duration": null,
            "MuxingApp": "Chrome",
            "NextFilename": null,
            "NextUID": null,
            "PrevFilename": null,
            "PrevUID": null,
            "SegmentFamily": null,
            "SegmentFilename": null,
            "SegmentUID": null,
            "TimestampScale": 1000000,
            "Title": null,
            "WritingApp": "Chrome",
            "Void": null,
            "CRC32": null
        },
        "SeekHead": null,
        "Tags": null,
        "Tracks": {
            "TrackEntries": [
                {
                    "AttachmentLink": null,
                    "Audio": {
                        "BitDepth": 32,
                        "ChannelPositions": null,
                        "Channels": 1,
                        "OutputSamplingFrequency": null,
                        "SamplingFrequency": 48000.0,
                        "Void": null,
                        "CRC32": null
                    },
                    "BlockAdditionMapping": null,
                    "CodecDecodeAll": 0,
                    "CodecDelay": null,
                    "CodecDownloadURL": null,
                    "CodecID": "A_OPUS",
                    "CodecInfoURL": null,
                    "CodecName": null,
                    "CodecPrivate": "T3B1c0hlYWQBAQAAgLsAAAAAAA==",
                    "CodecSettings": null,
                    "ContentEncodings": null,
                    "DefaultDecodedFieldDuration": null,
                    "DefaultDuration": null,
                    "FlagDefault": 0,
                    "FlagEnabled": 0,
                    "FlagForced": 0,
                    "FlagLacing": 0,
                    "Language": null,
                    "LanguageIETF": null,
                    "MaxBlockAdditionID": 0,
                    "MaxCache": null,
                    "MinCache": 0,
                    "Name": null,
                    "SeekPreRoll": 0,
                    "TrackNumber": 1,
                    "TrackOffset": null,
                    "TrackOperation": null,
                    "TrackOverlay": null,
                    "TrackTimestampScale": 0.0,
                    "TrackTranslate": null,
                    "TrackType": 2,
                    "TrackUID": 8083526733988260,
                    "TrickMasterTrackSegmentUID": null,
                    "TrickMasterTrackUID": null,
                    "TrickTrackFlag": null,
                    "TrickTrackSegmentUID": null,
                    "TrickTrackUID": null,
                    "Video": null,
                    "Void": null,
                    "CRC32": null
                }
            ],
            "Void": null,
            "CRC32": null
        },
        "Void": null,
        "CRC32": null
    }
}

FossaMalaEnt commented 11 months ago

I just looked at the console and noticed the following error (it looks different from the one in the debugger, I don't know why):

ERROR: Segment at position 146 not mapped. Exception: NEbml.Core.EbmlDataFormatException: invalid element size value
at NEbml.Core.EbmlReader.ReadNext()
at Matroska.MatroskaSerializer.Deserialize(Type type, EbmlReader reader)
fail: Microsoft.AspNetCore.SignalR.Internal.DefaultHubDispatcher[8] Failed to invoke hub method 'SendAudio'.
System.ArgumentNullException: Value cannot be null. (Parameter 'Clusters')
at Matroska.Muxer.OggOpus.OggOpusAudioStreamDemuxer.CopyTo(MatroskaDocument doc, Stream outputStream, OggOpusAudioStreamDemuxerSettings settings)
at Matroska.Muxer.MatroskaDemuxer.ExtractOggOpusAudio(MatroskaDocument doc, Stream outputStream, OggOpusAudioStreamDemuxerSettings settings)
at PROJECT_NAME.API.Hubs.LiveDataHub.SendAudio(ReceiveAudioDTO audioPkt) in PATH_TO_PROJECT\PROJECT_NAME\PROJECT_NAME.API\Hubs\LiveDataHub.cs:line 72
at Microsoft.Extensions.Internal.ObjectMethodExecutor.<>c__DisplayClass33_0.b0(Object target, Object[] parameters)
at Microsoft.AspNetCore.SignalR.Internal.DefaultHubDispatcher1.ExecuteMethod(ObjectMethodExecutor methodExecutor, Hub hub, Object[] arguments)
at Microsoft.AspNetCore.SignalR.Internal.DefaultHubDispatcher1.gExecuteInvocation|18_0(DefaultHubDispatcher1 dispatcher, ObjectMethodExecutor methodExecutor, THub hub, Object[] arguments, AsyncServiceScope scope, IHubActivator1 hubActivator, HubConnectionContext connection, HubMethodInvocationMessage hubMethodInvocationMessage, Boolean isStreamCall)
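
One thing that may be worth checking here, offered as an assumption since nothing in this thread confirms it: EBML (RFC 8794) allows an element to declare an "unknown" data size, encoded as a size field whose value bits are all 1, and muxers that write a file while it is still being captured often use this for the Segment and Cluster elements. A reader that rejects that encoding could plausibly report "invalid element size value". The sketch below reads a single element header so the size field can be inspected by hand; `readElementHeader` is a hypothetical name and not the NEbml or Matroska API.

```javascript
// Sketch: read one EBML element header (ID + data size) at a byte offset.
function readElementHeader(bytes, offset) {
    // Element ID: the number of leading zero bits in the first byte gives
    // the length (1 to 4 bytes); the marker bit is kept as part of the ID.
    const idLength = Math.clz32(bytes[offset]) - 24 + 1;
    if (idLength > 4) throw new Error("invalid element ID");
    let id = 0;
    for (let i = 0; i < idLength; i++) id = id * 256 + bytes[offset + i];

    // Data size: same length rule (1 to 8 bytes), but the marker bit is
    // stripped from the value.
    const sizeStart = offset + idLength;
    const first = bytes[sizeStart];
    const sizeLength = Math.clz32(first) - 24 + 1;
    if (sizeLength > 8) throw new Error("invalid element size");
    const mask = 0xff >> sizeLength;
    let size = BigInt(first & mask);
    let allOnes = (first & mask) === mask;
    for (let i = 1; i < sizeLength; i++) {
        const b = bytes[sizeStart + i];
        size = size * 256n + BigInt(b);
        allOnes = allOnes && b === 0xff;
    }
    return {
        id,
        idLength,
        sizeLength,
        unknownSize: allOnes,          // all value bits set: size is unknown
        size: allOnes ? null : size,   // BigInt, since sizes can exceed 2^53
        dataOffset: sizeStart + sizeLength,
    };
}

// Segment ID (0x18538067) followed by an 8-byte unknown-size field.
const segmentHeader = new Uint8Array([0x18, 0x53, 0x80, 0x67, 0x01, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff]);
console.log(readElementHeader(segmentHeader, 0).unknownSize);
// prints: true
```

Running this against the recorded bytes at the position named in the log would show whether the Segment size field is in fact the unknown-size marker.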

StefH commented 11 months ago

Can you provide your file here as attachment?

FossaMalaEnt commented 11 months ago

I used this code to save the file after recording the audio:

        this.mediaRecorder.ondataavailable = (e) => {
            const a = document.createElement("a");
            document.body.appendChild(a);
            const url = window.URL.createObjectURL(e.data);
            a.href = url;
            a.download = "test.webm";
            a.click();
            window.URL.revokeObjectURL(url);
            document.body.removeChild(a);

            //...
        }

Here is the test file recorded and saved with the code above: test.webm

The problem persists when I try to extract the Ogg audio from this WebM file, although the default media player on Windows 10/11 plays the audio without problems.
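
Since a media player accepts the file while the deserialized document reports Clusters as null, a quick byte-level check can show whether the recording contains Cluster elements at all. The sketch below is a heuristic only: it searches for the raw ID bytes and could in principle match the same four bytes inside audio data. `inspectWebm` and `indexOfBytes` are hypothetical names.

```javascript
// Well-known EBML/Matroska element IDs.
const EBML_MAGIC = [0x1a, 0x45, 0xdf, 0xa3]; // EBML header
const CLUSTER_ID = [0x1f, 0x43, 0xb6, 0x75]; // Cluster

// Return the index of the first occurrence of needle in haystack, or -1.
function indexOfBytes(haystack, needle) {
    outer: for (let i = 0; i + needle.length <= haystack.length; i++) {
        for (let j = 0; j < needle.length; j++) {
            if (haystack[i + j] !== needle[j]) continue outer;
        }
        return i;
    }
    return -1;
}

// Heuristic summary of a recorded WebM byte array.
function inspectWebm(bytes) {
    return {
        startsWithEbmlHeader: indexOfBytes(bytes, EBML_MAGIC) === 0,
        firstClusterOffset: indexOfBytes(bytes, CLUSTER_ID), // -1 if none found
    };
}
```

For a Blob from the recorder, `inspectWebm(new Uint8Array(await e.data.arrayBuffer()))` returns the two fields. A non-negative `firstClusterOffset` would suggest the audio data is present and that the parser stopped before reaching it.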

StefH commented 11 months ago

It could be that this file follows a newer EBML / Matroska specification.

I'll take a look to see if I can update the code...

StefH commented 11 months ago

@FossaMalaEnt I've made a fix.

Can you try version 0.0.11-preview-01?

StefH commented 11 months ago

@FossaMalaEnt did you have time to test that preview version?

FossaMalaEnt commented 10 months ago

@StefH, sorry for the delay. I downloaded the code from the "issue16" branch to give it a try, but I'm having trouble compiling it in both VS 2019 and 2022. It seems that VS is unable to locate the "Span<>" class. Have you published the DLL somewhere?

StefH commented 10 months ago

I did push version 0.0.11-preview-01.

[image]

Make sure to enable preview in VS: [image]

StefH commented 6 months ago

@FossaMalaEnt Did you have time to validate that preview version?

StefH commented 1 month ago

@FossaMalaEnt Did you have time to validate that preview version?