Vanilagy / mp4-muxer

MP4 multiplexer in pure TypeScript with support for WebCodecs API, video & audio.
https://vanilagy.github.io/mp4-muxer/demo
MIT License
419 stars, 32 forks

position argument in the onData callback #27

Closed RavikumarTulugu closed 9 months ago

RavikumarTulugu commented 10 months ago

I just wanted to confirm my understanding of the position argument. Our app has a slightly different requirement: the muxer runs in a worker, and the worker forwards the muxed chunks to the main thread. The actual file write happens on the main thread; we use StreamTarget with the onData callback for this.

Is it correct to assume that the 'position' argument corresponds to the file write offset, i.e. the writer just needs to seek to that position and write the chunk there?

Please advise.

Vanilagy commented 10 months ago

That is correct! For the output file to be valid, the chunks need to be written at the provided positions in the final file, and in the same order in which they are emitted by the muxer.

RavikumarTulugu commented 10 months ago

I tried the approach above, forwarding muxed chunks from the web worker to the main thread and writing them to the file. I get an unplayable file with some data in it, and my MP4 player complains about an "unrecognized" format. My muxer settings are below. I verified that every chunk is written to the file honoring the 'position' argument.

How does 'finalize' work in the case of StreamTarget? Are my muxer settings correct for my use case?

target : Mp4Muxer.StreamTarget
fastStart : false
firstTimestampBehavior : 'offset'

Please advise.
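For context, here is what the settings above might look like spelled out as a full configuration. This is a hypothetical sketch, not the poster's actual code: the video parameters are placeholders, and the positional StreamTarget constructor matches the mp4-muxer version used in this thread:

```javascript
import { Muxer, StreamTarget } from 'mp4-muxer';

// Sketch only: codec, width and height are placeholder values.
const muxer = new Muxer({
  target: new StreamTarget(
    (data, position) => {
      // Forward { position, data } to whatever performs the file write.
    },
    () => {
      // onDone: all chunks have been emitted.
    }
  ),
  video: { codec: 'avc', width: 1280, height: 720 },
  fastStart: false,
  firstTimestampBehavior: 'offset'
});
```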

RavikumarTulugu commented 10 months ago

I am no expert on MP4. The first 100 bytes of a valid MP4 file look like this:

$ od -x intro.mp4 -N 100
0000000 0000 2000 7466 7079 7369 6d6f 0000 0002
0000020 7369 6d6f 7369 326f 7661 3163 706d 3134
0000040 0500 e3b7 6f6d 766f 0000 6c00 766d 6468
0000060 0000 0000 0000 0000 0000 0000 0000 e803
0000100 0300 4088 0100 0000 0001 0000 0000 0000
0000120 0000 0000 0100 0000 0000 0000 0000 0000
0000140 0000 0000
0000144

In my case it looks like this:

$ od -x soundkone.net_recording_2023-12-12\ 15_22_05.mp4 -N 100
0000000 0000 1c00 7466 7079 7369 6d6f 0000 0000
0000020 7369 6d6f 7661 3163 706d 3134 0000 0100
0000040 646d 7461 0000 0000 0000 1000 fff8 f8fe
0000060 feff fff8 f8fe feff fff8 f8fe feff fff8
0000100 f8fe feff fff8 f8fe feff fff8 f8fe feff
0000120 fff8 f8fe feff fff8 f8fe feff fff8 f8fe
0000140 feff fff8
0000144

Vanilagy commented 10 months ago

To see if there's something wrong with your config, switch out StreamTarget for ArrayBufferTarget and send over one large buffer once muxing is complete. If that works, then there's likely something wrong with how you write the chunks created by the StreamTarget. Are you absolutely certain you are writing the chunks in-order, and at the correct offsets? I've tested this feature time and time again and it should work correctly, so I doubt it's a bug in my library.

If you could share the code you use to write out the file, I could help you.

RavikumarTulugu commented 10 months ago

Thanks, let me try that and get back to you.

RavikumarTulugu commented 10 months ago

I tested with ArrayBufferTarget and it works fine. It seems the bug is in our code.

Vanilagy commented 10 months ago

Have you found it yet? Writing files can be tricky; sometimes the seek doesn't actually happen.

RavikumarTulugu commented 10 months ago

Hi, can you please try this at your end as well? After spending a couple of days on it, I am still not able to figure out why such a simple thing won't work. Let me describe our architecture and the things I have tried at my end.

We send muxed chunks from the worker to the front end over a message channel to be written to disk; we had to move the muxer into the worker because we were not getting the desired frame rate otherwise. The offset (position) and the data (an ArrayBuffer) are both sent to the front end, and the data is written to the file at that offset. I have tried FileSystemWritableFileStreamTarget and ArrayBufferTarget; both work flawlessly.

If you try this at your end, I guess you will see it too.

From the onData callback of the StreamTarget, send the data to the front end over a message channel and write it to the file system. This is actually the general use case: any serious app will do the muxing in a worker, not on the front end.

I will check whether our code has any issues.

Vanilagy commented 10 months ago

I'm sorry this is causing you trouble. There's a definitive test you can run to check whether this is an issue with StreamTarget or with your file writing code. I always use this test to ensure it works correctly.

This is the test:

let buffer = new ArrayBuffer(0, { maxByteLength: 1e9 });
let bytes = new Uint8Array(buffer);

let muxer = new Mp4Muxer.Muxer({
    target: new Mp4Muxer.StreamTarget(
        (data, position) => {
            buffer.resize(position + data.byteLength);
            bytes.set(data, position);
        },
        () => {
            downloadBlob(new Blob([buffer.slice()]));
        }
    ),
    // ...
});

const downloadBlob = (blob: Blob) => {
    let url = window.URL.createObjectURL(blob);
    let a = document.createElement('a');
    a.style.display = 'none';
    a.href = url;
    a.download = 'davinci.mp4';
    document.body.appendChild(a);
    a.click();
    window.URL.revokeObjectURL(url);
};

It uses the StreamTarget to write directly into a Uint8Array, which simulates writing to a file. The final buffer is then downloaded.

Try using this code and sending the final buffer over to your frontend to download. If it works, then we know StreamTarget is not the culprit here. This code also shows nicely how writing is meant to happen, regarding byte offsets and so on.

RavikumarTulugu commented 10 months ago

Thanks for the input. I tried sending a large buffer written by the StreamTarget to the main thread, and it produced a perfect file. It is now clear that the library has no issues. Regards,

Vanilagy commented 10 months ago

How are you writing to disk, like what API are you using?

RavikumarTulugu commented 10 months ago

I am using writable.write with an offset.

Vanilagy commented 10 months ago

Maybe this can help: https://github.com/Vanilagy/webm-muxer/issues/25#issuecomment-1732441066

RavikumarTulugu commented 10 months ago

[screenshot: hexdump comparison of the first 16 bytes of each packet]

Please look at the screenshot of the first 16 bytes of each packet returned by the onData callback. Everything works fine until line 5 in the image; from line 5 on, the packets go out of sync. 'orig' is the packet given by the library and 'copy' is a copied packet; as you can see, they go out of sync for a few packets such as moov, mdat, etc.

console.log is an asynchronous function: it keeps a reference to the variables it is given and expands them only when it prints. So 'copy' is the actual data, while 'orig' may have been modified by the library by the time it gets printed.

'copy' stays out of sync with 'orig', and while all the media data is intact, the segment headers never get written to the final file. This segment information is written in the finalize step, which is never seen by the onData callback. I am collecting data up to the final packet, so this is something to do with the finalize step, I guess. If the logic is shared between webm-muxer and mp4-muxer, then we likely have a similar issue in the WebM code as well.

For reference, here is the code of the onData callback:

let target = null;
let sendChunkToRecorder = ( data, position ) => {
  let sview = new DataView ( data.buffer );
  let abuf = new ArrayBuffer ( data.length );
  let dview = new DataView ( abuf );
  for ( let i = 0; i < data.length; i++ ) {
    dview.setUint8 ( i, sview.getUint8 ( i ) );
  }
  //console.log ( '=> position : ', position, ' data.len : ', abuf.byteLength );
  hexdump ( 'orig:', data.subarray ( 0, 16 ) );
  hexdump ( 'copy:', new Uint8Array ( abuf, 0, abuf.byteLength < 16 ? abuf.byteLength : 16 ) );
  recorderInputPort.postMessage ({ offset : position, data : abuf }, [ abuf ]);
};
if ( recordingContainerFormat == 'mp4' ) {
  target = new Mp4Muxer.StreamTarget (
    ( data, position ) => {
      sendChunkToRecorder ( data, position );
    }, //onData
    () => {
    }, //onDone
    { chunked : true, chunkSize : 4 * 1024 }
  );
}

RavikumarTulugu commented 10 months ago

Reopening the issue for further investigation.

Vanilagy commented 10 months ago

I see there may be an issue here with incorrect reuse of buffers, meaning that buffer contents change after they are emitted by the onData callback (I can't confirm this yet).

If you copy the buffer immediately after receiving it from the StreamTarget, and then write those copies into a file, is the resulting file correct? It should be, right? Since then you'd be doing exactly what my test above does.

Vanilagy commented 10 months ago

I doubt the data emitted by the StreamTarget changes, though. When new data is added to the StreamTarget internally, I do this:

[screenshot of the internal StreamTarget write code]

The data is duplicated using slice and can't change afterwards. Also, your statement about console.log being asynchronous isn't exactly right; it is synchronous, because it has to evaluate its arguments at the time it is called. You were probably referring to the "rich" logs, where an object can be expanded by clicking an arrow; those are indeed evaluated only when you expand them. Unless your hexdump is async, that is; I can't say, because you haven't included it.

I also find it strange that it is your copies which are all zeroes. Shouldn't the copies contain proper data while the original gets messed up? I also notice that your copies are only all zeroes when they are shorter than 16 bytes, so what's up there? Strange stuff indeed.

Maybe it has something to do with postMessage. Since you're transferring the buffer, the buffer will get "detached" on the transferring side, meaning it gets truncated to length 0. Then, when you make a Uint8Array out of it, it displays all zeroes.

RavikumarTulugu commented 9 months ago

Thanks for the reply. I can wait until the end of the holidays; please enjoy them.

Coming to the analysis, please look at the points below.

i) The code works fine for all packets and fails only for the last few. Both 'copy' and 'orig' are the same byte for byte; only the last few packets go out of sync, namely trak, mdia, stbl, minf, and moov.

ii) Yes, the logs I was referring to are rich logs, which contain complex objects, so these are expanded lazily, I guess. I might be wrong, though; there is no clear evidence for this on the internet, and every browser does it differently.

iii) The buffer sent through postMessage is allocated afresh every time and is sent to the consumer as a transferred buffer (abuf enclosed in []'s): recorderInputPort.postMessage ({ offset : position, data : abuf }, [ abuf ]);

Below is a new dump showing the first 8 bytes of each buffer. Again, the same packets are all zeroes; only particular packets are affected (they are enclosed in yellow squares).

[screenshot: hexdump with the all-zero packets highlighted]

Here is the code for hexdump; I just lifted it off the internet.

function hexdump( label, uint8Array ) {
  const chunkSize = 8;
  for (let i = 0; i < uint8Array.length; i += chunkSize) {
    const chunk = uint8Array.subarray(i, i + chunkSize);
    const hexValues = Array.from(chunk, byte => byte.toString(16).padStart(2, '0')).join(' ');
    const asciiValues = Array.from(chunk, byte => (byte >= 32 && byte <= 126) ? String.fromCharCode(byte) : '.').join('');
    console.log(
      `${label} ${i.toString(16).padStart(8, '0')} | ${hexValues.padEnd(chunkSize * 3 - 1)} | ${asciiValues}`
    );
  }
}

RavikumarTulugu commented 9 months ago

Adding one more line in the onData callback, basically taking a copy of the data and using it, solved the problem. Pasting the new callback for reference:

let sendChunkToRecorder = ( data, position ) => {
  let data2 = new Uint8Array ( data );
  let sview = new DataView ( data2.buffer );
  let abuf = new ArrayBuffer ( data2.length );
  let dview = new DataView ( abuf );
  for ( let i = 0; i < data2.length; i++ ) {
    dview.setUint8 ( i, sview.getUint8 ( i ) );
  }
  //console.log ( '=> position : ', position, ' data.len : ', abuf.byteLength );
  hexdump ( 'orig:', data2.subarray ( 0, 8 ) );
  hexdump ( 'copy:', new Uint8Array ( abuf, 0, abuf.byteLength < 8 ? abuf.byteLength : 8 ) );
  recorderInputPort.postMessage ({ offset : position, data : abuf }, [ abuf ]);
};

Vanilagy commented 9 months ago

Glad this works; strange that it didn't before. I think you could simplify your copying logic:

let sendChunkToRecorder = ( data, position ) => {
  // Copy only the bytes the view covers, not the whole backing buffer.
  let abuf = data.buffer.slice ( data.byteOffset, data.byteOffset + data.byteLength );
  recorderInputPort.postMessage ({ offset : position, data : abuf }, [ abuf ]);
};