mbebenita / Broadway

A JavaScript H.264 decoder.

Large garbage collections while streaming #195

Open madiganz opened 5 years ago

madiganz commented 5 years ago

I am not sure if there is anything that can be done about this, but I am successfully showing live video by decoding H.264 using Player.js. The video data comes in via WebSocket at 30 fps. Frequently, a garbage collection occurs that takes around 100 ms (sometimes even longer), during which no frames are rendered to the canvas. I am currently using 4 players on one page and initialize the players using:

new Player({
    useWorker: true,
    reuseMemory: true
})

I have recently been experimenting, and it looks like the reuseMemory config option is helping. Is this the correct approach? Is there anything else I can do to help with this problem?

soliton4 commented 5 years ago

you have to follow the lifetime of the binary arrays when you use reuseMemory. this is not an option i am officially supporting right now

madiganz commented 5 years ago

Okay, can you think of another way to reduce the number of garbage collections?

soliton4 commented 5 years ago

i think you are on the right track. there is one binary variable for every frame. reusing those will reduce the gc

madiganz commented 5 years ago

Makes sense. I will try modifying the player/decoder code to see if I can achieve what I want.

madiganz commented 5 years ago

I am looking at the transferMemory option, specifically these lines of code:

if (this._config.transferMemory){
    this.decode = function(parData, parInfo){
        // no copy
        // instead we are transfering the ownership of the buffer
        // dangerous!!!
        worker.postMessage({buf: parData.buffer, offset: parData.byteOffset, length: parData.length, info: parInfo}, [parData.buffer]); // Send data to our worker.
    };   
}else{
    this.decode = function(parData, parInfo){
        // Copy the sample so that we only do a structured clone of the
        // region of interest
        var copyU8 = new Uint8Array(parData.length);
        copyU8.set( parData, 0, parData.length );
        worker.postMessage({buf: copyU8.buffer, offset: 0, length: parData.length, info: parInfo}, [copyU8.buffer]); // Send data to our worker.
    };
};

The else branch runs when transferMemory is false, but according to https://developer.mozilla.org/en-US/docs/Web/API/Worker/postMessage, the objects listed in the array passed as the 2nd parameter of postMessage() have their ownership transferred. According to this, both paths transfer ownership of an underlying ArrayBuffer: the first transfers the caller's buffer, the second transfers the buffer of the copy. Is this correct?

I also notice:

// buffer needs to be copied because we give up ownership
var copyU8 = new Uint8Array(getMem(buffer.length));
copyU8.set( buffer, 0, buffer.length );

Based on my research, the buffer does NOT need to be copied when ownership is given up. When the buffer is transferred, the sender no longer has access to the data.
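The transfer behaviour described above can be checked in isolation. The following is a minimal sketch, not Broadway code: it uses structuredClone() as a single-threaded stand-in for worker.postMessage(), since both accept a transfer list, and the frame bytes are made up for illustration. It shows that after a transfer the sender's buffer is detached, so a copy is only needed when the sender still wants the bytes afterwards.

```javascript
// Sketch: transferring an ArrayBuffer detaches it on the sending side.
// structuredClone() stands in for worker.postMessage() so this runs in one
// thread; the byte values below are hypothetical sample data.
const frame = new Uint8Array([0, 0, 0, 1, 0x65]);
const originalBuffer = frame.buffer;

// Transfer ownership instead of copying.
const received = structuredClone(originalBuffer, { transfer: [originalBuffer] });

// The sender's buffer is now detached: zero length, and so are its views.
const senderByteLength = originalBuffer.byteLength;
const senderViewLength = frame.length;

// The receiver sees the original bytes without any copy being made by us.
const receiverBytes = Array.from(new Uint8Array(received));

console.log(senderByteLength, senderViewLength, receiverBytes);
```

If the sender needs to keep reading the data after posting it, it has to copy first and transfer the copy, which is what the else branch quoted earlier does.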

soliton4 commented 5 years ago

yup

madiganz commented 5 years ago

Was this code not optimized then? I am slightly lost on where to reuse the ArrayBuffers. getMem() in Decoder.js will actually create a new ArrayBuffer if one is not available, but we could just use the ArrayBuffer that was sent in postMessage().

soliton4 commented 5 years ago

i think you are on the right track. it's simple logic that just spans 2 threads

soliton4 commented 5 years ago

actually i just thought of something. if the reuse of arraybuffers works for you across threads and you still see a big gc, then perhaps it's the opengl textures instead. you could write a similar reuse for them, it's even simpler since it stays in the same thread. if you contribute that code back you would make a huge contribution to this project and earn at least one hero point

madiganz commented 5 years ago

Let me see if I understand this correctly. In my case, I am using a WebSocket to receive a frame, so an ArrayBuffer is created when I receive the frame. This means that there will always be one ArrayBuffer per frame. The trick is then to reuse this single ArrayBuffer (for each frame) throughout the entire process. From what I can tell, just because the worker is on a different thread does not mean it needs to create a new ArrayBuffer when it receives a message; it can keep using the same ArrayBuffer that was sent via the postMessage call.

It seems like the trick will be to never create a new ArrayBuffer, and instead, create typed array views of the ArrayBuffer.
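As a minimal sketch of what "views instead of new buffers" means (the sizes and offsets here are arbitrary, chosen only for illustration): a typed array constructed over an existing ArrayBuffer, or obtained with subarray(), shares memory with it, while slice() allocates a copy.

```javascript
// Sketch: views share memory with their ArrayBuffer; slice() copies.
const buffer = new ArrayBuffer(16);

// Two views over the same memory, no allocation of frame data.
const whole = new Uint8Array(buffer);
const payload = new Uint8Array(buffer, 4, 8); // byteOffset 4, length 8

payload[0] = 42;
const seenThroughWhole = whole[4]; // same byte, seen through the other view

// subarray() also returns a view; slice() returns an independent copy.
const view = whole.subarray(4, 12);
const copy = whole.slice(4, 12);

payload[1] = 7;
const seenThroughView = view[1]; // reflects the write
const seenThroughCopy = copy[1]; // unchanged, the copy was taken earlier

console.log(seenThroughWhole, seenThroughView, seenThroughCopy);
```

Creating a view is cheap but still allocates a small wrapper object, so it reduces garbage rather than eliminating it.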

soliton4 commented 5 years ago

not quite. the websocket will most likely always create a new array buffer. those buffers also have different sizes if it's pure annex b

where you can reuse memory is in the output buffers. they always have the same size and are constantly created by the decoding thread, unless you send them back to be reused

madiganz commented 5 years ago

That is what I meant about the WebSocket part. It will always create an ArrayBuffer. In my scenario, I am running it through the decoder and then drawing it on a canvas using Player.js. I believe you are saying that after the decode is done, the output ArrayBuffer will always be the same size?

If this is true, then the input to the decoder will need to be an ArrayBuffer for each frame. When the decoder sends the output ArrayBuffer back to the player, the player would then send that same ArrayBuffer back for reuse. If they are always the same size, wouldn't it be better to have an ArrayBuffer pool?
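A fixed-size pool along those lines could look like the sketch below. This is not Broadway's getMem() implementation; the BufferPool class, its method names, and the 640x480 YUV 4:2:0 frame size are all assumptions made for illustration.

```javascript
// Sketch of a fixed-size ArrayBuffer pool (hypothetical, not Broadway's API).
class BufferPool {
  constructor(byteLength, maxFree = 8) {
    this.byteLength = byteLength; // every pooled buffer has this size
    this.maxFree = maxFree;       // cap on idle buffers kept around
    this.free = [];
    this.allocated = 0;           // how many buffers were ever created
  }

  // Hand out an idle buffer if one exists, otherwise allocate a new one.
  acquire() {
    const idle = this.free.pop();
    if (idle !== undefined) return idle;
    this.allocated += 1;
    return new ArrayBuffer(this.byteLength);
  }

  // Take a buffer back. Wrong-size buffers (including detached ones, whose
  // byteLength is 0) are rejected and left for the garbage collector.
  release(buffer) {
    if (buffer.byteLength !== this.byteLength) return false;
    if (this.free.length >= this.maxFree) return false;
    this.free.push(buffer);
    return true;
  }
}

// Assumed frame size: 640x480 in YUV 4:2:0 (width * height * 3 / 2 bytes).
const pool = new BufferPool(640 * 480 * 3 / 2);

const first = pool.acquire();
pool.release(first);
const second = pool.acquire(); // the same buffer comes back, nothing new allocated

console.log(first === second, pool.allocated);
```

With a worker in between, the buffer handed to release() would be the one the main thread transfers back after drawing the frame, so ownership has to be tracked on both sides.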

madiganz commented 5 years ago

> where you can reuse memory is in the output buffers. they always have the same size and are constantly created by the decoding thread, unless you send them back to be reused

Isn't that what the reuseMemory config already attempts to do?

> if the reuse of arraybuffers works for you across threads and you still see a big gc, then perhaps it's the opengl textures instead. you could write a similar reuse for them, it's even simpler since it stays in the same thread

Are you referring to reusing the actual textures? It looks to me like those textures are created during the init phase and then reused.

soliton4 commented 5 years ago

yes and yes. not sure if there are more ways to reduce gc

madiganz commented 5 years ago

One thing that may help is Uint8Array pooling.