Add support for ring buffer mode

scabot commented 8 years ago

… the user to allocate a buffer (zero copy). Add the total_size of a buffer to ImxVpuFrameBuffer

There are two changes in this commit: The first and easy one is just adding the total_size to ImxVpuFrameBuffer. We need this internally to wrap the these buffers int our own memory allocator. Others may need this as well.

The second more interesting change is to allow directly calling a write style function instead of asking the user to supply an output buffer using acquire_output_buffer and releasing it using finish_output_buffer. This enables several advantages:

No need for memory allocation while encoding or managing pre-allocated buffers
Zero copy (or at least let the user decide what type of copy she desires)
A possibility to directly write to a file descriptor (file,pipe,socket...)

scabot commented 8 years ago

After adding the ability to call a user function to write the output data without copying I went the extra mile and enabled proper support for ring buffer mode which allows for low latency streaming

dv1 commented 8 years ago

Thanks for this. These changes do look very interesting. The ring buffer mode and callback-style encoding have been on my list of things to experiment with, but you took care of that :)

Unfortunately I was very busy over the last two weeks, so there was no time to check them out. I'll do that over the next few days. That said, please ensure that you use tabs and not spaces, otherwise the indentation is mixed up like it is now. I can correct it this time, it is just something to keep in mind for the next time.

scabot commented 8 years ago

Thank you for testing this I committed some more changes to the way we wait for the frame to complete and separated the logic for full frame from the logic for ring buffer. This commit tries to balance between latency and the overhead of the syscalls issued by vpu_waitForInt. Also tried to fix the tab/space thing. Hope I didn't make things worse. BTW. I will take this opportunity to thank you for all the work you put into this library.

otavio commented 8 years ago

News on this pull request? @dv1 ?

dv1 commented 8 years ago

There are still issues with tabs & spaces (see the comments, where spaces should be used for example), but I can fix these. Commit looks OK overall. I'll merge it in a few days after fixing the remaining whitespace issues & testing it.

otavio commented 8 years ago

@scabot could you rework the commits fixing the issues spotted by @dv1 ? This would allow him to proceed with the testing and merge..

dv1 commented 8 years ago

General rule is to use tabs for indentation and spaces for alignment. And, inside comments, I pretty much never use tabs.

scabot commented 8 years ago

Ok. So I fixed the spaces and tabs (I hope). But pushing this fix also pushed the latest changes I did to support rotation and mirroring. I didn't want to push these changes yet since there is a bug with rotating 90 and 270 degrees. I am not sure this is my bug it may be a VPU bug but I didn't have time to check it further. The nature of this bug is that if you try to rotate 90 or 270 degrees the image will have a part which is encoded as a green block (artifact) Rotating 180 degrees or mirroring works at least for h264 encoding. Please note that for rotating VPU requires that the rotation angle be supplied before the encoder is created. Mirroring can be specified per frame. I hope all these changes are not too much

dv1 commented 8 years ago

Thanks, I will review this today.

I am inclined to leave out the rotation parts, since the VPU documentation recommends against using the VPU rotator. In any case, if it is introduced, it should be a separate followup commit. But do not worry about this for now - I have set up everything so I can review and merge here, so I'll take care of that.

dv1 commented 8 years ago

Okay, this will take longer than expected. I see some potential issues with the code, but have to check further. Since I will be busy with other stuff, I expect this to be done early next week.

dv1 commented 8 years ago

First patch reworked and pushed. Note that the API is different to yours. Most notably, the write callback isn't passed a PTS-DTS pair, but a pointer to the encoded frame instead, which is more generic.

As for the streaming mode, I wonder if it is really necessary to manually keep track of the GOP. Isn't information about the stream available earlier? For example, ideally, right before the first writeout we could call vpu_EncGetOutputInfo() and get the necessary info to decide whether or not a header has to be inserted first.

scabot commented 8 years ago

I am confused. Isn't vpu_DecGetOutputInfo only applicable to decoding? But even if you meant EncGetOutputInfo - It should be called once the frame is done and the resulting structure doesn't contain any information for you to use. If I understand correctly we need to insert headers before an I frame so it doesn't help knowing that the encoded frame was an I frame we need to know before hand that the encoded frame is going to be an I frame. But maybe I am missing something?

dv1 commented 8 years ago

Yes, indeed. My hope was that vpu_EncGetOutputInfo could be called before already. It isn't explicitely disallowed in the docs, but it does appear like undefined behavior.

scabot commented 8 years ago

I actually think the documentation is clear about when to call it:

From the docs:

This function gives the information about the encoding output such as the picture type, the address and size of the generated bitstream, the number of generated slices, the end addresses of the slices, and the macroblock bit position information. The host application should call this function after frame encoding is complete and before starting further processing.

(The bold emphasis is mine)

On the other hand, except for MJPEG where headers should be inserted before each frame, it may be legal to push headers for H264 at arbitrary positions in the bit stream and not only before I frames. I am not sure about this and unfortunately I don't have time to check.

dv1 commented 8 years ago

However, it says "should" - and also does not describe what happenes if you call this function before the frame is fully encoded. Nevertheless, I won't rely on undocumented behavior and use your approach.

scabot commented 8 years ago

I started going over the code. I just want to make sure, right now you committed only the changes that enable the write style callback? You still didn't commit the ringbuffer stuff? Thanks

dv1 commented 8 years ago

Yes, that's what I meant by "first patch". The ringbuffer stuff comes next and needs more work. I am trying out some ideas I've had for quite a while.

scabot commented 8 years ago

Thanx

dv1 commented 8 years ago

Unfortunately I am very busy with other projects again. Several days ago I split up your commits and pushed them. The changes are now in, except for the last one (where streamed encoding via ringbuffer is introduced). I observed very strange h.264 encoding behavior when the ringbuffer mode was on. A stray 0x00000001 Annex.B start code is inserted into the bitstream for unclear reasons. Also, the header insertion needs to be done differently.

Also, in ring buffer mode, the header generation code will not work properly, since the VPU will also use the ring buffer mode for the vpu_EncGiveCommand() calls that insert header data, meaning that if the write pointer is not at the start, then old header data will be preserved.

I think that in ring buffer mode, it is not even necessary to generate headers at the start of the encoding. Since they are just added to the ringbuffer at the write pointer position, we can just call vpu_EncGiveCommand() during encoding to insert the headers. In fact, the whole idea with the imx_vpu_enc_generate_header_data() function was that in line buffer mode, vpu_EncGiveCommand() doesn't behave this way..

Still, even if I fix the header generation, there is this odd stray start code. Perhaps a VPU bug?

scabot commented 8 years ago

Hi

I am quite busy myself.

I may be mistaken but no where in the code we actually use the vpu_EncGiveCommand to insert header data except in when initiating the encoder. We generate the header data only once at the beginning and the copy it or write it without regenerating. Am I missing something? Did you actually see a malformed stream? Maybe this is the bug that we don't actually generate headers for each I frame?

On Mon, Aug 15, 2016 at 10:24 PM, dv1 notifications@github.com wrote:

Unfortunately I am very busy with other projects again. Several days ago I split up your commits and pushed them. The changes are now in, except for the last one (where streamed encoding via ringbuffer is introduced). I observed very strange h.264 encoding behavior when the ringbuffer mode was on. A stray 0x00000001 Annex.B start code is inserted into the bitstream for unclear reasons. Also, the header insertion needs to be done differently.

Also, in ring buffer mode, the header generation code will not work properly, since the VPU will also use the ring buffer mode for the vpu_EncGiveCommand() calls that insert header data, meaning that if the write pointer is not at the start, then old header data will be preserved.

I think that in ring buffer mode, it is not even necessary to generate headers at the start of the encoding. Since they are just added to the ringbuffer at the write pointer position, we can just call vpu_EncGiveCommand() during encoding to insert the headers. In fact, the whole idea with the imx_vpu_enc_generate_header_data() function was that in line buffer mode, vpu_EncGiveCommand() doesn't behave this way..

Still, even if I fix the header generation, there is this odd stray start code. Perhaps a VPU bug?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/Freescale/libimxvpuapi/pull/13#issuecomment-239902112, or mute the thread https://github.com/notifications/unsubscribe-auth/AAmzViOgqxrIgrgUK2yZYMKh7HLkYi5hks5qgLzwgaJpZM4Il5Vi .

dv1 commented 8 years ago

Finally got some time to investigate more. It seems that in ringbuffer mode, the first PPS NALU is truncated - only its AUD start code is left. This starts to look like a VPU bug to me. For this reason, I won't include this feature in the 0.10.3 release, and reserve it for the release after that.

otavio commented 7 years ago

@dv1 how does the review of this going? it seems it does not have a release for a while...

dv1 commented 6 years ago

I am updating this for the imx8 support. As part of that, I am looking into ringbuffer support as well.

angolini commented 5 years ago

What is the status of this PR?

Maybe we can close it and reopen in future if needed. What do you think?

Freescale / libimxvpuapi

Add support for ring buffer mode #13