chromeos / cros-libva

BSD 3-Clause "New" or "Revised" License

buffer: add some encoder buffer types #3

Closed bgrzesik closed 1 year ago

bgrzesik commented 1 year ago

Hi! :wave: Let's start introducing encoding structures to cros-libva. I diverged from the existing buffer types in a few places, namely the use of Option<> in EncSequenceParameterBufferH264. I would also appreciate any feedback on CodedBuffer<>. Please share your thoughts on those. Thanks!
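
To make that divergence concrete, here is a tiny hypothetical sketch (field names modeled loosely on the C struct's frame-cropping fields, not this PR's actual signature) of how Option<> can replace a present-flag plus the fields that only matter when it is set:

// Hypothetical illustration only; not the PR's actual types.
struct FrameCrop {
    bottom: u32,
    // ... left/right/top offsets elided ...
}

struct SequenceParams {
    frame_cropping_flag: u32,
    frame_crop_bottom_offset: u32,
    // ... remaining C-struct fields elided ...
}

// Taking Option<FrameCrop> lets the caller simply pass None instead of
// remembering to zero a flag plus the offsets that belong together with it.
fn new_sequence_params(crop: Option<FrameCrop>) -> SequenceParams {
    SequenceParams {
        frame_cropping_flag: crop.is_some() as u32,
        frame_crop_bottom_offset: crop.map_or(0, |c| c.bottom),
    }
}

fn main() {
    let params = new_sequence_params(Some(FrameCrop { bottom: 8 }));
    println!(
        "cropping flag = {}, bottom offset = {}",
        params.frame_cropping_flag, params.frame_crop_bottom_offset
    );
}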

dwlsalmeida commented 1 year ago

Hey @bgrzesik

Give me a couple of days to have a look at this; @Gnurou will probably chime in soon as well.

bgrzesik commented 1 year ago

Ideally, at some point you should provide a small demo to make sure that your design works. See:

@dwlsalmeida I just added a simple, mostly hardcoded demo based on your suggestion. PTAL :smile:

dwlsalmeida commented 1 year ago

@bgrzesik I am experimenting a bit with some minor changes in your code. Mainly, I don't want to expose VAEncBufferType. I also want to try uploading the data through Images instead of userptr. The only problem is that the output is not quite right yet.

Once I debug why that is broken, you can squash the changes and we can proceed with this; overall it looks good to me.

bgrzesik commented 1 year ago

@dwlsalmeida You're referring to VAEncCodedBufferType, right? Please let me know about your findings :smiley:

Also I added some of the buffers for HEVC, VP8 and VP9. I can also move them to a separate PR if you'd like.

dwlsalmeida commented 1 year ago

@bgrzesik Some changes:

Buffers of VAEncCodedBufferType cannot be created by clients directly anymore. They must be created from a PictureSync that has called upload_yuv in the past.

USERPTR was removed in favor of upload_yuv. This follows the examples in libva-utils and is basically the same thing that GStreamer does. Note that the previous code would only upload the YUV during surface creation, which would not work for pooled surfaces.
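
As a rough illustration of that rule (hypothetical, simplified types rather than the crate's real ones), a typestate makes the ordering a compile-time property:

// Typestate markers; cros-libva's picture states are richer than this.
struct PictureNew;
struct PictureSync;

struct CodedBuffer { size: usize }

// The state is carried as a zero-sized field, so each state is its own type.
struct Picture<S> { state: S }

impl Picture<PictureNew> {
    // Stand-in for the begin()/render()/end()/sync() chain; consuming `self`
    // is what forces the transition into the synced state.
    fn sync(self) -> Picture<PictureSync> {
        Picture { state: PictureSync }
    }
}

impl Picture<PictureSync> {
    // Only a synced picture can hand out a coded buffer, so the ordering is
    // enforced by the compiler rather than by a runtime check.
    fn create_coded_buffer(&self, size: usize) -> CodedBuffer {
        CodedBuffer { size }
    }
}

fn main() {
    let picture = Picture { state: PictureNew };
    let synced = picture.sync();
    let coded = synced.create_coded_buffer(4096);
    println!("allocated a coded buffer of {} bytes", coded.size);
}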

@Gnurou Alex, please note that f8dc08e reintroduces VAImage writeback. This was somehow removed by 36d76c1, but doing so breaks the vaGetImage/vaPutImage pair that is needed for encoding.

I also took the liberty to fix the fact that we could previously have an Image outlive its Surface (!!!) by introducing a &'a Surface field. It is also needed for writeback.
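
For clarity, a minimal self-contained sketch (stub types, not our actual definitions) of why the &'a Surface field prevents an Image from outliving its Surface:

struct Surface {
    id: u32,
}

struct Image<'a> {
    // Holding &'a Surface ties the Image to the Surface it maps, so the borrow
    // checker rejects any use of the Image after the Surface is gone. The same
    // borrow is what the writeback needs.
    surface: &'a Surface,
    data: Vec<u8>,
}

fn main() {
    let surface = Surface { id: 42 };
    let image = Image { surface: &surface, data: vec![0u8; 16] };
    // drop(surface); // uncommenting this fails to compile: `surface` is still borrowed below
    println!("image of surface {} maps {} bytes", image.surface.id, image.data.len());
}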

dwlsalmeida commented 1 year ago

@bgrzesik can you also add a CRC check to the H.264 encode demo, please? Your current code will pass even if the actual result is not right. In fact, that happened a few times for me :/

Also there are some clippy warnings.

Gnurou commented 1 year ago

Another idea that could simplify things quite a bit:

Right now we are creating the Picture from the source Surface, and then uploading the YUV data using an Image obtained from that picture (while also allocating coded_buffer). This is somehow required because currently we can only create Images from Pictures.

But technically, we should be able to create Images just from a Surface, and gaining this ability would let us upload the YUV data into the surface before passing it to the Picture constructor. That way we don't need to care about the state of the Picture since it doesn't exist yet.

It would also play more nicely with zero-copy encoding, since in that scenario Surfaces are created from a DMABuf and there is no upload step.

This also means we have no place to allocate coded_buffer anymore. :sweat_smile: But let me think some more about that one.
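
To make that concrete, here is a rough, self-contained sketch with stub types standing in for the real Surface/Image/Picture (hypothetical code, not the current API):

// Stub types; the real ones live in cros-libva.
struct Surface { data: Vec<u8> }
struct Image<'a> { surface: &'a mut Surface }
struct Picture<'a> { surface: &'a Surface }

impl Surface {
    // The proposed ability: derive an Image directly from a Surface, with no
    // Picture involved, so the upload can happen before the Picture exists.
    fn image(&mut self) -> Image<'_> {
        Image { surface: self }
    }
}

impl<'a> Image<'a> {
    fn write(&mut self, frame: &[u8]) {
        self.surface.data.clear();
        self.surface.data.extend_from_slice(frame);
    }
}

fn main() {
    let mut surface = Surface { data: Vec::new() };

    // 1. Fill the surface first (a DMABuf-backed surface would skip this step).
    surface.image().write(&[0u8; 32]);

    // 2. Only then hand the surface to the Picture constructor; the Picture's
    //    state never has to account for a pending upload.
    let picture = Picture { surface: &surface };
    println!("picture surface holds {} bytes", picture.surface.data.len());
}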

bgrzesik commented 1 year ago

@dwlsalmeida Thank you for your changes! Now I get what you meant there. However, I agree with @Gnurou regarding the additional field, the use of assert/unwrap, and DMABufs.

I can think of one more approach: modify the Picture definition and use generics to differentiate between pictures used for decoding and encoding. This way we could expose creating a CodedBuffer only during encoding and move the creation of VAEncCodedBufferType to the instantiation of Picture rather than upload_yuv.

Like so

pub struct Picture<S: PictureState, T, E> {
    inner: Box<PictureInner<T, E>>,
    phantom: std::marker::PhantomData<S>,
}

struct PictureInner<T, E> {
    // ...
    // `()` when decoding; the coded `Buffer` when encoding.
    buffer: E,
}

// Decoding pictures carry no extra buffer...
type DecodePicture<S: PictureState, T> = Picture<S, T, ()>;

// ...while encoding pictures own their coded buffer from construction.
type EncodePicture<S: PictureState, T> = Picture<S, T, Buffer>;
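
For illustration, that sketch could be extended like this (still hypothetical, reusing the crate's existing Buffer type and PictureState trait), so that coded-buffer access only exists on the encoding alias:

impl<S: PictureState, T> Picture<S, T, Buffer> {
    // Only pictures whose extra parameter is Buffer (i.e. EncodePicture)
    // get a coded-buffer accessor; DecodePicture has no such method.
    pub fn coded_buffer(&self) -> &Buffer {
        &self.inner.buffer
    }
}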

However, this seems a bit of a stretch.

@bgrzesik can you also add a CRC check to the H.264 encode demo, please? Your current code will pass even if the actual result is not right. In fact, that happened a few times for me :/

I am not sure we can rely on the encoder to always produce the same bitstream, especially across different devices, so I am afraid we would have to decode the slice to verify it. But I agree there is something wrong going on, because the size of the coded segment changes between runs. I wonder if some synchronization on the input frame is missing. I am planning to take a deeper look at this.

Thank you! :smile:

bgrzesik commented 1 year ago

Also one small nit: I think it would be clearer if we dropped referring to the image's raw data as YUV, since some encoders could accept RGB as well :sweat_smile:

dwlsalmeida commented 1 year ago

Another idea that could simplify things quite a bit:

Right now we are creating the Picture from the source Surface, and then uploading the YUV data using an Image obtained from that picture (while also allocating coded_buffer). This is somehow required because currently we can only create Images from Pictures.

But technically, we should be able to create Images just from a Surface, and gaining this ability would let us upload the YUV data into the surface before passing it to the Picture constructor. That way we don't need to care about the state of the Picture since it doesn't exist yet.

It would also play more nicely with zero-copy encoding, since in that scenario Surfaces are created from a DMABuf and there is no upload step.

This also means we have no place to allocate coded_buffer anymore. 😅 But let me think some more about that one.

How do we know whether:

a) we need to create and map a coded buffer
b) how do we make sure that somehow the client has inserted the raw data into the underlying surface
c) how can we make sure that CodedBuffers are only created after a sequence of begin(), render(), end() and sync()?

By the way, IMHO, I think this fix should be considered independent of this PR:

please note that [f8dc08e](https://github.com/chromeos/cros-libva/pull/3/commits/f8dc08ecda4083e7b9f5da07f620a5900e2d706e) reintroduces VAImage writeback. This was somehow removed by https://github.com/chromeos/cros-libva/commit/36d76c1c0504e153d3081d7538182d5c6778dce4, but doing so breaks the vaGetImage/vaPutImage pair that is needed for encoding.

I also took the liberty to fix the fact that we could previously have an Image outlive its Surface (!!!) by introducing a &'a Surface field. It is also needed for writeback.

Supporting vaGetImage() + vaPutImage() is a pretty key thing that we are currently missing.

bgrzesik commented 1 year ago

But I agree there is something wrong going on, because the size of the coded segment changes between runs. I wonder if some synchronization on the input frame is missing. I am planning to take a deeper look at this.

And so I did. I couldn't find any reason behind it. However, the issue is gone when using an Image to upload the data. Furthermore, when I switched to the VAEntrypointEncSlice entrypoint, it was also gone. So I guess we should stick with either.

Gnurou commented 1 year ago

b) how do we make sure that somehow the client has inserted the raw data into the underlying surface

This is the responsibility of the user: a surface could already have frame data associated with it (e.g. surfaces created from DMABufs), so ultimately it's up to the caller to create an image from their Surface if needed (we need to provide an API for that).

c) how can we make sure that CodedBuffers are only created after a sequence of begin(), render(), end() and sync()?

That's probably an API issue, we have the same problem for decoded frames when we decode a stream.

By the way, IMHO, I think this fix should be considered independent of this PR:

Agreed, let me come up with a PR containing your 2 CLs plus another one that allows creating Images from Surfaces; this should give a better basis for adding encoder support.

Gnurou commented 1 year ago

Posted the CLs directly - so now one can create Images from Surfaces, which should make the frame upload process simpler for encoding.

bgrzesik commented 1 year ago

Posted the CLs directly - so now one can create Images from Surfaces, which should make the frame upload process simpler for encoding.

Should I modify the demo code to take advantage of that new API? That would require adding a way to allocate the picture's coded buffer or modifying upload_yuv. Or should I leave it as is? :smile:

dwlsalmeida commented 1 year ago

@bgrzesik I am on vacation for 2 weeks and I don't have my laptop around; I can provide more feedback when I return.

Gnurou commented 1 year ago

Should I modify the demo code to take advantage of that new API? That would require adding a way to allocate the picture's coded buffer or modifying upload_yuv. Or should I leave it as is? 😄

I have pushed a change that removes upload_yuv and the coded_buffer member of Picture and replaces them with a regular upload through an Image. Feel free to rebase/reorganize the CL chain into a more coherent set as we are piling changes on top of each other.

As for how to manage the coded buffer's lifetime, I'd say let's just make the client responsible for it (i.e. basically what my change does). Worst case, if it gets destroyed too early, libva will return an error when we process the picture. I'll try to come up with a better solution, but for now that will do.
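
For concreteness, a small hypothetical sketch (stub types, not the real API) of that ownership rule: the client allocates the coded buffer and keeps it alive until it has read the encoded bytes back out of it.

// Stub types; in cros-libva the coded buffer would wrap a VAEncCodedBufferType
// buffer and mapping it would go through libva.
struct CodedBuffer { data: Vec<u8> }
struct Picture { output: Vec<u8> }

impl Picture {
    // The picture only borrows the buffer while "processing"; ownership and
    // lifetime management stay with the caller.
    fn process_into(&self, coded: &mut CodedBuffer) {
        coded.data.clear();
        coded.data.extend_from_slice(&self.output);
    }
}

fn main() {
    // The client allocates the coded buffer and is responsible for keeping it
    // alive until it has read the encoded bytes out of it.
    let mut coded = CodedBuffer { data: Vec::new() };
    let picture = Picture { output: vec![0u8; 128] };

    picture.process_into(&mut coded);
    println!("coded buffer now holds {} bytes", coded.data.len());

    // Destroying `coded` before processing would, in the real crate, show up
    // as a libva error when the picture is processed rather than a Rust error.
}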

bgrzesik commented 1 year ago

Feel free to rebase/reorganize the CL chain into a more coherent set as we are piling changes on top of each other.

True :smile: Done

dwlsalmeida commented 1 year ago

Hi @bgrzesik @Gnurou

I am back. This PR looks good (again thanks @bgrzesik ), I left only one small comment above

dwlsalmeida commented 1 year ago

I am fine with the latest changes as suggested by @Gnurou and implemented by @bgrzesik

Alex, this PR looks good to me. We should merge.

Gnurou commented 1 year ago

Merged! Thank you.