Open dwlsalmeida opened 1 year ago
IIRC the point of using AsRef<[u8]>
was to allow any input type that can provide a reference to the stream as input. Switching to Cow
would mean that the data now must be in a Vec<u8>
, which is probably going to be the case 99% of the time indeed, but might force us to perform a copy if that condition is not met.
Would using Cow
allow us to avoid a copies in cases where AsRef<[u8]>
makes us do one? That would be the case for using Cow
in the code. Otherwise I suspect we can remove all the generics if we force NaluReader
to with with a &[u8]
and remove its own generic. Since all NaluReader
s are build and used locally, I think that should work. https://github.com/chromeos/cros-codecs/pull/51 does this, is that all the simplification you had in mind?
@Gnurou Hi Alex!
By switching to Cow, we remove the T as you did in #51, but users can switch to an owned variant of the slice/frame/obu etc through to_owned
if they so desire by paying for the copy.
Note that the input is still &[u8], so they can still use the borrowed versions cost free, which is what we currently do anyways.
This will also remove this <[u8]>
from many parts of the code such as
/// Called to dispatch a decode operation to the backend.
#[allow(clippy::too_many_arguments)]
fn decode_slice(
&mut self,
picture: &mut Self::Picture,
slice: &Slice<&[u8]>, <-----------------------
sps: &Sps,
pps: &Pps,
dpb: &Dpb<Self::Handle>,
ref_pic_list0: &[DpbEntry<Self::Handle>],
ref_pic_list1: &[DpbEntry<Self::Handle>],
) -> StatelessBackendResult<()>;
To summarize:
a) we keep the number of copies equal b) we provide a convenience for a price (as the parsers can be used by any other project, in theory) c) we clean up our code
Thanks for the clarifications. A few remarks:
The current AV1 parser does not need to use Cow
- i.e. line 107 that you quoted can just as well be
pub data: &'a [u8],
And the code builds just as fine. Users that need their own version of the data can use e.g. Vec
's From<&[u8]>
to achieve the same effect. Cow
is typically used in scenarios where you might need to change some borrowed data (and thus create your own copy) at runtime. This is not the scenario we are facing with the parsers, where the data is always static.
That being said, the base idea is still valid and we should consider using it in other parsers - only we probably just need to replace AsRef<[u8]>
by &'a [u8]
.
Both Cow<'a, [u8]>
and &'a [u8]
introduce a lifetime, so we would need to make sure that this works well with the existing parsers, but if AV1 can accomodate it I suppose the others could too.
So I tried to implement this a bit and I think I understand what you want - I have updated PR #51 to include the change to &'a [u8]
. Please let me know if that's what you had in mind.
Introducing Cow<'a, [u8]>
would not simplify things any further IMHO since the lifetime would still be there anyway.
@Gnurou in terms of removing the generics, that was mostly what I had in mind, yes. As for Cow:
The current AV1 parser does not need to use Cow
Oh it doesn't need to, yes. It's just an added convenience for users. I explain better below.
Users that need their own version of the data can use e.g. Vec's From<&[u8]> to achieve the same effect.
Not sure I understand. For example:
pub struct Frame<'a> {
/// The bitstream data for this frame.
bitstream: &'a [u8],
You can't have an owned version of Frame
. Even Using Vec::From<&[u8]>
would not help, because
a) that will create an owned version of the bitstream, not an owned version of Frame
.
b) the resulting Vec
can contain data for multiple frames.
The above assumes that you mean using Vec::from
passing in bitstream
here:
/// Parse a single frame from the chunk in `data`.
pub fn parse_frame<'a>(&mut self, bitstream: &'a [u8]) -> anyhow::Result<Frame<'a>> {
Introducing Cow<'a, [u8]> would not simplify things any further IMHO since the lifetime would still be there anyway.
True you can't get rid of lifetimes, but you can get owned Frames/Slices/Tiles etc with Cow if you want through to_owned()
.
To make my point a bit more clear (sorry for being a bit verbose here)
let frame: Cow<'static, u8> = Frame {
bitstream: frame.bitstream.to_owned(),
..frame
};
With the design I introduced in the AV1 parser, this will also only clone the part of the bitstream that generated the parsed Frame/slice/tile/etc, as we subslice the bitstream when creating the Cow:
Ok(Some(Obu {
header,
data: Cow::from(&data[..start_offset + obu_size]),
start_offset,
size: obu_size,
}))
}
Mmm one thing I don't quite understand is, since the lifetime remain (and thus an owned Frame
will be subject the the same lifetime as a borrowed one), and we don't modify the bitstream anyway, what is the point of making a copy of the bitstream data?
@Gnurou
since the lifetime remain (and thus an owned Frame will be subject the the same lifetime as a borrowed one)
Correct me if I am wrong, but if you call to_owned
, you get Frame<'static>
, so no.
Playground link to corroborate my point above: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=950ef57fd86788da7a22901353696124
and we don't modify the bitstream anyway, what is the point of making a copy of the bitstream data?
Our code still creates a borrowed version e.g.:
data: Cow::from(&data[..start_offset + obu_size]),
So no copies are being made. Again, a mere convenience if anyone wants to have an owned Frame/slice/tile, at the cost of copying the data. Frankly, #51 also works, because it does away with the generic parameter already. If you want to keep it at that (i.e.: if you feel Cow is not worth the trouble here), then I am also OK with that.
One thing that comes to mind is this excerpt from the h.265 VAAPI code:
#[derive(Default)]
pub struct BackendData {
// We are always one slice behind, so that we can mark the last one in
// submit_picture()
last_slice: Option<(
SliceParameterBufferHEVC,
Option<SliceParameterBufferHEVCRext>,
Vec<u8>, <---------
)>,
va_references: [PictureHEVC; 15],
}
Look how we have to keep a Vec
let slice_data = Vec::from(slice.nalu().as_ref());
self.replace_last_slice(slice_param, slice_param_ext, slice_data);
Which, again, stems from our inability to have an owned Slice type. I think that would benefit from the Cow approach, for example.
I am experimenting with a new design for the AV1 code. One that uses
Cow<'_, [u8]>
in place ofT: AsRef<[u8]>
. This means that we can remove thisT: AsRef<[u8]>
noise from a lot of functions and traits, while making it possible to still buildSlices
,Frames
and etc out of both borrowed and owned data without copying.What's more,
Cow<'_, [u8]>
dereferences to&[u8]
so nothing fundamentally changes in the code as far as I can see. Not only that, but we de facto do not even support anything other than&[u8]
in the decoders, as introducingT
in our backend traits would make them not object safe. UsingCow<'_, [u8]>
circumvents this problem, as there's no generics involved and lifetimes do not break object safety.If the current AV1 design proves successful, we should maybe backport these changes to the other codecs as a way to clean up the codebase by a considerable amount.