pion / rtp

A Go implementation of RTP
https://pion.ly/
MIT License
349 stars 111 forks source link

AV1 samplebuilder support (WIP) #264

Open jech opened 5 months ago

jech commented 5 months ago

An attempt to implement the required hooks for the samplebuilder to work with AV1. This approach is likely to be both faster and simpler than the one in #238.

COMPLETELY UNTESTED, DO NOT COMMIT.

alexpokotilo commented 5 months ago

1) func (p AV1Packet) IsPartitionHead(payload []byte) bool is called before func (p AV1Packet) Unmarshal(payload []byte) ([]byte, error) { and hence you cannot use !p.Z && p.N and should use something like

if len(payload) == 0 {
    return false
}
return (payload[0] & byte(0b10000000)) == 0

I was not able to test further as IsPartitionHead always false

jech commented 5 months ago

Fixed. However, as I've mentioned, this is completely untested, and I'd be very surprised if it worked. The question is whether you agree that this could work in principle, since it is both more efficient and simpler than what you propose.

codecov[bot] commented 5 months ago

Codecov Report

Attention: Patch coverage is 0% with 31 lines in your changes are missing coverage. Please review.

Project coverage is 84.70%. Comparing base (74a9dc7) to head (89c2b75).

Files Patch % Lines
codecs/av1_packet.go 0.00% 31 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #264 +/- ## ========================================== - Coverage 86.21% 84.70% -1.52% ========================================== Files 22 22 Lines 1734 1765 +31 ========================================== Hits 1495 1495 - Misses 203 234 +31 Partials 36 36 ``` | [Flag](https://app.codecov.io/gh/pion/rtp/pull/264/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pion) | Coverage Δ | | |---|---|---| | [go](https://app.codecov.io/gh/pion/rtp/pull/264/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pion) | `84.70% <0.00%> (-1.52%)` | :arrow_down: | | [wasm](https://app.codecov.io/gh/pion/rtp/pull/264/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pion) | `84.24% <0.00%> (-1.51%)` | :arrow_down: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pion#carryforward-flags-in-the-pull-request-comment) to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

alexpokotilo commented 5 months ago

Fixed. However, as I've mentioned, this is completely untested, and I'd be very surprised if it worked. The question is whether you agree that this could work in principle, since it is both more efficient and simpler than what you propose.

AppendToSample implementation is wrong. There are maybe several OBUS in AV1Packet and AppendToSample should run the similar loop as func (p *AV1Packet) Unmarshal

But there is much simplier approach much better than my and yours: just return

OBULen+OBU list right from Unmarshal I checked and nobody cares about func (p *AV1Packet) Unmarshal. Who use AV1Packet without samplebuilder, they use AV1Packet for it's OBUelements map. They don't care about payload format at all.

All my fix was to workaround current useless av1 format. your AppendToSample introduce to keep current format but to produce right format after Unmarshal. Shouldn't we just return all we need from Unmarshal. Yes, we need to allocate for sample but your implementation do the same, but for only one OBU. This is wrong. There maybe multiple OBUs in Av1Packet. I'd bet to return correct av1 bitstreama directly from Unmarshal

jech commented 5 months ago

AppendToSample implementation is wrong.

Interesting.

There are maybe several OBUS in AV1Packet

That's right, and my implementation builds a sample as a sequence of LE128-prefixed OBUs. Are you suggesting a different representation?

I checked and nobody cares about func (p *AV1Packet) Unmarshal.

https://github.com/jech/galene/blob/master/codecs/codecs.go#L48

This code does the equivalent of AV1Packer.Unmarshal in an efficient manner. I had to write it because Unmarshal was written by people who don't care about efficiency.

Shouldn't we just return all we need from Unmarshal.

Unmarshal is a low-leve function that should run in constant time and perform no allocations. Check the H264 version of Unmarshal, it's done right, it doesn't try to parse the list of NALUs. Please also read the comments in https://github.com/pion/rtp/pull/265 and https://github.com/pion/rtp/pull/266.

jech commented 5 months ago

@alexpokotilo Another point is that you cannot reassemble fragmented OBUs in Unmarshal, since the packets haven't been reordered at this point: it's the samplebuilder's (or the jitter buffer's) job to reorder packets. So there's really no alternative to doing the parsing later than at Unmarshal time.

alexpokotilo commented 5 months ago

AppendToSample implementation is wrong.

Interesting.

There are maybe several OBUS in AV1Packet

That's right, and my implementation builds a sample as a sequence of LE128-prefixed OBUs. Are you suggesting a different representation?

@jech, maybe my problem is that I see AppendToSample implementation but don't see its usage in your branch. Could you please finish your version so I can just use it and check it really support all cases. Now I see a part of job. I don't understand how AppendToSample do the same OBUs processing as Unmarshal because you didn't add AppendToSample call to sample builder. If you call it in loop till all data is processed then ok, I got the point. If you call it without loop like you pointed in your https://github.com/pion/rtp/pull/238#issuecomment-2056512863 example I don't get the point then. Again speaking about remained fragment of last OBU. I don't see how you are going to process it.

Long story short. You are right and my version is not ideal. I don't like it either. When I did it I tried to close the gap in AV1Packet implementation. I had to use AV1Frame to correctly process last remained OBU as AV1Packet drops OBUElements on unmarshal and don't save last not complete OBU. For example H264Packet save fragmented nal for further processing. AV1Packet doesn't do that so your AppendToSample will not work.

From performance pov it doesn't matter where you process OBUs: in unmarshal method or in AppendToSample right next to it. I can see AV1Packet is used only in samplebuffer and some tests. if we use AppendToSample or return processed data from unmarshal the total performance will be the same. You will not be able to eliminate allocation in unmarshal if you add fragmented obu support there. Take a look H264Packet. If you have fuaBuffer from previous packet then "p.fuaBuffer = append(p.fuaBuffer, payload[fuaHeaderSize:]...)" performaed. and this is allocation.

Now AV1Packet just drop OBUElements on every unmarshal. it doesn't care about previous fragments. I had to use AV1Frame to combine OBUs from AV1Packets as AV1Packet doesn't care about it.

What you propose is :

Don't get me wrong please. I admit my initial approach is wrong and I'm starving to test your version but it doesn't exist.

I still don't see a problem in adding tail support into AV1Packets the same way H264Packet do and return necessary format from it without adding new interface AppendToSample. at the end of the day you will have to allocate somewhere. Why not in unmarshal ? What so special in this method so that you don't want to do the job there? these Packet structures are used mainly in samplebuilder. You don't add any gain moving processing in new method AppendToSample.

If you complain about unmarshal-only approach only from design POV - I'm sorry I don't completely confident to speak from design POV. I'm not Packet approach designer. But If you think your approach gain something to samplebuffer users - please complete it and I will create performance test to check where you are right or not.

I think it's time to complete AppendToSample so I can do tests and check whether it works or not. Now I cannot and don't see how I can help to prove your points or argue with them.

My main problem with AppendToSample approach is that I don't get what is so special with av1 so we need to handle it different way from h264? do we have tails in h264? yes, we do. How we process them? in unmarshal. Do we do allocation in h264's unmarshals? Yes, in case we have previous tail. Does h264' unmarshal return ready to append buffer? Yes, it is.

You want to have lightweight header parser without all body processing - go ahead but this is a different story. I don't see how you improve performance by moving processing into AppendToSample and call it right after unmarshal. What samplebuffer users gain from this ?

Do I agree that AV1Packet made in different way than others Packets? Yes. Should we add tail support in it to not rely on Av1Frame for assembling? 100% Should we add AppendToSample to offload unmarshal? For what ? Should we remove OBUElements processing? Well yes, but Av1Frame users will not be happy right now. this is legasy, We can add flag like @Sean-Der proposed in one of your pulls.

This is all I can say about it.

alexpokotilo commented 5 months ago

@alexpokotilo Another point is that you cannot reassemble fragmented OBUs in Unmarshal, since the packets haven't been reordered at this point: it's the samplebuilder's (or the jitter buffer's) job to reorder packets. So there's really no alternative to doing the parsing later than at Unmarshal time.

Please look at H264Packet. it does process tails from previous unmarshal call relying on the fact that samplebuffer call unmarshal in correct order. This is what I see in the code at least. Check func (p *H264Packet) Unmarshal(payload []byte) ([]byte, error) and it's p.fuaBuffer processing. If unmarshal didn't rely on packet order this approach would not work for h264.

jech commented 5 months ago

Please look at H264Packet. it does process tails from previous unmarshal call relying on the fact that samplebuffer call unmarshal in correct order.

Yes, and that's wrong, for at least three reasons:

Alex, let us please do the right thing in AV1, and let's fix H.264. I've started here:

at the end of the day you will have to allocate somewhere. Why not in unmarshal ? What so special in this method

Because Unmarshal is not optional: if you want to work with a packet, you must call Unmarshal. Any other methods are optional, you only call them if you need the functionality.

Pion is designed to be suitable for many different applications, from full-featured multimedia applications (that do encoding, decoding and so on) to fast media servers (that want to push massive amounts of packets around while paying as little overhead as possible. We must endeavour to make sure that Pion remains widely usable, and doesn't become optimised for one specific class of applications.

alexpokotilo commented 5 months ago

Please look at H264Packet. it does process tails from previous unmarshal call relying on the fact that samplebuffer call unmarshal in correct order.

Yes, and that's wrong, for at least three reasons:

* people expect Unmarshal and Marshal should be (at least logically) inverses; if you defragment at Unmarshal time, you loose this property, which causes subtle bugs;

* there is no place where it is documented that Unmarshal must be colled in sequential orde (the VP8 and VP9 formats don't require it), so the requirement causes subtle bugs;

* there are applications that want to call Unmarshal just to check header flags, these applications do not want to pay the whole price of parsing the OBU or NALU list.

So maybe add Another method to help these application and keep current interface to others ? I respect your passion and willing to change things you consider not right. I just don't see anything wrong with the way it is for h264 and others. I vote to fix AV1Packet. If you can convince maintainers then you are right. I don't see any reason for this. But I could be wrong

lebedyncrs commented 5 days ago

@jech any plans to continue on this one? If not can you please sum up what needs to be done? maybe somebody take over

jech commented 5 days ago

Yes, I'm planning to work on it. I'm actually planning a fairly comprehensive rework of the packet code:

Unfortunately, I won't be able to work on that before the end of October.

lebedyncrs commented 4 days ago

Yes, I'm planning to work on it. I'm actually planning a fairly comprehensive rework of the packet code:

  • add a new interface, so that the zero-alloc mode has feature-parity with the (currently default) wasteful mode;
  • modify the sample-builder to work with the zero-alloc mode;
  • add support for the new interface to the AV1 codec (zero-alloc mode only, since I'm planning to deprecate the wasteful mode);
  • convince Sean to make zero-alloc the default (incompatible change).

Unfortunately, I won't be able to work on that before the end of October.

Sounds exciting, please let me know if you would need help with testing. Does this PR is going to solve https://github.com/pion/rtp/issues/273 ?