cisco / libsrtp

Library for SRTP (Secure Realtime Transport Protocol)
Other
1.19k stars 469 forks source link

stream_index #674

Closed jmillan closed 5 months ago

jmillan commented 6 months ago

Enable stream index for fast stream retrieval

Rationale

A linked list:

libsrtp does orders of magnitude more list traversal than additions or removals, hence a linked list is not a proper way to store them.

The solution in this PR is keeping a stream index in an array such as:

Details

Performance

In my tests I find this solution being >6 times more performant than current master branch.

master

Screenshot 2024-01-04 at 15 32 29

stream-index

Screenshot 2024-01-04 at 15 32 45

Others

IMHO the linked list may perfectly be removed in favour of the steam index, but as in this PR, they can perfectly coexist.

I'm letting libsrtp maintainers decide to adapt and apply this code to their will.

NOTE: This is currently enabled via a flag. I believe config files will require some work for the option to be properly added.

jmillan commented 5 months ago

This will still iterate over elements serially.

But has nothing to do in terms of performance for the reasons given in the description.

Previously, you had proposed a hashing method that I think would have subdivided the search space. Do you think that idea might still be useful along with this?

Yes, the hashing would go further in terms of speed for large environments. I did not even think of implementing it here as it was rejected in the first place in my previous PR.

Anyway the hashing could be added in a different PR to make things more clear.

What if we use this array approach here but preface it with a 256-member bucket of pointers.

This was a memory concern in my hashing PR, even though I created a hash with 32 buckets https://github.com/cisco/libsrtp/pull/659#issuecomment-1838649171

My opinion is that everything commented here can be done sequentially in the corresponding PRs:

pabuhler commented 5 months ago

@jmillan, hi, I am not sure if you noticed but in the main branch we have now started ABI breaking changes and will try to get a libSRTPv3 out some time this year, so if this is intended for 2.x then the PR should be based of and towards the 2_x_dev branch. I dont mind which you prefer, the difference is the main branch will be API unstable for a while so can readily make changes to existing AIP's etc, where as the 2_x_dev branch can not break compatibility.

I kind of agree with your actions points, but I think I would have preferred this to already be a separate srtp_list implementation rather than a mix, you could use the same config to select if this new list or the old list is complied in, what do you think?

I also think that adding the hash bucket would be a good idea, especially if we support dynamically growing the number of buckets and the size of the arrays.

jmillan commented 5 months ago

Hi,

@jmillan, hi, I am not sure if you noticed but in the main branch we have now started ABI breaking changes and will try to get a libSRTPv3 out some time this year, so if this is intended for 2.x then the PR should be based of and towards the 2_x_dev branch.

I don't have a big preference, honestly.

I think I would have preferred this to already be a separate srtp_list implementation rather than a mix, you could use the same config to select if this new list or the old list is complied in, what do you think?

Sounds good. I've just pushed this.

I also think that adding the hash bucket would be a good idea, especially if we support dynamically growing the number of buckets and the size of the arrays.

A hash has a fixed number of buckets unless we want to move entries around when incrementing them, which we don't.

jmillan commented 5 months ago

@pabuhler

This is the current state:

jmillan commented 5 months ago

@paulej, we are on time to remove the old list. We can perfectly remove it now and get rid of the configuration option.

Otherwise I would like to ask anyone to check whether the new configuration option is properly added and applied for all needed cases.

pabuhler commented 5 months ago

@paulej, we are on time to remove the old list. We can perfectly remove it now and get rid of the configuration option.

Otherwise I would like to ask anyone to check whether the new configuration option is properly added and applied for all needed cases.

If we a merging straight in to main then maybe just drop the configuration and remove the old list code. If we remove the old code then should also remove the next / prev pointers in the strp_stream_ctxt struct ?

If you keep the config then we will need to add it to cmake as well and should have a ci build that uses the config with at least one of the build systems. Also the config for autotools, ie "configure" should have been added to configure.ac, the configure script is generated from that. I would have also renamed the define to start with SRTP_ .

Just removing the old code is by far the least work :)

jmillan commented 5 months ago

Just removing the old code is by far the least work :)

Done :-)

jmillan commented 5 months ago

Fuzzer fails. I'll check it.

pabuhler commented 5 months ago

Fuzzer fails. I'll check it.

ok, do you get any where ? been a while since I have used the fuzzer locally, but I know we might need to update it if we are changing API's etc.

jmillan commented 5 months ago

ok, do you get any where ? been a while since I have used the fuzzer locally, but I know we might need to update it if we are changing API's etc.

I can reproduce it. This PR does not change the API so it may be something introduced here. I'll check into it as soon as I can.

jmillan commented 5 months ago

The first problem I'm seeing is that two stream list entries, and hence two streams, with the same SSRCs are created:

Screenshot 2024-01-11 at 16 30 02

Is this even legal? Should libsrtp have 2 streams with the same ssrc? Can anyone please confirm this?

master branch does nothing to avoid that and neither does this branch.

I would say that should never happen and hence srtp_insert_or_dealloc_stream() or srtp_stream_list_insert() should check if there is already a stream with such ssrc rather than blindly insert it.

jmillan commented 5 months ago

I've enhanced srtp_stream_list_for_each() so it does not consider ssrc in order to increase the loop iterator but rely on the list availability instead. This is needed because this method is used to iterate and remove the list entries.

The fuzzer should not fail now (It does not fail in my local env at least). But the previous question still needs to be answered, whether having multiple entries with the same SSRC should be permitted (IMHO NOT, but I need confirmation). This is something that master allows now BTW.

pabuhler commented 5 months ago

The fuzzer should not fail now (It does not fail in my local env at least). But the previous question still needs to be answered, whether having multiple entries with the same SSRC should be permitted (IMHO NOT, but I need confirmation). This is something that master allows now BTW.

I thought it was documented as undefined behavior, otherwise a full traversal of the list would be required for each insert. So no it is not allowed. In which case the fuzzer might need updating. see: https://github.com/cisco/libsrtp/blob/fcc2517479dcb162a24da83740c156de5920f6c9/include/stream_list_priv.h#L88

jmillan commented 5 months ago

The fuzzer should not fail now (It does not fail in my local env at least). But the previous question still needs to be answered, whether having multiple entries with the same SSRC should be permitted (IMHO NOT, but I need confirmation). This is something that master allows now BTW.

I thought it was documented as undefined behavior, otherwise a full traversal of the list would be required for each insert. So no it is not allowed. In which case the fuzzer might need updating. see: https://github.com/cisco/libsrtp/blob/fcc2517479dcb162a24da83740c156de5920f6c9/include/stream_list_priv.h#L88

OK , yes, makes sense. Then we are good πŸ‘

pabuhler commented 5 months ago

If you are ready then I will merge, and thanks for your effort and patience :)

jmillan commented 5 months ago

If you are ready then I will merge, and thanks for your effort and patience :)

πŸ‘let me just fix the comment and make a couple more manual tests tomorrow.

jmillan commented 5 months ago

@pabuhler,

Ready to merge πŸ‘

saghul commented 5 months ago

πŸ™Œ

jmillan commented 5 months ago

@jmillan, hi, I am not sure if you noticed but in the main branch we have now started ABI breaking changes and will try to get a libSRTPv3 out some time this year, so if this is intended for 2.x then the PR should be based of and towards the 2_x_dev branch.

Considering this branch has been merged into main, this will be part of v3, right? Is main still considered stable?

Do you think it makes sense to merge this branch into 2_x_dev too?

BTW, I've noticed the branch has been merged without rebasing all commits into a single one. I would recommend doing so, you can force it from GH.

pabuhler commented 5 months ago

Yes this will be part of v3, it is possible to merged into the 2_x_dev branch if required. There is the option of replacing the list in 2.5 so I do not think it is critical.

main is not API stable for the foreseeable future but in terms of functionality and performance it should remain stable.

In general I think it is up the the PR author to decide to rebase / squash before merging but no hard and fast rules around that. In this case I agree that squashing it down would have probably been a good idea and I can take some blame for not suggesting it. In the end the diff was quite small and I forgot to look at commit history.

jmillan commented 5 months ago

I was about to use the current 'main' branch in our project in order to get this enhancement. We are using meson, and for that we need to point the github .zip file for the exact commit or release of the subproject (libsrtp in this case). The problem is that github does not guarantee that the .zip file hash will remain unchanged for non release commits.

Having said this, we should point to a release hash rather than to a non release one. The question is then, when are you planning to release 2.5? If this will happen sooner than later then it could be interesting having this into 2.5 if possible.

pabuhler commented 5 months ago

@jmillan this was why I asked where you intended to push it :) . I think we will try to release 2.6 during February as it is a year since last time and hopefully the last 2_x release. This change can be cherry picked on to the 2_x_dev branch with a few tweaks, the question might come up if it should be behind config or not as there is more risk in 2_x .

jmillan commented 5 months ago

Update: We have forked libsrtp, created a release from the current main and started using it in mediasoup. This way we can start using this enhancement now and consequently verify its stability.

pabuhler commented 5 months ago

@jmillan that sounds like a good idea, look forward to hearing any feedback and hopefully we get to a v3 release soon.

jmillan commented 4 months ago

This change can be cherry picked on to the 2_x_dev branch with a few tweaks, the question might come up if it should be behind config or not as there is more risk in 2_x .>

mediasoup has been using this enhancement for more than a month and it's been working flawlessly.

NOTE: I don't see a reason for this to be behind a config.

jmillan commented 4 months ago

I wonder if we should remain in main or move to 2_x, once this enhancement is moved. Do you have any consideration on that regard?

pabuhler commented 4 months ago

@jmillan the API in main is in a state of flux at the moment so that alone is a reason to not pull from there until a v3 is closer to release.

Personally I prefer to spend time on trying to get v3 out, rather than porting this change to the 2_x_dev branch, but if someone creates a PR I will help review and get it merged.