[BUG] ARP sends packets over the wrong network interfaces.

evpopov commented 9 months ago

Describe the bug When multiple interfaces and endpoints are present, ARP leaks out one NetIf's IPv4 addresses over the other NetIfs.

Target

ALL

To Reproduce ARP_Leakage Check out "before.pcapng" in the attached zip file The DUT send out Gratuitous ARP's which in the capture look like ARP requests. the protocol addresses for these ARP requests are 10.238.238.10 and 192.168.21.47. Those last two addresses belong to the Fake0 interface and should not finding their way to different interfaces.

Expected behavior After a quick fix, "after.pcapng" from the .zip file shows the resulting ARP packets. I'm pretty sure this should be the correct behavior. where we only see ARP packets associated with 172.30.30.26

Wireshark logs ARP_Leakage.zip

I have a fix for this and was planning on starting a PR. Before I do that though, I'd like to have a discussion on the issue. Here are my thoughts and questions:

Do you agree that this is a bug. I strongly believe it is. I have never seen a host do this and I can't come up with even a theoretical reason for this behavior... but I have been wrong many times before, so please let me know if you think I'm missing something.
The offending code is FreeRTOS_OutputARPRequest() which sends out an ARP request on all available IPv4 endpoints. My first quick and dirty fix was to use FreeRTOS_FindEndPointOnNetMask() inside FreeRTOS_OutputARPRequest() and only send to the found end-point if there was one. This approach removes a for(;;) loop and one if() clause from FreeRTOS_FindEndPointOnNetMask() making it substantially cleaner, but I think there might be an even better fix...
Another solution ( and I think that is the better one ) would be to add a NetworkEndPoint_t * pxEndPoint parameter to FreeRTOS_OutputARPRequest() I need to look deeper into this solution, but here's my reasoning: FreeRTOS_OutputARPRequest() has one very specific function and almost every function that calls it already know which endpoint must be used. I believe FreeRTOS_OutputARPRequest() should not be burdened with searching for end-points. I have looked at all calls to FreeRTOS_OutputARPRequest() and all of them except two already know the endpoint that should be used so they can simply give it to FreeRTOS_OutputARPRequest().
There are 2 exceptions to the above:
xARPWaitResolution() calls FreeRTOS_OutputARPRequest() without knowing the proper endpoint, BUT..... xARPWaitResolution() is never used so it's optimized out, AND even if it were used, I believe it needs to use FreeRTOS_FindEndPointOnNetMask() and absorb the burden of the search before calling FreeRTOS_OutputARPRequest() with the proper endpoint already figured out.
The second exception to (3) is prvTCPPrepareConnect_IPV4() As is the case with (5) I believe prvTCPPrepareConnect_IPV4() should find the end-point before calling FreeRTOS_OutputARPRequest()
I am willing to create a PR for the more elaborate solution listed in (3)
I can even have both solutions as draft PRs and we can compare them before choosing what to do

Let me know what you think and thank you for your time.

paulbartell commented 9 months ago

@evpopov Thanks for reporting this bug. I will discuss with the team and we will get back to you some time next week.

HTRamsey commented 9 months ago

@evpopov I've noticed this and a few other issues with ARP & ND as well that I will address right after I finish #1092 and another branch I have adjusting icmpv6 handling. FreeRTOS_OutputARPRequest probably has no use internally and should just be public. An internal function like vNDSendNeighbourSolicitation is needed for ARP.

evpopov commented 9 months ago

@HTRamsey I'm sorry but I'm having a hard time understanding what you mean. Are you saying that your future work is going to make this issue report obsolete? If so, I can close it. I'm not sure what you mean by saying that FreeRTOS_OutputARPRequest should be made public. It already is public. If FreeRTOS_OutputARPRequest has no use internally, are you saying that you will be replacing it with a new function similar to vNDSendNeighbourSolicitation or maybe vNDSendNeighbourSolicitation itself will replace FreeRTOS_OutputARPRequest? I'm sorry, I'm confised.

HTRamsey commented 9 months ago

@evpopov I mean FreeRTOS_OutputARPRequest shouldn't be really need to be used internally anywhere, but left just to be used as a public api function only (It requires no knowledge of buffers or endpoints so it's simple to use and cleans up after itself). There should be an internal "vARPSendRequest" or something that sends to the supplied endpoint rather than iterating all endpoints and passes control of the buffer instead of creating a bunch of new buffers. And functions like FreeRTOS_OutputAdvertiseIPv6 and everything that calls pfOutput needs to check xIsCallingFromIPTask.

Adding a NetworkEndPoint_t * pxEndPoint parameter would work, but I don't think the user should be screwing around with the network endpoint structs after FreeRTOS_IPInit_Multi is called. Actually I think FreeRTOS_AddEndPoint really should make copies instead of pointing to user given data. Just have something like ipconfigNUM_ENDPOINTS/ipconfigNUM_INTERFACES to allow internal static allocation. Then have safe API functions to allow the user to make any changes from there.

Also there's a lot of 'pick the best random endpoint' which I don't really think is a safe thing to do.

evpopov commented 9 months ago

Oh, I see! To be honest with you, I didn't even realize FreeRTOS_OutputARPRequest is part of the API and the user can use it. I guess I didn't pay attention to the "FreeRTOS_" prefix at all. Why would we allow the user to output ARP messages directly?!?!? I don't think it's a good idea at all.

That changes things.... First of all, I completely agree that functions called by the user should not have an endpoint as a parameter. Second, adding a parameter to FreeRTOS_OutputARPRequest means I would be breaking someone's code which is not ideal. That leaves option 2 from above: Calling FreeRTOS_FindEndPointOnNetMask from within FreeRTOS_OutputARPRequest.

htibosch commented 9 months ago

Why would we allow the user to output ARP messages directly?!?!? I don't think it's a good idea at all.

It is safe to use the function from the IP-task as well as from a user context because it checks for xIsCallingFromIPTask().

The reason that I added it is that people asked for it. And under some circumstances I use it my self.

Calling FreeRTOS_FindEndPointOnNetMask() would be good to avoid double code.

htibosch commented 9 months ago

Holden wrote:

Also there's a lot of 'pick the best random endpoint' which I don't really think is a safe thing to do.

The first examples of configurations that I used 10 years ago, were 100% deterministic:

Endpoint-1: class C
    IP    : 192.168.2.5
    Mask  : 255.255.255.0
    GW    : 192.168.2.1

Endpoint-2: Class B
    IP    : 172.16.0.15
    Mask  : 255.255.0.0
    GW    : 0.0.0.0

Everything that fits with the network mask can be used. The function FreeRTOS_InterfaceEndPointOnNetMask() can determine the endpoint with certainty, it was used everywhere in the code.

In the beginning, we even added checks to avoid overlapping network addresses.

And then there were requests on the FreeRTOS forum to give total freedom, even creating two endpoints with the same IP address. Or having the same IP-address behind different endpoints. The programmers that I spoke were not able to change these plans. They had to write software in an existing network plan.

So today, we have a bit of a heritage from these choices for freedom. Now in stead of making quick changes to the endpoint handling, I would rather make a study of what things could be better, before implementing anything. At the local scope, a change may seem very reasonable, but in an existing application, it may cause unexpected trouble.

shubnil commented 9 months ago

Hi @evpopov , Thanks for bringing this out. We had a discussion internally on this. Sending ARP request to all the endpoints under on the same physical interface is fine, however if there are more than one physical interfaces it's a problem if ARP request from one interface is sent to the other. And we agree that the existing function "FreeRTOS_OutputARPRequest" is not Interface aware. To handle this we have the below proposal.

Create a new API named "FreeRTOS_OutputARPRequest_multi" which also takes in "pxNetworkInterface" as an argument
Retain the existing "FreeRTOS_OutputARPRequest" API for backward compatibility but add an Interface aware code inside.
New users can use the new API and the existing users can continue to use the existing API or move to the new API.

Code Snippet: New API: void FreeRTOS_OutputARPRequest_multi( uint32_t ulIPAddress, NetworkInterface_t * pxNetworkInterface);

Modifications in existing API: void FreeRTOS_OutputARPRequest( uint32_t ulIPAddress )

Find interface from ip-address using the function FreeRTOS_InterfaceEndPointOnNetMask
Call the new API FreeRTOS_OutputARPRequest_multi

Please share your thoughts on the same.

Thanks, Shub

evpopov commented 9 months ago

First of all, Thank you all for the input!

@shubnil, Patching FreeRTOS_OutputARPRequest is super easy. I already have a branch with this and it performed well in my tests. I will make a PR with that change. If discussions lead us in a different directions, I can always update the PR or delete it altogether.

Please help me understand your reasoning behind not adding a parameter. And I'm not arguing here, I'm simply trying to understand and learn. Why add an entire new function when adding a parameter will allow all use cases of FreeRTOS_OutputARPRequest to continue functioning the same. YES, as soon as we add a parameter, somebody's code will quit compiling, but after they pass NULL for the new parameter, their code will compile AND function as before. Is this what you are trying to avoid? I personally would call such change "backwards compatible" but I may be wrong with that label. Are there use cases where FreeRTOS+TCP is dynamically linked, so user code can call functions without being aware that the parameters have changed?

I am more and more thinking that adding a new function is not really needed. It will need to be unit-tested, and whether we actually need it should come AFTER some kind of an evaluation of "how arp/icmpv6" resolution should be done. The more I think about it, the more I feel like we are going to add FreeRTOS_OutputARPRequest_multi now and then after some thinking and organizing ( what @htibosch and @HTRamsey are talking about ) we will realize we shouldn't have added this function or we should have implemented it differently.

@htibosch I absolutely, wholeheartedly agree that we should not rush to solutions. This is also why I'm a bit hesitant to add a new function like FreeRTOS_OutputARPRequest_multi. That would be a quick and not very well though out approach. However it appears that we all agree that FreeRTOS_OutputARPRequest is doing something wrong. I think we should do two things:

Fix the bug, because it is a bug and there is a pretty safe and quick solution.
Do some kind of evaluation or study of how things SHOULD be done. In other words, define the general rules and guidelines that specify a unified approach to endpoints, interfaces, routing, filtering, and lookups. That in my mind is very very important, but it will take a while and the bug-fix in (1) does not interfere with such and evaluation.

@HTRamsey If I were to create a PR that simply patches FreeRTOS_OutputARPRequest, would that interfere with anything you are doing or planning on doing? It sounds to me that in the future, that patch may not be needed, but that's OK in my opinion. Let me know.

I will keep this issue open while the discussion is active.

HTRamsey commented 9 months ago

@evpopov

would that interfere with anything you are doing or planning on doing?

Nope, go ahead.

Also in regards to the FreeRTOS_OutputARPRequest_multi, we could also just change the existing void FreeRTOS_OutputARPRequest( uint32_t ulIPAddress ) to void FreeRTOS_OutputARPRequest_multi( uint32_t ulIPAddress, NetworkInterface_t * pxNetworkInterface) and then do #define FreeRTOS_OutputARPRequest( ulIPAddress ) FreeRTOS_OutputARPRequest_multi( ulIPAddress, NULL ) which is what the kernel usually does anyways. Then it works either way and you only have one actual function. I personally would have done the same thing with all the functions in FreeRTOS_Routing.c that are defined under ipconfigCOMPATIBLE_WITH_SINGLE.

FreeRTOS / FreeRTOS-Plus-TCP

[BUG] ARP sends packets over the wrong network interfaces. #1088