PVA's port field in search request makes it not NAT/firewall friendly

EmilioPeJu commented 1 month ago

I would like to report a use-case that has problems with PVA, and though the problem is more a protocol-related problem, I couldn't find an issues page for just the protocol (not a specific implementation), so I assumed this was the best place to do it.

The use-case is running a container with a PVA server and exposing ports 5075-5076 to the host. This expose mechanism usually involves some NAT. If we send a search request from the host, it starts as:

127.0.0.1:49155 -> 127.0.0.1:5076 with payload specifying Port: 49155

The network plug-in converts that into something like:

172.20.255.250:33851 -> 172.20.255.250:5076 with payload specifying Port: 49155

The PVA server then tries to respond to port 49155 instead of the NAT-ed one (33851). Because the network plug-in doesn't know anything about that port, the reply fails and the server gets an ICMP destination unreachable message. Please keep in mind this is not container-specific; it will be a problem for any NAT or firewall doing something similar.
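The mismatch can be reproduced without any network at all. The following is a minimal sketch, not code from any PVA implementation: `Datagram`, `nat_rewrite` and `reply_destination` are hypothetical names, and the model assumes the NAT rewrites only the IP/UDP headers and never the payload.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Datagram:
    src: Tuple[str, int]   # (IP, port) from the IP/UDP headers
    dst: Tuple[str, int]
    response_port: int     # port carried inside the PVA search payload


def nat_rewrite(pkt, new_src, new_dst):
    """Rewrite header addresses only; a NAT that cannot parse PVA leaves the payload alone."""
    return replace(pkt, src=new_src, dst=new_dst)


def reply_destination(pkt):
    """Where a server that honours the payload port sends its search reply."""
    return (pkt.src[0], pkt.response_port)


sent = Datagram(src=("127.0.0.1", 49155), dst=("127.0.0.1", 5076), response_port=49155)
seen = nat_rewrite(sent, ("172.20.255.250", 33851), ("172.20.255.250", 5076))

print(reply_destination(seen))  # reply is aimed at port 49155
print(seen.src)                 # the NAT mapping exists only for port 33851
```

The two printed ports differ, which is exactly the condition under which the reply is dropped.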

FYI: @coretl and @gilesknap

EmilioPeJu commented 1 month ago

A possible solution is to change the PVA protocol so that the port field has a special value (say 0) which the server would interpret as "answer to the port in the UDP header". This is already done for the address, i.e. address 0.0.0.0 tells the server to answer to the IP in the header.
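As a rough sketch of the proposed rule (an illustration only; `resolve_reply_endpoint` is a hypothetical helper, and treating port 0 as special is the proposal, not current protocol behaviour):

```python
import ipaddress


def resolve_reply_endpoint(header_src, response_address, response_port):
    """Pick the (ip, port) for a search reply.

    header_src       -- (ip, port) taken from the IP/UDP headers of the request
    response_address -- address from the payload; all-zero means "use the header IP"
    response_port    -- port from the payload; 0 is the PROPOSED "use the header port"
    """
    header_ip, header_port = header_src
    addr = ipaddress.ip_address(response_address)
    ip = header_ip if addr.is_unspecified else str(addr)
    port = header_port if response_port == 0 else response_port
    return (ip, port)


# With the proposed value, the reply follows the NAT-ed source port.
print(resolve_reply_endpoint(("172.20.255.250", 33851), "0.0.0.0", 0))
# With an explicit port, the payload still wins, as it does today.
print(resolve_reply_endpoint(("172.20.255.250", 33851), "0.0.0.0", 49155))
```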

mdavidsaver commented 1 month ago

The handoff from UDP to TCP is a feature of both CA and PVA protocols. I don't think there is any getting around the fact that UDP and TCP port numbers are in effect different namespaces.

One place where this is possible is in the case of search over TCP. When there are only TCP port numbers involved a special (0) value would make sense. I think the main complication would be in coordinating a minor protocol version increment. Existing clients do not recognize zero as special, and would blindly try to connect to port zero.

@kasemir fyi.

EmilioPeJu commented 1 month ago

Hi @mdavidsaver, and sorry I didn't express myself properly. I wasn't talking about the handoff from UDP to TCP; I was referring only to the search request and reply, which both happen entirely over UDP. There is a big difference between CA and PVA here: PVA specifies the port on which the search requester (or client) wants to receive the search reply. To be more specific, the PVA spec document defines a search request as:

struct searchRequest {
  int searchSequenceID;
  byte flags;
  byte[3] reserved;
  byte[16] responseAddress;
  short responsePort;
  string[] protocols;
  struct {
    int searchInstanceID;
    string channelName;
  } channels[];
};

The field this issue is about is responsePort; I cannot think of a case in which it is useful. The same applies to responseAddress, but in my traffic capture I can see it is ignored and just set to 0.0.0.0 (actually the IPv6 representation of it). Those two fields make even less sense if the search request is sent over TCP.
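To make the layout concrete, here is a small sketch that packs and unpacks only the fixed-size prefix of the struct above (up to and including responsePort). It is not a full PVA encoder: the message header, the protocols array and the channel list are left out, big-endian byte order is an assumption, and the port is treated as an unsigned 16-bit value so that ports above 32767 fit.

```python
import ipaddress
import struct

# int searchSequenceID, byte flags, byte[3] reserved,
# byte[16] responseAddress, short responsePort (read as unsigned here).
# '>' (big-endian, no padding) is an assumption made for this sketch.
PREFIX = struct.Struct(">iB3s16sH")


def pack_prefix(sequence_id, flags, response_address, response_port):
    addr = ipaddress.IPv6Address(response_address).packed  # always 16 bytes
    return PREFIX.pack(sequence_id, flags, b"\x00" * 3, addr, response_port)


def unpack_prefix(data):
    seq, flags, _reserved, addr, port = PREFIX.unpack_from(data)
    return seq, flags, str(ipaddress.IPv6Address(addr)), port


blob = pack_prefix(1, 0, "::", 49155)
print(len(blob))            # 4 + 1 + 3 + 16 + 2 bytes
print(unpack_prefix(blob))
```

The responsePort value sits in the last two bytes of this prefix, inside the UDP payload, which is why a NAT that rewrites only headers never touches it.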

mdavidsaver commented 1 month ago

> The field this issue is about is responsePort

Ah. So this is a duplicate of #159?

I agree that allowing these indirect replies is a bad design. PVXS and core.pva servers do what may be done compatibly to be friendly to stateful firewalls matching request with reply. This does not cover NAT though. Fully eliminating this misfeature would require a protocol version increment. Possible, but tedious enough that it hasn't happened so far.

EmilioPeJu commented 1 month ago

No, this is not a duplicate of #159. That issue is about the source port of the search response packet being a random one, whereas this issue is about the search request packet specifying a response port, which becomes the destination port of the response packet.

P.S. An extra detail I just noticed: we didn't have the problem in #159 because we were talking to a PVA gateway (which doesn't use this PVA implementation). The problem described in the current issue (#197) is common to both implementations, though.

mdavidsaver commented 1 month ago

> not a duplicate of #159

Ok. So a different consequence of the same protocol design decision.

> Fully eliminating this misfeature would require a protocol version increment. Possible, but tedious enough that it hasn't happened so far.

On further reflection, I'm not sure that a minor (compatible) protocol increment would work. Testing the minor version works when it can be negotiated between client and server, i.e. over TCP connections after the initial handshake, or with a UDP reply.

With a UDP request, the sender has no idea of the protocol minor versions (likely plural) supported by the recipients.

So I think it would be an incompatible change to start sending SEARCH requests with responsePort==0, regardless of protocol minor version.
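The compatibility problem can be illustrated with a toy model of two server generations. Both functions are hypothetical stand-ins: `legacy_reply_port` models a server that always trusts the payload, as existing servers are described in this thread, and `updated_reply_port` models a server that has learned the proposed meaning of 0.

```python
def legacy_reply_port(header_port, response_port):
    """Existing behaviour as described in this thread: trust the payload port."""
    return response_port


def updated_reply_port(header_port, response_port):
    """Hypothetical updated server: 0 means "use the UDP header port"."""
    return header_port if response_port == 0 else response_port


HEADER_PORT = 33851  # source port seen in the UDP header of the request

for label, server in (("legacy", legacy_reply_port), ("updated", updated_reply_port)):
    for response_port in (49155, 0):
        target = server(HEADER_PORT, response_port)
        print(f"{label:8} server, responsePort={response_port:<5} -> replies to port {target}")
```

A legacy server given responsePort==0 aims its reply at port 0, so the reply is lost. Since a UDP sender cannot know which generation each recipient belongs to, it has no safe moment to start sending 0.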

Right now, the only way I can think of directly "fixing" this issue would be to introduce a new, second, search request message format. Maintaining compatibility would then require that clients concurrently send both messages. This would double the bandwidth used, which in 2024 is I think probably not an issue. Although I expect some would disagree with me on this.

A second option, which I like far less, would be to introduce handling of responsePort==0 on RX now, with the idea of "eventually" starting to send it.

As a note: responsePort is also necessary to implement (what I call) the local multicast "hack", where the recipient of a unicast UDP search re-sends it via multicast to 127.0.0.1 to reach all PVA peers. This is how PVA avoids the problems CA has with unicast search to hosts with multiple IOC processes. I think this could be accommodated by appending the origin port to the ORIGIN_TAG message prefixed to forwarded messages.
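A sketch of why forwarding interacts badly with responsePort==0, and how carrying the origin port would help. Everything here is illustrative: `Search`, `forward_locally` and `reply_port` are hypothetical names, `origin_port` stands for the suggested extra field and does not exist in the protocol today, and only ports are modelled (how the original IP address is preserved is outside this sketch).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Search:
    header_src: Tuple[str, int]        # (ip, port) seen by whoever receives this datagram
    response_port: int                 # payload field; 0 = proposed "use header port"
    origin_port: Optional[int] = None  # HYPOTHETICAL field carried with ORIGIN_TAG


def forward_locally(pkt, forwarder_port):
    """Re-send a unicast search to local peers; the header source becomes the forwarder."""
    return Search(("127.0.0.1", forwarder_port), pkt.response_port,
                  origin_port=pkt.header_src[1])


def reply_port(pkt):
    if pkt.response_port != 0:
        return pkt.response_port   # today's behaviour: the payload decides
    if pkt.origin_port is not None:
        return pkt.origin_port     # forwarded copy: use the recorded origin port
    return pkt.header_src[1]       # direct copy: use the UDP header port


original = Search(("10.0.0.5", 33851), response_port=0)
forwarded = forward_locally(original, forwarder_port=5076)
print(reply_port(original), reply_port(forwarded))
```

Without `origin_port`, the forwarded copy would fall through to the forwarder's own port, which is not where the client is listening.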

EmilioPeJu commented 1 month ago

Thanks @mdavidsaver for the information. I think the server side can be fixed (in both implementations) without breaking any old version, given that older clients will never use 0 as responsePort. Regarding the client side, of all the options you mentioned, sending 2 search requests seems the least problematic. To be less invasive, this behavior could be enabled by setting some environment variable (for example EPICS_PVA_ALLOW_NAT=yes).
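A minimal sketch of the opt-in idea, assuming the variable name suggested above (EPICS_PVA_ALLOW_NAT is only a proposal in this thread, not an existing setting). The format labels `"legacy"` and `"nat-friendly"` are placeholders for the current message and the hypothetical second message format.

```python
import os


def search_formats_to_send(environ=os.environ):
    """Return which search message formats a client would emit.

    "legacy" is today's request carrying an explicit responsePort.
    "nat-friendly" stands for the hypothetical second format.
    """
    formats = ["legacy"]
    value = environ.get("EPICS_PVA_ALLOW_NAT", "no").strip().lower()
    if value == "yes":
        formats.append("nat-friendly")
    return formats


print(search_formats_to_send({}))
print(search_formats_to_send({"EPICS_PVA_ALLOW_NAT": "yes"}))
```

Keeping the legacy format unconditional means existing servers keep working, and the doubled traffic is paid only by sites that opt in.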

anjohnson commented 1 month ago

I would be one of those people who aren't keen on the idea of sending out duplicate searches.

This may be the same problem that we are hoping to solve by putting a PVA name-server in between the two networks. Is that something which could be done with containers too?

kasemir commented 1 month ago

> solve by putting a PVA name-server in between

Maybe that's a better approach.

With containers, we've said for a while that you need to use --network=host. The OP said this isn't container-specific and that NAT in general will cause problems. Well, yes. Network infrastructure may be aware of the HTTP protocol and can patch URLs accordingly. But firewalls etc. don't understand PVA and won't update the responseAddress and responsePort. So do we require a name server or PVA gateway to go across such network infrastructure?

mdavidsaver commented 1 month ago

> With containers, we've said for a while that you need to use --network=host ...

Maybe "need" is too strict. Although imo. doing otherwise is asking for some avoidable pain.

gilesknap commented 1 month ago

> Maybe "need" is too strict. Although imo. doing otherwise is asking for some avoidable pain.

In preparing for the EPICS collaboration meeting I have gone to some effort to stop using --network=host for IOCs so that we can run a workshop with lots of people using the same PVs on the same network. I've been meaning to get around to this for some time.

I have been entirely successful in doing this by running all the IOCs plus one ca-gateway in the same container network and having the ca-gateway bind to the CA ports on the host (using the loopback adapter for local development, but this could also be used to bind to an actual NIC).

Next, I tried to use the PVA plugin to show images out of areaDetector. I could not achieve the same thing with PVAGW and asked Emilio to help me diagnose it. That is when we found the issue that Emilio reported here.

> This may be the same problem that we are hoping to solve by putting a PVA name-server in between the two networks. Is that something which could be done with containers too?

This would work for containers as long as the port that the request comes in on is the port that you should reply to. If at any point the protocol requires passing port numbers in the application layer, any NAT will fail (including container network NATs).

mdavidsaver commented 1 month ago

> Next, I tried to use the PVA plugin to show images out of areaDetector. I could not achieve the same thing with PVAGW ...

You might be able to make this work with a pair of gateways communicating by TCP only, as I sometimes do with SSH tunneling, i.e. by setting EPICS_PVA_NAME_SERVERS=.... Then I think all of the port numbers would be under your control.

Although, looking more closely, I realize that I have PVXS ignoring responseAddr and responsePort in SEARCH received over TCP.

https://github.com/epics-base/pvxs/blob/5fa743d4c87377859953012af3c0fbcd1b063129/src/serverchan.cpp#L183

gilesknap commented 1 month ago

> Although, looking more closely, I realize that I have PVXS ignoring responseAddr and responsePort in SEARCH received over TCP.

My interpretation of this is that it would work with PVXS because TCP just replies back to where the SEARCH came from. The PVGW I'm using is from P4P, so would this work?

It's not ideal because, instead of having everything running nicely inside of a container with no host installs required, we now need a gateway outside too (if I'm interpreting you correctly). --edit-- Maybe not. Both gateways could be in containers, one running in the host network and the other in the shared container network.

epics-base / pvAccessCPP

PVA's port field in search request makes it not NAT/firewall friendly #197