Closed suffieldacademy closed 2 years ago
Brief update. We've peeled back as much of the configuration as possible, down to a single netfilter (not iptables) instance named "default", so we're as simple a setup as possible.
We are now seeing syslog entries from the primary host:
joold: Received a packet from kernelspace.
joold: Sending 280 bytes to the network...
joold: Sent 280 bytes to the network.
However, the failover machine is not processing them, and instead logs:
joold: Received 280 bytes from the network.
joold: Error receiving packet from kernelspace: Invalid input data or parameter
I'm not much of a C programmer. Searching the source for that error leads me to usr/joold/modsocket.c, but I can't figure out much more from there.
We are isolating the forwarding interfaces, joold, etc., all inside a network namespace (netns) as shown in the documentation. Is there any known odd behavior with netns and joold? Otherwise, I'm not sure why the packets aren't being processed.
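One way to rule out a namespace mismatch is to compare the network namespace of the joold process with that of the shell where the instance was configured. The following is a minimal sketch, assuming a Linux host with /proc mounted. The helper names are hypothetical and not part of Jool, and reading another process's namespace link normally requires root.

```python
import os
import re
import sys


def parse_netns_link(link: str) -> int:
    """Extract the inode number from a link target such as 'net:[4026531992]'."""
    match = re.fullmatch(r"net:\[(\d+)\]", link)
    if match is None:
        raise ValueError(f"unexpected namespace link: {link!r}")
    return int(match.group(1))


def netns_inode(pid: str = "self") -> int:
    """Return the network-namespace inode of a process (Linux only)."""
    return parse_netns_link(os.readlink(f"/proc/{pid}/ns/net"))


if __name__ == "__main__":
    # Usage (hypothetical script name): python3 netns_check.py <pid-of-joold>
    other_pid = sys.argv[1]
    mine, theirs = netns_inode(), netns_inode(other_pid)
    print(f"this shell: {mine}, pid {other_pid}: {theirs}")
    print("same netns" if mine == theirs else "DIFFERENT netns")
```

Two processes share a network namespace exactly when these inode numbers match, so running this from inside the namespace where the instance lives shows whether joold was started in the same one.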
Now that I've found that more specific error, I see it is referenced in #362. I am running 4.1.5 on Debian stable, so I will try upgrading to a more recent version and see if I can unravel this further.
OK, I re-constituted the full multi-instance setup under v4.1.8 and am having much better luck. Apologies for not starting with the most recent release, but I usually try to stick to the Debian repos.
Sorry for the noise, but sometimes typing it all out helps me work through it!
Hello,
We're busy setting up a multi-node multi-instance NAT64 cluster. Most of the pieces are in place, but we're having trouble with the session synchronization. I believe we've followed the instructions, but we are not seeing the session states from one node showing up on the other. Additionally, we're seeing session traffic from multiple instances even though we're only creating sessions in a single instance. I'm wondering if there might be an issue with per-instance sessions.
Quick background on the setup:
Both nodes have the following instances defined:
Both NAT64 instances have joold configured to run session synchronization:
The instances use the same multicast destination, but different ports:
The two nodes are directly connected to each other via Ethernet. We have not assigned any addresses; the interfaces have only their automatically assigned link-local IPv6 addresses.
When we start the instances and generate traffic, the translation is occurring. On the node that is translating the traffic, we see session entries being generated:
Additionally, on the failover host we see the multicast packets arriving on the interface with the correct multicast destination and port number.
However, we are not seeing any sessions being created in the failover host (the session table is empty). Is there any debugging or other information we can enable to try to find where the packets might be getting lost?
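To confirm that the multicast datagrams reach userspace on the failover host, and not only the interface, a small standalone listener can join the group and print what arrives. This is a rough sketch, not part of Jool. The group address, port, and interface name are placeholders to replace with the values from the joold configuration. It should be run inside the same netns, and joold may need to be stopped first if the bind fails.

```python
import socket
import struct


def build_ipv6_mreq(group: str, if_index: int) -> bytes:
    """Pack a struct ipv6_mreq: 16-byte group address plus interface index."""
    return socket.inet_pton(socket.AF_INET6, group) + struct.pack("@I", if_index)


def open_listener(group: str, port: int, if_name: str) -> socket.socket:
    """Open a UDP socket bound to the port and joined to the multicast group."""
    sock = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("::", port))
    mreq = build_ipv6_mreq(group, socket.if_nametoindex(if_name))
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_JOIN_GROUP, mreq)
    return sock


if __name__ == "__main__":
    GROUP = "ff08::db8:64:64"  # placeholder: multicast address from the joold config
    PORT = 6400                # placeholder: port of the instance under test
    IFACE = "eth0"             # placeholder: the directly connected interface
    listener = open_listener(GROUP, PORT, IFACE)
    while True:
        data, sender = listener.recvfrom(65535)
        print(f"{len(data)} bytes from {sender[0]}")
```

If this prints datagrams while the session table stays empty, the packets are reaching userspace and the loss is more likely between joold and the kernel module than on the wire.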
One other oddity that we noticed is that even though our test traffic is only going through a single instance on the primary box, BOTH instances are generating session sync traffic. This happens even if we set ss-enabled=false on one of the instances (traffic is still generated to both ports). I'm wondering if perhaps joold is receiving session updates for all instances and forwarding them, rather than only propagating changes for a particular instance.
However, even if that were the case, I'm not sure why the other instances aren't seeing any sessions arrive (I would instead expect to see too many if all instances were generating duplicate traffic).