Open NikolayShakin opened 1 month ago
@NikolayShakin it's quite important that the system on which rtpengine is running has an up-to-date DNS record for the FQDN in question as soon as possible. That can be cumbersome to update in time when a switchover takes just a few seconds, and it gets harder still when NAPTR/SRV records are involved. Forcing rtpengine to re-resolve the FQDN is useless if the record is still the same during the switchover/failover of the redis master. Hence the system must provide an up-to-date record just in time.
However, I think it should be feasible to re-resolve the FQDN each time rtpengine reconnects to redis. I will take a look in the coming weeks.
@zenichev I'm a bit confused about your statement - can you please expand on the difference between "force rtpengine to re-resolve the FQDN" and "the system provides an actual record just in-time"?
The way I understand the situation, the libnss hosts driver (or another mechanism that uses the local system configuration in /etc/nsswitch.conf or /etc/resolv.conf directly) is queried for the IP address of the host defined in the configuration. The problem the OP describes (which is very similar to the issue I have) is that first the DNS records - and hence the local system configuration (not the RTPEngine configuration file) - have been updated with a new IP address for the existing host name, and only then is the older server, whose IP address RTPEngine has used to connect, dropped.
The request is that when the redis connection drops, only then is RTPEngine expected to refresh its cache of the DNS results - under the assumption that the rest of the system is working correctly.
@guss77
> can you please expand on the difference between "force rtpengine to re-resolve the FQDN" and "the system provides an actual record just in-time"?
Maybe it wasn't very clear, but the important thing is to have a correctly behaving system, where rtpengine is running, with regard to host name resolution.
> The problem the OP describes (which is very similar to the issue I have) is that first the DNS records - and hence the local system configuration (not the RTPEngine configuration file) has been updated with a new IP address for the existing host name
This wasn't clear from the original request. Updated where? On the DNS server, or on the system where rtpengine is running? As I said, we should assume that the system has a correct, up-to-date resolution for the requested hostname in time. Then we can act in rtpengine and, upon losing the connection to the redis server, try to re-resolve.
It smells like that should be a configurable option. I will try to get my hands on it when I find free time in the coming weeks. Not promising anything.
> Maybe wasn't too much clear, but important is to have a correctly behaving system, where rtpengine is running, in regards of host names resolution.
Agreed.
> This wasn't clear from the original request. Updated where ?
The OP said "[…] and DNS record was updated", which I read as mapping well onto my situation: the DNS configuration on the local system can resolve the same FQDN to the new IP address, immediately when it makes sense to see the change.
> It smells like that should be configurable option.
I disagree - it should be the default and only behavior: do not cache the DNS result, and when a new connection needs to be opened, run the resolver again. I don't think there is even a performance consideration here: it is up to the system administrator to make sure that the DNS lookup is prompt (and there are many ways to do so), and if it isn't, it's not RTPEngine's problem.
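To make the "no cache, resolve on every new connection" behavior concrete, here is a minimal Python sketch. It is not rtpengine code (rtpengine is written in C), and the function names `resolve_now` and `open_connection` are made up for illustration. The resolver and connect functions are passed in as parameters only so the behavior can be exercised without a network.

```python
import socket


def resolve_now(host, port, resolver=socket.getaddrinfo):
    """Ask the system resolver for the current addresses; nothing is cached here."""
    infos = resolver(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); keep the sockaddr.
    return [info[4] for info in infos]


def open_connection(host, port, resolver=socket.getaddrinfo,
                    connect=socket.create_connection, timeout=2.0):
    """Resolve the host again, then try each returned address in order."""
    last_error = None
    for sockaddr in resolve_now(host, port, resolver):
        try:
            # sockaddr[:2] is (ip, port) for both IPv4 and IPv6 results.
            return connect(sockaddr[:2], timeout)
        except OSError as exc:
            last_error = exc
    raise last_error or OSError(f"no addresses returned for {host}")
```

Because the lookup happens inside `open_connection`, every reconnect sees whatever the local resolver returns at that moment. How fresh that answer is remains the system's responsibility, as discussed above.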
The thing is, even when the system that runs rtpengine can resolve the IP address correctly, rtpengine doesn't try to do so when it fails to connect to Redis (at the old, incorrect address); it keeps trying to connect to the IP address it resolved when it started.
Is your feature request related to a problem? Please describe
When using rtpengine with a redis cluster, after the redis master node has changed, rtpengine loses its redis connection even when an FQDN is used as the redis server address and the DNS record was updated.
Describe the solution you'd like
When we try to re-establish the connection to redis, we can re-resolve the IP address after every N failed connection attempts. It will not be too expensive, as we lost the connection anyway. Assuming that the DNS record was updated when the redis master node changed, it will allow rtpengine to automatically switch to the new redis master node.
Describe alternatives you've considered
Restart rtpengine after a redis master node change.
The rtpengine version you checked that didn't have the feature you are asking for
Version: 12.5.1.5-1~bpo11+1
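For illustration, the "re-resolve after every N failed tries" idea from the request could look roughly like the Python sketch below. It is not rtpengine code: the class name `RedisReconnector`, the `refresh_every` parameter, and the injectable `resolver`/`connect` arguments are all hypothetical and exist only to show the control flow.

```python
import socket


class RedisReconnector:
    """Hypothetical helper, not rtpengine code: caches one address and
    refreshes it from DNS after every `refresh_every` failed attempts."""

    def __init__(self, host, port, refresh_every=3,
                 resolver=socket.getaddrinfo,
                 connect=socket.create_connection):
        self.host = host
        self.port = port
        self.refresh_every = refresh_every
        self.resolver = resolver
        self.connect = connect
        self.failures = 0
        # Resolved once at startup, as in the behavior reported in this issue.
        self.address = self._resolve()

    def _resolve(self):
        infos = self.resolver(self.host, self.port, type=socket.SOCK_STREAM)
        # First result only; sockaddr is (ip, port) or (ip, port, flow, scope).
        return infos[0][4][:2]

    def try_connect(self, timeout=2.0):
        """Make one attempt; return a socket on success, None on failure."""
        try:
            sock = self.connect(self.address, timeout)
        except OSError:
            self.failures += 1
            if self.failures % self.refresh_every == 0:
                try:
                    self.address = self._resolve()
                except OSError:
                    pass  # resolver failed too: keep the cached address
            return None
        self.failures = 0
        return sock
```

A caller would invoke `try_connect()` from its existing retry loop. With `refresh_every=1` this degenerates into re-resolving after every failure, which is close to the "always re-resolve" position argued in the comments.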