Closed sraghunathan closed 5 years ago
The database switchover takes ~30 seconds, and the database is not available during this time.
If you configure FreeRADIUS to use a database, then it tries to use the database. Taking the database down means that FreeRADIUS will have problems.
The PostgreSQL client APIs try to connect to the database, and block if the database is unavailable. This is due to underlying API / network issues.
i.e. if the database is unreachable, then the underlying TCP connections will quickly fail, and FreeRADIUS will not block. If the TCP connections are up but are unresponsive, then the PostgreSQL client library will block waiting for more data, and FreeRADIUS will block waiting for the PostgreSQL client library to return.
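The difference between an unreachable peer and a connected but silent one can be illustrated with plain sockets. This is a minimal sketch in Python, not FreeRADIUS or PostgreSQL client code. The function name `probe` and its return labels are made up for illustration. It assumes a peer that accepts TCP connections but never sends data, which is how an unresponsive database looks to a client.

```python
import socket


def probe(host: str, port: int, timeout: float) -> str:
    """Classify a TCP peer as 'refused', 'unreachable', 'unresponsive',
    'closed', or 'data'. Hypothetical helper, for illustration only."""
    try:
        # Connecting fails fast when nothing listens on the port.
        with socket.create_connection((host, port), timeout=timeout) as conn:
            # Without a timeout, recv() would block for as long as the
            # peer stays silent. That is the hang described above.
            conn.settimeout(timeout)
            try:
                data = conn.recv(1)
            except socket.timeout:
                return "unresponsive"
            return "data" if data else "closed"
    except ConnectionRefusedError:
        return "refused"
    except OSError:
        # Covers connect timeouts, unroutable hosts, DNS failures, etc.
        return "unreachable"
```

A listening socket that never calls `accept()` or `send()` is enough to reproduce the "unresponsive" case locally: the kernel completes the TCP handshake, so the connection is up, but no data ever arrives. Only the explicit timeout lets the caller regain control.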
If you want FreeRADIUS to use a database, make sure that the database is responsive.
FreeRADIUS includes its own failover mechanism, precisely because of problems like this. Most failover systems are broken and terrible, whereas the FreeRADIUS failover system is under our own control.
The solution here is to rely on the FreeRADIUS failover mechanism. Don't play magic network games.
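The idea behind application-level failover is to try each configured backend in order and move on when one fails, instead of hiding the change behind DNS. FreeRADIUS expresses this in its own configuration (for example with `redundant` sections). The Python below is only a generic sketch of the pattern, not the FreeRADIUS implementation. The names `run_with_failover` and `AllBackendsFailed` are hypothetical, and each backend is assumed to be a callable that raises an exception on failure.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


class AllBackendsFailed(Exception):
    """Raised when every backend in the list has failed."""


def run_with_failover(backends: Sequence[Callable[[], T]]) -> T:
    """Call each backend in order and return the first successful result."""
    errors: List[Exception] = []
    for backend in backends:
        try:
            return backend()
        except Exception as exc:
            # Remember the failure and fall through to the next backend.
            errors.append(exc)
    raise AllBackendsFailed(errors)
```

Each backend should enforce its own timeout, so that a hung primary counts as a failure rather than blocking the whole chain.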
DB query hangs during database switchover in AWS
Defect
The issue we are facing is that, during the database switchover, the FreeRADIUS insert query hangs. Even after a successful database switchover, FreeRADIUS remains in a hung state and does not process any other messages; the query does not time out.
We tried setting query_timeout = 5 in /etc/raddb/mods-enabled/sql, but the change is not reflected when running radiusd -X.
To resolve this issue, we have to manually restart the service.
Observation
If there is no RADIUS traffic during the database switchover, no issue is observed. After a successful switchover, if we start the traffic, it works fine with the new database. Note that the database hostname is the same for both databases; the switchover is triggered by a DNS change for that hostname.
How to reproduce the issue
Create two databases (kept in sync) on two different instances. Create a DNS A record pointing to the first database, which acts as the primary. Perform the switchover by changing the DNS A record. Observe how FreeRADIUS behaves during the database change.
Output of
[radiusd|freeradius] -X
showing issue occurring