Closed meadmaker closed 6 years ago
Here are the system logs between when the response contained useful information and when it didn't:
==> /var/log/trust_router/default.log.201806010130 <==
Configuration file change detected, waiting for changes to settle.
==> /var/log/messages <==
Jun 1 01:32:38 ip-172-31-47-134 trust_router[4673]: Configuration file change detected, waiting for changes to settle.
==> /var/log/trust_router/default.log.201806010130 <==
Configuration file change settled, attempting to update configuration.
tr_parse_config_files: Attempting to load revision 1527816757 of './trusts.cfg'.
tr_parse_config_files: Attempting to load './internal.cfg'.
tr_print_config: Logging running trust router configuration.
tr_print_config: Community apc.sports.psec.us:
tr_print_config: - Member IdPs:
tr_print_config: - @IDP realm: "apc.sports.psec.us" shared: no local: yes AAA servers: apc.qa.painless-security.com:12309 APCs: apc.sports.psec.us
tr_print_config: - @IDP realm: "redsox.com" shared: no local: yes AAA servers: 3ac1b3-red-sox.tid.qa.painless-security.com:12309 APCs: apc.sports.psec.us
tr_print_config: - @IDP realm: "royals.com" shared: no local: yes AAA servers: 8a0d78-royals.tid.qa.painless-security.com:12309 APCs: apc.sports.psec.us
tr_print_config: - @IDP realm: "whitesox.mlb.us" shared: no local: yes AAA servers: 51623a-white-sox.tid.qa.painless-security.com:12309 APCs: apc.sports.psec.us
tr_print_config: - Member RPs:
tr_print_config: - @RP realm: "mark.littleleague.org"
tr_print_config: - @RP realm: "nymets.com"
tr_print_config: - @RP realm: "redsox.com"
tr_print_config: - @RP realm: "royals.com"
tr_print_config: - @RP realm: "whitesox.mlb.us"
tr_print_config: - @RP realm: "apc.sports.psec.us"
tr_print_config: - @RP realm: "apc.sports.psec.us"
apc.sports.psec.us | apc.sports.psec.us | | 0 | tr.qa.painless-security.com:12308 | :-1 | 1 | 1 | 1970-01-01 00:00:00 UTC | 0
apc.sports.psec.us | redsox.com | | 0 | tr.qa.painless-security.com:12308 | :-1 | 1 | 1 | 1970-01-01 00:00:00 UTC | 0
apc.sports.psec.us | royals.com | | 0 | tr.qa.painless-security.com:12308 | :-1 | 1 | 1 | 1970-01-01 00:00:00 UTC | 0
apc.sports.psec.us | whitesox.mlb.us | | 0 | tr.qa.painless-security.com:12308 | :-1 | 1 | 1 | 1970-01-01 00:00:00 UTC | 0
>> Membership table start <<
* idp apc.sports.psec.us/apc.sports.psec.us
null origin (0x1db59e0) - prov:
* idp redsox.com/apc.sports.psec.us
null origin (0x1db5aa0) - prov:
* idp royals.com/apc.sports.psec.us
null origin (0x1db5be0) - prov:
* idp whitesox.mlb.us/apc.sports.psec.us
null origin (0x1db5d20) - prov:
* rp mark.littleleague.org/apc.sports.psec.us
null origin (0x1db5ee0) - prov:
* rp nymets.com/apc.sports.psec.us
null origin (0x1db60a0) - prov:
* rp redsox.com/apc.sports.psec.us
null origin (0x1db6260) - prov:
* rp royals.com/apc.sports.psec.us
null origin (0x1db6420) - prov:
* rp whitesox.mlb.us/apc.sports.psec.us
null origin (0x1db6620) - prov:
* rp apc.sports.psec.us/apc.sports.psec.us
null origin (0x1db6820) - prov:
Configuration updated successfully.
==> /var/log/messages <==
Jun 1 01:32:44 ip-172-31-47-134 trust_router[4673]: Configuration file change settled, attempting to update configuration.
Jun 1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_parse_config_files: Attempting to load revision 1527816757 of './trusts.cfg'.
Jun 1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_parse_config_files: Attempting to load './internal.cfg'.
Jun 1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_print_config: Logging running trust router configuration.
Jun 1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_print_config: Community apc.sports.psec.us:
Jun 1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_print_config: - Member IdPs:
Jun 1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_print_config: - @IDP realm: "apc.sports.psec.us" shared: no local: yes AAA servers: apc.qa.painless-security.com:12309 APCs: apc.sports.psec.us
Jun 1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_print_config: - @IDP realm: "redsox.com" shared: no local: yes AAA servers: 3ac1b3-red-sox.tid.qa.painless-security.com:12309 APCs: apc.sports.psec.us
Jun 1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_print_config: - @IDP realm: "royals.com" shared: no local: yes AAA servers: 8a0d78-royals.tid.qa.painless-security.com:12309 APCs: apc.sports.psec.us
Jun 1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_print_config: - @IDP realm: "whitesox.mlb.us" shared: no local: yes AAA servers: 51623a-white-sox.tid.qa.painless-security.com:12309 APCs: apc.sports.psec.us
Jun 1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_print_config: - Member RPs:
Jun 1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_print_config: - @RP realm: "mark.littleleague.org"
Jun 1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_print_config: - @RP realm: "nymets.com"
Jun 1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_print_config: - @RP realm: "redsox.com"
Jun 1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_print_config: - @RP realm: "royals.com"
Jun 1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_print_config: - @RP realm: "whitesox.mlb.us"
Jun 1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_print_config: - @RP realm: "apc.sports.psec.us"
Jun 1 01:32:44 ip-172-31-47-134 trust_router[4673]: apc.sports.psec.us | apc.sports.psec.us | | 0 | tr.qa.painless-security.com:12308 | :-1 | 1 | 1 | 1970-01-01 00:00:00 UTC | 0#012apc.sports.psec.us | redsox.com | | 0 | tr.qa.painless-security.com:12308 | :-1 | 1 | 1 | 1970-01-01 00:00:00 UTC | 0#012apc.sports.psec.us | royals.com | | 0 | tr.qa.painless-security.com:12308 | :-1 | 1 | 1 | 1970-01-01 00:00:00 UTC | 0#012apc.sports.psec.us | whitesox.mlb.us | | 0 | tr.qa.painless-security.com:12308 | :-1 | 1 | 1 | 1970-01-01 00:00:00 UTC | 0
Jun 1 01:32:44 ip-172-31-47-134 trust_router[4673]: >> Membership table start <<#012* idp apc.sports.psec.us/apc.sports.psec.us#012 null origin (0x1db59e0) - prov: #012#012* idp redsox.com/apc.sports.psec.us#012 null origin (0x1db5aa0) - prov: #012#012* idp royals.com/apc.sports.psec.us#012 null origin (0x1db5be0) - prov: #012#012* idp whitesox.mlb.us/apc.sports.psec.us#012 null origin (0x1db5d20) - prov: #012#012* rp mark.littleleague.org/apc.sports.psec.us#012 null origin (0x1db5ee0) - prov: #012#012* rp nymets.com/apc.sports.psec.us#012 null origin (0x1db60a0) - prov: #012#012* rp redsox.com/apc.sports.psec.us#012 null origin (0x1db6260) - prov: #012#012* rp royals.com/apc.sports.psec.us#012 null origin (0x1db6420) - prov: #012#012* rp whitesox.mlb.us/apc.sports.psec.us#012 null origin (0x1db6620) - prov: #012#012* rp apc.sports.psec.us/apc.sports.psec.us#012 null origin (0x1db6820) - prov:
Jun 1 01:32:44 ip-172-31-47-134 trust_router[4673]: Configuration updated successfully.
It looks like this happened after its configuration file was updated. (I'm guessing that the update was effectively nothing but the serial number changing, since I wasn't using the portal at the time.)
I'm having trouble reproducing this by changing the serial number - I get a proper response even after half a dozen or so reloads.
Ok, just got it to happen.
This is happening because the callbacks that handle various monitoring commands pass a pointer to the trps
instance that they need to get data from. When a configuration file is reloaded, a new trps
is instantiated, but the monitoring handler still gets the old pointer, resulting in unpredictable behavior.
Edit: hmm, that's not quite right, but it's something like that. Getting closer...
Aha- it's easier than that. After loading a new peer table, none of the peers have had connection attempts until the next connection update time. There was a check in the monitoring message encoder that suppressed setting the "last_connection_time" field if it had not yet been set. The encoder was set to fail if that field was missing, so it failed until an attempt was made.
So to begin with, this was a transient issue (as soon as a connection attempt was made, it should work). In fact, it would have happened upon the first load except that the first connection attempts are made right after startup, so this is not likely to have been seen. The fix is simply to omit that field if it's irrelevant.
We could also make it say "none" or something, but I think it's easier for someone consuming the monitoring JSON message not to have to deal with a field that is sometimes a UTC timestamp and sometimes a differently formatted string. If the field is absent, there have been no attempts.
Should be fixed by 2cafd9dbafa262d1d29ce0af0590c3aa36b527fe
I'm closing this. Reopen if it comes up again.
After a short while of running, the query for trustrouter peering:
Yields empty JSON:
... despite having yielded a response of peers earlier.