painless-security / trust-router

Moonshot Trust Router
0 stars 0 forks source link

trustrouter peer monitoring generates empty response #97

Closed meadmaker closed 6 years ago

meadmaker commented 6 years ago

After a short while of running, the query for trustrouter peering:

trmon tr.qa.painless-security.com 12310 show peers 

Yields empty JSON:

gss_connect: Connecting to host 'tr.qa.painless-security.com' on port 12310
{
    "code": 0,
    "message": "success",
    "payload": {
        "peers": []
    }   
}   

... despite having yielded a response of peers earlier.

meadmaker commented 6 years ago

Here are the system logs between when the response contained useful information and when it didn't:

==> /var/log/trust_router/default.log.201806010130 <==
Configuration file change detected, waiting for changes to settle.

==> /var/log/messages <==
Jun  1 01:32:38 ip-172-31-47-134 trust_router[4673]: Configuration file change detected, waiting for changes to settle.

==> /var/log/trust_router/default.log.201806010130 <==
Configuration file change settled, attempting to update configuration.
tr_parse_config_files: Attempting to load revision 1527816757 of './trusts.cfg'.
tr_parse_config_files: Attempting to load './internal.cfg'.
tr_print_config: Logging running trust router configuration.
tr_print_config: Community apc.sports.psec.us:
tr_print_config:  - Member IdPs:
tr_print_config:    - @IDP realm: "apc.sports.psec.us"  shared: no  local: yes  AAA servers: apc.qa.painless-security.com:12309  APCs: apc.sports.psec.us
tr_print_config:    - @IDP realm: "redsox.com"  shared: no  local: yes  AAA servers: 3ac1b3-red-sox.tid.qa.painless-security.com:12309  APCs: apc.sports.psec.us
tr_print_config:    - @IDP realm: "royals.com"  shared: no  local: yes  AAA servers: 8a0d78-royals.tid.qa.painless-security.com:12309  APCs: apc.sports.psec.us
tr_print_config:    - @IDP realm: "whitesox.mlb.us"  shared: no  local: yes  AAA servers: 51623a-white-sox.tid.qa.painless-security.com:12309  APCs: apc.sports.psec.us
tr_print_config:  - Member RPs:
tr_print_config:    - @RP realm: "mark.littleleague.org"

tr_print_config:    - @RP realm: "nymets.com"

tr_print_config:    - @RP realm: "redsox.com"

tr_print_config:    - @RP realm: "royals.com"

tr_print_config:    - @RP realm: "whitesox.mlb.us"

tr_print_config:    - @RP realm: "apc.sports.psec.us"

tr_print_config:    - @RP realm: "apc.sports.psec.us"

apc.sports.psec.us | apc.sports.psec.us |  | 0 | tr.qa.painless-security.com:12308 | :-1 | 1 | 1 | 1970-01-01 00:00:00 UTC | 0
apc.sports.psec.us | redsox.com |  | 0 | tr.qa.painless-security.com:12308 | :-1 | 1 | 1 | 1970-01-01 00:00:00 UTC | 0
apc.sports.psec.us | royals.com |  | 0 | tr.qa.painless-security.com:12308 | :-1 | 1 | 1 | 1970-01-01 00:00:00 UTC | 0
apc.sports.psec.us | whitesox.mlb.us |  | 0 | tr.qa.painless-security.com:12308 | :-1 | 1 | 1 | 1970-01-01 00:00:00 UTC | 0

>> Membership table start <<
* idp apc.sports.psec.us/apc.sports.psec.us
  null origin (0x1db59e0) - prov:

* idp redsox.com/apc.sports.psec.us
  null origin (0x1db5aa0) - prov:

* idp royals.com/apc.sports.psec.us
  null origin (0x1db5be0) - prov:

* idp whitesox.mlb.us/apc.sports.psec.us
  null origin (0x1db5d20) - prov:

* rp mark.littleleague.org/apc.sports.psec.us
  null origin (0x1db5ee0) - prov:

* rp nymets.com/apc.sports.psec.us
  null origin (0x1db60a0) - prov:

* rp redsox.com/apc.sports.psec.us
  null origin (0x1db6260) - prov:

* rp royals.com/apc.sports.psec.us
  null origin (0x1db6420) - prov:

* rp whitesox.mlb.us/apc.sports.psec.us
  null origin (0x1db6620) - prov:

* rp apc.sports.psec.us/apc.sports.psec.us
  null origin (0x1db6820) - prov:

Configuration updated successfully.

==> /var/log/messages <==
Jun  1 01:32:44 ip-172-31-47-134 trust_router[4673]: Configuration file change settled, attempting to update configuration.
Jun  1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_parse_config_files: Attempting to load revision 1527816757 of './trusts.cfg'.
Jun  1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_parse_config_files: Attempting to load './internal.cfg'.
Jun  1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_print_config: Logging running trust router configuration.
Jun  1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_print_config: Community apc.sports.psec.us:
Jun  1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_print_config:  - Member IdPs:
Jun  1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_print_config:    - @IDP realm: "apc.sports.psec.us"  shared: no  local: yes  AAA servers: apc.qa.painless-security.com:12309  APCs: apc.sports.psec.us
Jun  1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_print_config:    - @IDP realm: "redsox.com"  shared: no  local: yes  AAA servers: 3ac1b3-red-sox.tid.qa.painless-security.com:12309  APCs: apc.sports.psec.us
Jun  1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_print_config:    - @IDP realm: "royals.com"  shared: no  local: yes  AAA servers: 8a0d78-royals.tid.qa.painless-security.com:12309  APCs: apc.sports.psec.us
Jun  1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_print_config:    - @IDP realm: "whitesox.mlb.us"  shared: no  local: yes  AAA servers: 51623a-white-sox.tid.qa.painless-security.com:12309  APCs: apc.sports.psec.us
Jun  1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_print_config:  - Member RPs:
Jun  1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_print_config:    - @RP realm: "mark.littleleague.org"
Jun  1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_print_config:    - @RP realm: "nymets.com"
Jun  1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_print_config:    - @RP realm: "redsox.com"
Jun  1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_print_config:    - @RP realm: "royals.com"
Jun  1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_print_config:    - @RP realm: "whitesox.mlb.us"
Jun  1 01:32:44 ip-172-31-47-134 trust_router[4673]: tr_print_config:    - @RP realm: "apc.sports.psec.us"
Jun  1 01:32:44 ip-172-31-47-134 trust_router[4673]: apc.sports.psec.us | apc.sports.psec.us |  | 0 | tr.qa.painless-security.com:12308 | :-1 | 1 | 1 | 1970-01-01 00:00:00 UTC | 0#012apc.sports.psec.us | redsox.com |  | 0 | tr.qa.painless-security.com:12308 | :-1 | 1 | 1 | 1970-01-01 00:00:00 UTC | 0#012apc.sports.psec.us | royals.com |  | 0 | tr.qa.painless-security.com:12308 | :-1 | 1 | 1 | 1970-01-01 00:00:00 UTC | 0#012apc.sports.psec.us | whitesox.mlb.us |  | 0 | tr.qa.painless-security.com:12308 | :-1 | 1 | 1 | 1970-01-01 00:00:00 UTC | 0
Jun  1 01:32:44 ip-172-31-47-134 trust_router[4673]: >> Membership table start <<#012* idp apc.sports.psec.us/apc.sports.psec.us#012  null origin (0x1db59e0) - prov: #012#012* idp redsox.com/apc.sports.psec.us#012  null origin (0x1db5aa0) - prov: #012#012* idp royals.com/apc.sports.psec.us#012  null origin (0x1db5be0) - prov: #012#012* idp whitesox.mlb.us/apc.sports.psec.us#012  null origin (0x1db5d20) - prov: #012#012* rp mark.littleleague.org/apc.sports.psec.us#012  null origin (0x1db5ee0) - prov: #012#012* rp nymets.com/apc.sports.psec.us#012  null origin (0x1db60a0) - prov: #012#012* rp redsox.com/apc.sports.psec.us#012  null origin (0x1db6260) - prov: #012#012* rp royals.com/apc.sports.psec.us#012  null origin (0x1db6420) - prov: #012#012* rp whitesox.mlb.us/apc.sports.psec.us#012  null origin (0x1db6620) - prov: #012#012* rp apc.sports.psec.us/apc.sports.psec.us#012  null origin (0x1db6820) - prov:
Jun  1 01:32:44 ip-172-31-47-134 trust_router[4673]: Configuration updated successfully.

It looks like this happened after its configuration file was updated. (I'm guessing that the update was effectively nothing but the serial number changing, since I wasn't using the portal at the time.)

jennifer-richards commented 6 years ago

I'm having trouble reproducing this by changing the serial number - I get a proper response even after half a dozen or so reloads.

jennifer-richards commented 6 years ago

Ok, just got it to happen.

jennifer-richards commented 6 years ago

This is happening because the callbacks that handle various monitoring commands pass a pointer to the trps instance that they need to get data from. When a configuration file is reloaded, a new trps is instantiated, but the monitoring handler still gets the old pointer, resulting in unpredictable behavior.

Edit: hmm, that's not quite right, but it's something like that. Getting closer...

jennifer-richards commented 6 years ago

Aha- it's easier than that. After loading a new peer table, none of the peers have had connection attempts until the next connection update time. There was a check in the monitoring message encoder that suppressed setting the "last_connection_time" field if it had not yet been set. The encoder was set to fail if that field was missing, so it failed until an attempt was made.

So to begin with, this was a transient issue (as soon as a connection attempt was made, it should work). In fact, it would have happened upon the first load except that the first connection attempts are made right after startup, so this is not likely to have been seen. The fix is simply to omit that field if it's irrelevant.

We could also make it say "none" or something, but I think it's easier for someone consuming the monitoring JSON message not to have to deal with a field that is sometimes a UTC timestamp and sometimes a differently formatted string. If the field is absent, there have been no attempts.

jennifer-richards commented 6 years ago

Should be fixed by 2cafd9dbafa262d1d29ce0af0590c3aa36b527fe

jennifer-richards commented 6 years ago

I'm closing this. Reopen if it comes up again.