HEPCloud / decisionengine

HEPCloud Decision Engine framework
Apache License 2.0

Decision Engine logging credentials to system logs #702

Closed: namrathaurs closed this issue 3 months ago

namrathaurs commented 3 months ago

Nick reported that the decision engines running on gpde01 and cmsde01 are logging all the AWS credentials (tokens, client ID, client secret, etc.) to /var/log/messages in plain text. The DE version running (on both, I think) was 1.7.5-1...

A sample syslog entry looks like this:

Mar 25 16:21:07 gpde01 decisionengine: {'AssumedRoleUser': {'Arn': 'arn:aws:sts::REDACTED:assumed-role/CalculateBill/roleSwitchSession',
Mar 25 16:21:07 gpde01 decisionengine: 'AssumedRoleId': 'REDACTED:roleSwitchSession'},
Mar 25 16:21:07 gpde01 decisionengine: 'Credentials': {'AccessKeyId': 'REDACTED',
Mar 25 16:21:07 gpde01 decisionengine: 'Expiration': datetime.datetime(2024, 3, 25, 21, 19, 40, tzinfo=tzutc()),
Mar 25 16:21:07 gpde01 decisionengine: 'SecretAccessKey': 'REDACTED',
Mar 25 16:21:07 gpde01 decisionengine: 'SessionToken': 'REDACTED REALLY LONG TOKEN LINE'},
Mar 25 16:21:07 gpde01 decisionengine: 'ResponseMetadata': {'HTTPHeaders': {'content-length': '1078',
Mar 25 16:21:07 gpde01 decisionengine: 'content-type': 'text/xml',
Mar 25 16:21:07 gpde01 decisionengine: 'date': 'Mon, 25 Mar 2024 20:19:40 GMT',
Mar 25 16:21:07 gpde01 decisionengine: 'x-amzn-requestid': 'REDACTED JUST IN CASE'},
Mar 25 16:21:07 gpde01 decisionengine: 'HTTPStatusCode': 200,
Mar 25 16:21:07 gpde01 decisionengine: 'RequestId': 'REDACTED AGAIN JUST IN CASE',
Mar 25 16:21:07 gpde01 decisionengine: 'RetryAttempts': 0}}
namrathaurs commented 3 months ago

Investigation/Resolution Notes:

Nick was not sure whether this was normal behavior or a consequence of debug mode being enabled. He tried disabling debug mode in /etc/decisionengine/decision_engine.jsonnet by switching global_channel_log_level from DEBUG to INFO. However, DE continued to write credentials to syslog.
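One plausible reason a log-level change would make no difference is that a plain print or pprint call writes to stdout and never consults the logging configuration. The following is a minimal sketch of that behavior, not DE code; the logger name `demo_channel` and the sample response are made up for illustration.

```python
import contextlib
import io
import logging
import pprint

# Hypothetical logger standing in for a channel logger set to INFO.
logger = logging.getLogger("demo_channel")
logger.setLevel(logging.INFO)
logger.propagate = False
log_stream = io.StringIO()
logger.addHandler(logging.StreamHandler(log_stream))

# Made-up response; the value is not a real secret.
response = {"Credentials": {"SecretAccessKey": "not-a-real-secret"}}

stdout_capture = io.StringIO()
with contextlib.redirect_stdout(stdout_capture):
    # Filtered out, because DEBUG is below the logger's INFO level.
    logger.debug("response: %s", response)
    # Written to stdout unconditionally; no log level applies here.
    pprint.pprint(response)

logged_text = log_stream.getvalue()
printed_text = stdout_capture.getvalue()
```

Here `logged_text` stays empty while `printed_text` contains the response, so anything that captures the process's stdout would still see it.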

Steve identified the offending line of code as line 136 of AWSBillAnalysis.py:

pprint.pprint(response)

This file is part of neither the AWS code base nor the DE code base; it belongs to an external package, bill-calculator-hep, which is developed and maintained at Fermilab and was last modified around four years ago. Steve added that, because it is an external package, it knows nothing about the HEPCloud logging structure and does not use any of the logging facilities used by DE, which is why changing the DE log level had no effect. The appropriate fix is therefore simply to comment out the print statement. I checked the rest of the code to see whether commenting out that statement would affect anything later on, and it does not.
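If the response ever needs to be inspected again for debugging, an alternative to a bare print would be to mask the sensitive fields and send the result through the logging module. This is only a sketch of that idea, not the fix that was applied here and not code from bill-calculator-hep; the `redact` helper, the logger name, and the sample values are hypothetical.

```python
import logging

# Hypothetical logger name, for illustration only.
logger = logging.getLogger("bill_calculator_demo")

# Key names taken from the syslog sample in this issue.
SENSITIVE_KEYS = {"AccessKeyId", "SecretAccessKey", "SessionToken"}


def redact(value):
    """Return a copy of value with sensitive dict entries masked."""
    if isinstance(value, dict):
        return {
            key: "***REDACTED***" if key in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, (list, tuple)):
        return type(value)(redact(item) for item in value)
    return value


# Made-up response with the same shape as the logged one.
sample_response = {
    "AssumedRoleUser": {"Arn": "arn:aws:sts::000000000000:assumed-role/Example/session"},
    "Credentials": {
        "AccessKeyId": "fake-key-id",
        "SecretAccessKey": "fake-secret",
        "SessionToken": "fake-token",
    },
    "ResponseMetadata": {"HTTPStatusCode": 200},
}

safe_response = redact(sample_response)
# Emitted only when DEBUG is enabled, and never with the raw secrets.
logger.debug("assume-role response: %s", safe_response)
```

The helper builds a new structure rather than modifying its argument, so the original response can still be used to obtain the credentials afterwards.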

Steve also mentioned that this issue was first reported in https://github.com/HEPCloud/decisionengine/issues/635 against DE 2.x, with the note that it affects both 2.x and 1.7.x, since both versions rely on the same external module.

I did not know how frequently the external module, via DE, writes messages to /var/log/messages, so it would be hard to tell by monitoring alone whether the behavior changed after applying the suggested fix. Since I do not have admin privileges on gpde01, I relayed the above information to Nick so that he could make the change on gpde01 and run a quick test to see whether it at least stops the credentials from being written out. Both Nick and I assumed that the print statement was left over from earlier debugging. Nick confirmed that commenting out the print statement worked and said he would apply the same change to the rest of the running Decision Engines.
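One way to check whether credentials are still appearing after the change is to scan the log for the key names seen in the sample entry. The sketch below assumes the entries keep the pprint-style format shown in this issue; the helper name `find_credential_lines` is hypothetical, and reading /var/log/messages itself would require suitable permissions.

```python
import re

# Matches the quoted key names from the sample syslog entry.
CREDENTIAL_MARKERS = re.compile(r"'(AccessKeyId|SecretAccessKey|SessionToken)'\s*:")


def find_credential_lines(lines):
    """Return (line_number, text) pairs for lines that name a credential field."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if CREDENTIAL_MARKERS.search(line)
    ]


# Two lines modeled on the sample entry, with values already redacted.
sample_lines = [
    "Mar 25 16:21:07 gpde01 decisionengine: 'Credentials': {'AccessKeyId': 'REDACTED',\n",
    "Mar 25 16:21:07 gpde01 decisionengine: 'HTTPStatusCode': 200,\n",
]
hits = find_credential_lines(sample_lines)
```

To scan a real log, the same function can be given an open file object, for example `find_credential_lines(open(path))` with `path` set to the log location. Comparing the timestamps of the hits against the time the fix was applied would show whether new entries are still being written.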