Open gh-skatt opened 3 years ago
Hi @gh-skatt and thanks for raising this.
Looking into the error path I found that it is generated from the Vault API and includes newlines which I believe is causing the issue here. Are there any additional lines below this message which could be associated to this top level log line?
In order to fix this I believe one solution would be to replace all the newline occurrences with another character, to ensure a nicer output experience.
If there are any additional lines associated with this event they are not picked up by the splunk forwarder, but I find that unlikely.
Just before the event in question, I see the vault agent process has logged this:
Aug 11 11:50:29 hostname vault: 2021-08-11T11:50:29.122+0200 [INFO] cache.apiproxy: forwarding request: method=POST path=/v1/auth/token/lookup
Aug 11 11:50:29 hostname vault: 2021-08-11T11:50:29.122+0200 [INFO] cache: received request: method=POST path=/v1/auth/token/lookup
Aug 11 11:50:29 hostname vault: 2021-08-11T11:50:29.120+0200 [INFO] cache.apiproxy: forwarding request: method=GET path=/v1/auth/token/lookup-self
Aug 11 11:50:29 hostname vault: 2021-08-11T11:50:29.120+0200 [INFO] cache: received request: method=GET path=/v1/auth/token/lookup-self
Aug 11 11:50:28 hostname vault: 2021-08-11T11:50:28.571+0200 [INFO] cache.apiproxy: forwarding request: method=GET path=/v1/sys/seal-status
Aug 11 11:50:28 hostname vault: 2021-08-11T11:50:28.571+0200 [INFO] cache: received request: method=GET path=/v1/sys/seal-status
I suspect the fix here will involve a new agent logging config parameter that allows forcing all log lines to have their internal newlines escaped. We'll need to check with Consul and Vault for prior art here on naming/behavior, but I suspect this is what's going on.
Nomad version
1.1.3
Operating system and Environment details
CentOS 7
Issue
The "failed to validate self token" error message is logged without proper quotation marks around the error message "failed to lookup Vault periodic token: Error making API request.".
The a closing quotation mark is missing, which confuses log system field extraction systems.
Example:
Reproduction steps
Expected Result
We expect there to be a double quotation mark at the end of the logged message, like this:
This should give log aggregators a better chance at parsing this event correctly, and then extracting the error message.
Actual Result
Job file (if appropriate)
Nomad Server logs (if appropriate)
Nomad Client logs (if appropriate)