matrix-org / sygnal

Sygnal: reference Push Gateway for Matrix
Apache License 2.0
160 stars 145 forks source link

Simplify request log format #153

Open abeluck opened 3 years ago

abeluck commented 3 years ago

Sygnal is logging HTTP requests using the Twisted request logger. But it is wrapping it in a format string that muddies the logs for no good reason:

https://github.com/matrix-org/sygnal/blob/53015c70785dea441606d1504667d71c5b99eabc/sygnal/http.py#L326

This results in log events like:

2020-09-21 12:25:57,734 - 2986 - INFO - sygnal.access - Handled request: "-" - - [21/Sep/2020:12:25:57 +0000] "GET / HTTP/1.1" 404 153 "-" "ELB-HealthChecker/2.0"

All the interesting request information is inside a string. Also the timestamp is repeated unnecessarily.

chagai95 commented 3 years ago

is this related?

https://stackoverflow.com/questions/18670252/how-to-suppress-noisy-factory-started-stopped-log-messages-from-twisted

aitorres commented 3 years ago

Looking into this issue to see if I can help. It does seem to be related to twisted logging.

This line right here: https://github.com/matrix-org/sygnal/blob/53015c70785dea441606d1504667d71c5b99eabc/sygnal/http.py#L325

The log formatter that is used ends up being an instance of twisted.web.http.combinedLogFormatter which returns a string with the format:

'"%(ip)s" - - %(timestamp)s "%(method)s %(uri)s %(protocol)s" %(code)d %(length)s "%(referrer)s" "%(agent)s"'

If all the information but the repeated timestamp should stay, we could strip it with a regex or something (if it's always in brackets).

Besides removing the duplicate timestamp, what else should be done? Removing the "Handled request:" part of the string?