mozilla / webcompat-dashboard

Web Compatibility Knowledge Base Next Generation Dashboard
Mozilla Public License 2.0
2 stars 2 forks source link

Add a bit of logging to everything. #17

Closed denschub closed 6 months ago

denschub commented 6 months ago

This PR adds logging. There are three main components:

1I added winston, which appears to be the de-facto standard logging library for a lot of cases to emit JSONL logs for everything. I'm personally not a huge fan of JSONL logs, but the GCP Log Explorer can parse that and offer filters. It's also what the nginx/openresty container emit, so this matches the existing setup. A benefit of this is that we can use the filter features int the Log explorer more effectively.

Using morgan, a project maintained by the expressjs team, the server now emits a log line for every successful request (if LOG_LEVEL >= info). There is a bit of boilerplate code which doesn't look amazing, but the motivation here is to match the JSON payload sent by the reverse proxy, so we can use the same filters. I hope this helps figuring out why heartbeat requests fail sometimes. An example log line looks like this:

{"level":"info","log_type":"access","message":"","remote_addr":"::ffff:127.0.0.1","request":"GET /__heartbeat__ HTTP/1.1","request_time":"2.084","status":"200","timestamp":"2024-04-10T11:06:36.815Z","user_agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:126.0) Gecko/20100101 Firefox/126.0"}

I added more verbose logging in the handleUserReports function. If emitted (LOG_LEVEL >= verbose), we receive a more detailed step-by-step logging output during the request, which hopefully helps us narrow down why the requests are so slow. An example of how this looks:

{"handler":"handleUserReports","level":"verbose","message":"Entered handler","timestamp":"2024-04-10T11:06:52.980Z"}
{"handler":"handleUserReports","level":"verbose","message":"Connecting to BigQuery...","timestamp":"2024-04-10T11:06:52.982Z"}
{"handler":"handleUserReports","level":"verbose","message":"Starting queries...","timestamp":"2024-04-10T11:06:52.985Z"}
{"handler":"handleUserReports","level":"verbose","message":"Received 15480 user reports and 317 URL patterns.","timestamp":"2024-04-10T11:07:01.810Z"}
{"handler":"handleUserReports","level":"verbose","message":"Pre-processing URL patterns...","timestamp":"2024-04-10T11:07:01.810Z"}
{"handler":"handleUserReports","level":"verbose","message":"Pre-processing user reports...","timestamp":"2024-04-10T11:07:01.810Z"}
{"handler":"handleUserReports","level":"verbose","message":"Grouping reports by root domain...","timestamp":"2024-04-10T11:07:02.051Z"}
{"handler":"handleUserReports","level":"verbose","message":"Partitioning reports into known/unknown...","timestamp":"2024-04-10T11:07:06.024Z"}
{"handler":"handleUserReports","level":"verbose","message":"Sorting by the total number of reports per domain...","timestamp":"2024-04-10T11:07:06.031Z"}
{"handler":"handleUserReports","level":"verbose","message":"Writing response...","timestamp":"2024-04-10T11:07:06.033Z"}
{"handler":"handleUserReports","level":"verbose","message":"Handler done.","timestamp":"2024-04-10T11:07:06.068Z"}
{"level":"info","log_type":"access","message":"","remote_addr":"::ffff:127.0.0.1","request":"GET /api/user_reports.json?from=2024-03-30&to=2024-04-06 HTTP/1.1","request_time":"13083.070","status":"200","timestamp":"2024-04-10T11:07:06.122Z","user_agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:126.0) Gecko/20100101 Firefox/126.0"}

I thought about adding a request-uuid so we can correlate the log entries with each other. That'd be easy to add, but it adds more boilerplate code, and the app isn't high-traffic enough for that to be a requirement, I think.

The default LOG_LEVEL is verbose, so we don't need to change any deployment settings to get what we want.

r? @moztcampbell

denschub commented 6 months ago

I added a second commit that updates all dependencies, because that's never a bad idea. We should set up dependabot or something similar eventually, but for now, this is good enough. The change to tsconfig.json in that commit is from this change in prettier. If you prefer I adjust auto-formatting in a third commit (or extract the update-commits into a follow-up PR), let me know.