gordon-cs / gordon-360-api

The 360° Gordon Experience

Introduce robust logging technique #1036

Closed EjPlatzer closed 4 months ago

EjPlatzer commented 4 months ago

I have added Serilog for "Simple .NET logging with fully-structured events". With a relatively simple setup, Serilog enables us to write structured log data to the console (for development and live debugging) as well as to files in JSON format. Since the logs are structured, we can ingest the JSON files easily for analysis (such as what is mentioned in #940).

I have currently configured Serilog to log all events at Debug level or higher produced in our code (currently only the one I added as a demo), as well as any warnings, errors, or critical failures produced by Microsoft's frameworks. It is trivial to update these settings, and we can make them more nuanced as time goes on, including filtering logs to certain sinks based on the properties of each log.

The current sinks are:

  1. The Console, which captures all logs of the minimum level. This is useful for development and debugging. We can choose to turn it off in production if we desire.
  2. An error JSON log file, which will only capture Error and Fatal level events. This is useful for quick diagnosis of issues.
  3. An info JSON log file, which broadly captures logs in a structured format. Currently, a new file is created each day, or each time the current file reaches 1 GB in size. Only the 14 most recent log files are kept, to prevent logs from filling our server's storage.
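
For readers who have not used Serilog, the levels and sinks described above could be expressed roughly as follows. This is a minimal sketch rather than the code merged in this PR: the file paths are hypothetical, the level applied to the info file is an assumption (left unrestricted here), and it assumes the `Serilog`, `Serilog.Sinks.Console`, and `Serilog.Sinks.File` packages are installed.

```csharp
using Serilog;
using Serilog.Events;
using Serilog.Formatting.Json;

// Sketch only: paths and some options are assumptions, not taken from the PR.
Log.Logger = new LoggerConfiguration()
    // Debug and above for events from our own code...
    .MinimumLevel.Debug()
    // ...but only Warning and above from Microsoft's frameworks.
    .MinimumLevel.Override("Microsoft", LogEventLevel.Warning)
    // Sink 1: the console, receiving everything that passes the minimum level.
    .WriteTo.Console()
    // Sink 2: a JSON file restricted to Error and Fatal events.
    .WriteTo.File(
        new JsonFormatter(),
        "Logs/error.json", // hypothetical path
        restrictedToMinimumLevel: LogEventLevel.Error)
    // Sink 3: a broad JSON file that rolls daily or at 1 GB, keeping 14 files.
    .WriteTo.File(
        new JsonFormatter(),
        "Logs/info-.json", // hypothetical path; Serilog inserts the date before the extension
        rollingInterval: RollingInterval.Day,
        fileSizeLimitBytes: 1L * 1024 * 1024 * 1024,
        rollOnFileSizeLimit: true,
        retainedFileCountLimit: 14)
    .CreateLogger();

Log.Debug("Logger configured");
Log.CloseAndFlush();
```

The same settings can also be supplied from `appsettings.json` through the `Serilog.Settings.Configuration` package, which would let the levels be changed without recompiling.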

Possibly Closes #940, since this enables us not only to add logs wherever they are useful in our source code, but also to configure any kind of logging middleware (in addition to or replacing Serilog's default request logging middleware). Also, Serilog supports writing logs to SQL Server if we decide we want that.
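
As a rough illustration of the default request logging middleware mentioned above, the sketch below wires it into a minimal ASP.NET Core app. It assumes the `Serilog.AspNetCore` package and the .NET 6+ minimal hosting model; this repository may configure its pipeline differently (for example in a `Startup` class), and the endpoint shown is hypothetical.

```csharp
using Serilog;

var builder = WebApplication.CreateBuilder(args);

// Route the host's logging through Serilog (extension from Serilog.AspNetCore).
builder.Host.UseSerilog((context, config) => config.WriteTo.Console());

var app = builder.Build();

// Emits one summary event per HTTP request, including the method,
// path, status code, and elapsed time.
app.UseSerilogRequestLogging();

// Hypothetical endpoint, present only so the sketch has something to serve.
app.MapGet("/", () => "ok");

app.Run();
```

Placing `UseSerilogRequestLogging()` early in the pipeline means requests handled by later middleware and controllers are included in the summary events.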

russtuck commented 4 months ago

Thank you for doing this: choosing a package and getting us started using it is a critical first step. However, it's only the first step.

We should log all API queries, as your initial comment suggests, because I think they will be valuable context for machine learning, which thrives on rich data. (There is a little more detail below.)

Logs need to be saved indefinitely to be useful for machine learning. Short time frames won't have enough data, won't see patterns in individual usage, won't allow repeatable training to test ML approaches, and will have almost no data in the summer when we have time to work on this.
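
One possible way to keep logs indefinitely with the file sink is to remove the retention cap: in `Serilog.Sinks.File`, passing `null` for `retainedFileCountLimit` disables the limit. The sketch below assumes that approach; the path is hypothetical, and disk usage would then have to be managed some other way (for example by archiving old files).

```csharp
using Serilog;
using Serilog.Formatting.Json;

// Sketch only: the path is an assumption, not taken from the PR.
Log.Logger = new LoggerConfiguration()
    .WriteTo.File(
        new JsonFormatter(),
        "Logs/info-.json", // hypothetical path
        rollingInterval: RollingInterval.Day,
        rollOnFileSizeLimit: true,
        // null disables the retention limit, so no log files are deleted.
        retainedFileCountLimit: null)
    .CreateLogger();

Log.Information("Retention limit disabled");
Log.CloseAndFlush();
```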

(This demo only logs advanced people search queries, which aren't the most interesting. The most interesting search queries for logging are the "quick search", at Controllers/AccountsController.cs line 54, and this isn't logged. We should at least log all search queries (so also line 71). But again, we really should log all API requests. Sorry for the primitive line number references. It appears I can't add per-line comments to a merged PR.)
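
To make the suggestion about search queries concrete, here is a minimal sketch of a structured log event for a search. It is not the code in `AccountsController.cs`: the variable and property names are hypothetical, and a console sink stands in for the real configuration. It assumes the `Serilog` and `Serilog.Sinks.Console` packages.

```csharp
using Serilog;

Log.Logger = new LoggerConfiguration()
    .WriteTo.Console()
    .CreateLogger();

// Hypothetical value standing in for what a quick search endpoint would receive.
var searchString = "jane do";

// The named placeholder becomes a property on the event, so a JSON sink
// would store SearchString as its own field rather than only inside the message text.
Log.Information("Quick search for {SearchString}", searchString);

Log.CloseAndFlush();
```

Because the query is captured as a named property, later analysis can filter or aggregate on it without parsing message strings.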