membermatters / MemberMatters

An open source membership, access and payments portal for makerspaces and community groups.
https://membermatters.org
MIT License
40 stars 23 forks

Instrument the code with OpenTelemetry to improve monitoring abilities #256

Open proffalken opened 2 months ago

proffalken commented 2 months ago

Is your feature request related to a problem? Please describe.

OpenTelemetry is rapidly becoming the standard for sending metrics, logs, and traces to platforms such as Grafana, Datadog, Honeycomb.io, and many more.

To have confidence that the platform is performing as expected, it would be good to instrument the existing code with OpenTelemetry. It removes vendor lock-in around monitoring and observability, whilst ensuring that administrators of the platform can see everything that is going on inside the system.

Describe the solution you'd like

Implement instrumentation, on the backend at least, for major events such as user sign-up, door activation, and interlock communication.

OpenTelemetry defaults to a no-op, so even with the instrumentation in the code base, existing installations would not be affected. Where it is enabled, the logging/metrics would be added alongside the existing outputs rather than replacing them.
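As a rough illustration of the idea above, here is a minimal sketch of wrapping one of those events in a span using only the OpenTelemetry API package (`opentelemetry-api`). The function `activate_door`, the tracer name, the span name, and the attribute keys are all hypothetical and not taken from the MemberMatters code base. The `ImportError` fallback is only there so the sketch runs even where the package is not installed.

```python
from contextlib import contextmanager

try:
    # Only the API package is imported; no SDK or exporter is configured,
    # so the tracer below produces non-recording (no-op) spans.
    from opentelemetry import trace

    tracer = trace.get_tracer("membermatters.access")  # hypothetical name
except ImportError:
    # Stand-in used only when opentelemetry-api is not installed.
    class _FallbackSpan:
        def set_attribute(self, key, value):
            pass

    class _FallbackTracer:
        @contextmanager
        def start_as_current_span(self, name):
            yield _FallbackSpan()

    tracer = _FallbackTracer()


def activate_door(member_id: int, door_name: str) -> bool:
    """Hypothetical stand-in for a door-activation handler."""
    with tracer.start_as_current_span("door.activate") as span:
        # Attribute keys are illustrative, not an agreed convention.
        span.set_attribute("member.id", member_id)
        span.set_attribute("door.name", door_name)
        granted = member_id > 0  # placeholder for the real access check
        span.set_attribute("access.granted", granted)
        return granted
```

With no SDK configured, the handler behaves exactly as it would without the tracing calls; spans would only be exported once an installation opts in by configuring an SDK and exporter.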

Describe alternatives you've considered

We could continue down the Prometheus route, but that only covers metrics, not logs and traces. Those can be incredibly helpful when troubleshooting a system split across multiple services, which seems to be the way MemberMatters is headed.

Additional context

Full disclosure: I work for Grafana. However, OpenTelemetry would ensure that observability of MemberMatters remains vendor agnostic.

This is something I'm happy to work on and contribute to for the backend. Frontend observability within OpenTelemetry is not well advanced, however, so we would need to decide whether to leave the frontend side of things for now or look at a vendor-specific option such as Grafana Faro, which is open source (Apache License) but is tied to Grafana rather than being compatible with other vendors in the way that OpenTelemetry is.

proffalken commented 2 months ago

I've just remembered that there's support for Sentry in MemberMatters already.

Sentry are one of the few organisations who aren't involved with OpenTelemetry, but I'm not suggesting we should get rid of Sentry support here, just augment it with a platform that is more open :)

jabelone commented 2 months ago

Funnily enough, I’ve just gone through adding OpenTelemetry to a C# codebase for work. I think it’s a good idea! I also think Sentry and OpenTelemetry are best suited to different things. Sentry is great for catching, triaging and tracking unhandled errors, while OpenTelemetry is great for metrics etc. We also already have a Prometheus endpoint thanks to a Django add-on, but I don’t think it’s well (if at all) documented, and it’s only for Django-specific things.

Here are some of the metrics it currently exports: [image: IMG_6218]

I think I had to disable the Sentry feature in a previous release because it was causing issues with the configuration, and I definitely haven’t checked it in a long time.

proffalken commented 2 months ago

Oh awesome! I made a start on this last night. I'll continue on it over the next few days and see where it goes.

I'm giving a talk at Monitorama this year specifically about getting up and running with OTEL, so I may well use MemberMatters as the demo app rather than the robot arm I've been trying to design and build in my spare time. I'll see how it goes!

proffalken commented 2 months ago

Initial traces flowing into Grafana Cloud Tracing: [image]