Open vcschapp opened 1 month ago
I think this proposal is great, thanks for taking the time to write it down. I think it would also be interesting if you would present it in the monthly meeting.
A few notes worth looking into and taking into consideration:
I definitely agree that 2-3 metrics are what we should aim for, but we might consider making sure we raise enough events with meaningful data so that someone who wants to measure something other than the defined 2-3 can do that on their own and make sure it is kept in a good state.
CC: @ibesora
This is great. I agree with @HarelM you should present it in the monthly meeting if possible. Some notes on my end:
The things that get measured are the things that improve.
put it on a sticker! +1000
Heavily agree on raising "lots" of events even if we decide to only check two or three metrics. There will be different things to measure needed for different use cases.
agree with @ibesora here
overall, super supportive of getting this in! lmk how we can help, if at all
Anyone interested in working on this together? I am not super familiar with the gl js codebase, but happy to create a draft PR in the hope of eliciting better ideas.
Took a stab: https://github.com/maplibre/maplibre-gl-js/pull/4901.
This is a request for comments to start a discussion about some possible features. Let me know if it's not the right place, as I'm happy to move this issue or create a new one elsewhere.
Summary
This is a proposal to instrument MapLibre GL JS with a small set (no more than 2–3) of high-level performance metrics that can be used to drive sustained improvements in initial map load performance and interactivity. A decision framework for selecting metrics to track is given and three metrics are suggested based on the framework: underutilized network time, unrendered resource time, and cancelled network time.
The decision framework and the suggested metrics are independent proposals. For example, the decision framework may be more or less right but there may be better alternatives to the proposed metrics that more closely meet the framework goals; or the decision framework may require refinement.
Motivation
The things that get measured are the things that improve.
We want to improve the end-user experienced performance of interactive maps on MapLibre, including initial map load time and the time to completely respond to end user interactions like pans and zooms. To improve it, we need to measure it. To be more specific, we need to measure performance in a way that can feed mechanisms that drive performance improvements and prevent regressions.
Decision Framework
The following tenets are proposed for selecting metrics to instrument into MapLibre:
Metrics
The following metrics are proposed to be instrumented into MapLibre GL JS:
Implementation Notes
The envisioned implementation is an opt-in observer pattern where an interested observer can install a metrics hook into the MapLibre GL JS client to receive periodic metric observations as they become available. The observer can choose what to do with the metrics at that point.
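To make the observer idea concrete, here is a minimal sketch of what an opt-in hook could look like. It is only an illustration: `MetricsHub`, `MetricObservation`, `setObserver`, `record`, and `flush` are hypothetical names invented for this example and are not part of the MapLibre GL JS API. Only the three metric names come from the proposal.

```typescript
// Hypothetical sketch: none of these types or methods exist in MapLibre GL JS.
type MetricName =
    | 'underutilizedNetworkTime'
    | 'unrenderedResourceTime'
    | 'cancelledNetworkTime';

interface MetricObservation {
    name: MetricName;
    valueMs: number;      // duration attributed to the metric, in milliseconds
    timestampMs: number;  // when the observation was recorded
}

type MetricsObserver = (batch: MetricObservation[]) => void;

class MetricsHub {
    private observer: MetricsObserver | null = null;
    private pending: MetricObservation[] = [];

    // Opt-in: nothing is buffered until an observer is installed.
    setObserver(observer: MetricsObserver | null): void {
        this.observer = observer;
        if (observer === null) this.pending = [];
    }

    // Called from instrumented code paths; a no-op when nobody is listening.
    record(name: MetricName, valueMs: number, timestampMs: number = Date.now()): void {
        if (this.observer === null) return;
        this.pending.push({name, valueMs, timestampMs});
    }

    // Delivers everything buffered so far as one batch (the "periodic" part).
    flush(): void {
        if (this.observer === null || this.pending.length === 0) return;
        const batch = this.pending;
        this.pending = [];
        this.observer(batch);
    }
}

// Usage: observations recorded before opting in are dropped.
const hub = new MetricsHub();
const received: MetricObservation[] = [];
hub.record('cancelledNetworkTime', 12);
hub.setObserver((batch) => { received.push(...batch); });
hub.record('unrenderedResourceTime', 40);
hub.flush();
```

Batching in `flush` is one possible way to keep the per-event cost low when no observer is installed. How and when a real implementation would flush is left open here.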
One example usage that is explicitly envisioned is a performance test or performance canary installed in a CI/CD pipeline. The performance canary would run a large program of repeatable map interactions on MapLibre instances with a metrics observation hook installed and publish the metrics to an appropriate metrics repository. The pipeline would have monitors set up to block the pipeline if the metrics cross specific alarm thresholds, and the metrics would be aggregated and fed into dashboards that are regularly monitored by humans who have an ownership stake in maps performance.
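As a rough sketch of the gating step in such a canary, the check below aggregates collected samples per metric and reports any metric whose 95th percentile exceeds its alarm threshold. The function names, the choice of p95, and the threshold values are all assumptions made for illustration, and nothing here is prescribed by the proposal.

```typescript
// Hypothetical canary gate; names and thresholds are illustrative only.
interface Sample {
    name: string;
    valueMs: number;
}

// Nearest-rank percentile of a non-empty list, with p in (0, 100].
function percentile(values: number[], p: number): number {
    const sorted = [...values].sort((a, b) => a - b);
    const rank = Math.ceil((p / 100) * sorted.length);
    return sorted[Math.max(rank, 1) - 1];
}

// Returns one message per metric whose p95 exceeds its alarm threshold.
function findViolations(samples: Sample[], thresholdsMs: Record<string, number>): string[] {
    // Group sample values by metric name.
    const byName: Record<string, number[]> = {};
    for (const s of samples) {
        if (byName[s.name] === undefined) byName[s.name] = [];
        byName[s.name].push(s.valueMs);
    }
    const violations: string[] = [];
    for (const name of Object.keys(byName)) {
        const limit = thresholdsMs[name];
        if (limit === undefined) continue; // collected but not gated
        const p95 = percentile(byName[name], 95);
        if (p95 > limit) violations.push(`${name}: p95 ${p95}ms > ${limit}ms`);
    }
    return violations;
}

// Example run with made-up numbers.
const violations = findViolations(
    [
        {name: 'unrenderedResourceTime', valueMs: 100},
        {name: 'unrenderedResourceTime', valueMs: 200},
        {name: 'unrenderedResourceTime', valueMs: 900},
        {name: 'cancelledNetworkTime', valueMs: 10},
        {name: 'cancelledNetworkTime', valueMs: 20},
    ],
    {unrenderedResourceTime: 500, cancelledNetworkTime: 50},
);
```

A pipeline step could fail the build whenever `violations` is non-empty, while the raw samples are still published to the metrics repository for dashboards.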
Footnotes:
¹ It doesn’t matter whether a request actually goes over the network or is served from a local cache such as the browser’s cache. What is important is that MapLibre has access to the information needed to make the request but hasn’t actually made the request yet.
² Where “rendered to the end user” means submitted to whatever system is responsible for graphical rendering. So, e.g., it could mean “sent to the GPU”.
³ A resource can either be rendered or can stop needing to be rendered because MapLibre decides it’s obsolete, like a tile for a zoom level the user has since left.