Open bedeho opened 6 months ago
@ignazio-bovo @kdembler don't we already partially do that?
@ikprk perhaps we could collect a set of filters into a dashboard to cover off some of the errors?
Yeah, I think the combination of Sentry + the metrics collection about dist/storage built into Atlas covers most of those
Background
An operator currently has relatively poor visibility into the quality of the UX in their Atlas instance. Some key metrics, specifically load times of various screens and assets and error rates, have a very significant impact on UX. There may be substantial systematic variation in which types of users experience the UX problems captured by these metrics, for example by type of device, browser, location, time of day, or which infra providers are involved. Greater visibility into these issues allows for detecting problems, measuring the improvement that results from various remedies, and doing deeper technical or operational diagnostics.
Proposal
Introduce a solution which gives the operator excellent visibility into such key metrics, covering both efficient data collection at scale and visualization. This solution should largely rely on an industry-standard SaaS solution for the data collection and visualization; most of the work here should be identifying the appropriate solution and then doing the integration work. Here are some requirements: