medic / care-teams

For Product Management
0 stars 0 forks source link

Write high level SQL queries to analyze Apdex scores #3

Closed michaelkohn closed 3 months ago

michaelkohn commented 5 months ago

Success Criteria

Benmuiruri commented 4 months ago

Hi @michaelkohn here is the sql query I cam up with for getting the apdex per page and for the whole application.

Please review it and let me know if it's working as expected.

cc @latin-panda

Thanks

apdex.txt

michaelkohn commented 4 months ago

Thanks @Benmuiruri, good start, I think this is a good high level view. As a reminder... we are generating apdex scores so we can determine what area of the app is slow and we should focus on. Recognizing our context (projects upgrade infrequently and it's hard to get telemetry data from them), we have to make sure that the telemetry updates we deploy right now will give us the information we need to make decisions, because we won't have the luxury of being able to iterate on what we deploy or go back and ask projects for data again.

With that in mind, do you think if we received this output from a project that we would have enough to go on to decide where we should focus our development time to speed things up? Or be able to look at this and identify points in time where a slowdown started to occur? What would you want to see to be able to make that decision?

One recommendation... I think at a minimum we need to include each of the journeys we highlighted in https://github.com/medic/care-teams/issues/1

Screenshot 2024-02-08 at 3 49 27 PM

Benmuiruri commented 4 months ago

Hi @michaelkohn here are two updated queries for your review. detailed_apdex.txt simple_adpex.txt

michaelkohn commented 4 months ago

I put the query into a KlipFolio dashboard so more people can see it. It would also be beneficial to have a draft PR for the queries so it's easier to comment on the actual SQL.

Here are a couple comments....

detailed_apdex.txt

  1. I don't think those high level "Messages, Reports, Targets, Tasks" events are actionable, but I may be wrong. How were you expecting to interpret/use those?
  2. I also don't think it's useful to combine the site level (Overall Apdex) and the detail stuff in one query, I'd recommend having two separate queries, one that is just the Overall Apdex and the other which is the details (basically the contents of your "simple_apdex" query).

simple_apdex.txt

  1. There is a bunch of old data (ie keys whose names have changed), is there a way to get rid of that old data? It is much more difficult to see what we are working with / QA with all that in there.
  2. I think the query would be much easier to read and understand if you just queried from useview_telemetry_metrics instead of couchdb_users_meta. Was there a specific reason you queried from couchdb_users_meta? Is there something missing from useview_telemetry_metrics?
  3. One thing that might be helpful is to add a column which is the event_category from your other query. That way you can order by the event_category to keep everything from the same tab close to each other in the results set.
Benmuiruri commented 4 months ago

Hi @michaelkohn ,

detailed_apdex.txt

  1. The high level "Messages, Reports, Targets, Tasks" events were mainly for show, I will delete them.
  2. I will seperate the two queries for low-level and the overall apdex.

simple_apdex.txt

  1. one way to deal with the old data is to adjust the query to pick data from 12 Feb when we started the wider testing with the specific keys we are interested in.
  2. I didn't know about the useview_telemetry_metrics thanks I will use that instead.
  3. Thank you for the suggestion, I will add the event_category

In which Repo should I open the draft PR and liink the updated query? Is it in the care-teams ?

michaelkohn commented 4 months ago

one way to deal with the old data is to adjust the query to pick data from 12 Feb when we started the wider testing with the specific keys we are interested in.

Yep, and that will be a lot easier to do once you switch it to use useview_telemetry_metrics.

In which Repo should I open the draft PR and liink the updated query?

I don't know for sure, you'll need to check with someone else, but the other telemetry assets are here so probably somewhere around there.

garethbowen commented 4 months ago

@Benmuiruri I agree with Michael, let's put the queries in cht-couch2pg repo to keep them with similar queries.

latin-panda commented 4 months ago

@Benmuiruri is this still in progress? If not, please close it

Benmuiruri commented 4 months ago

Hi @garethbowen looking at the files in the suggested folder, the look like migration files that involve schema changes and data migrations.

I'd love your input, do I create a folder and add likelibs/useview_telemetry_metrics/queries then add the sql file there ?

Thanks.

garethbowen commented 4 months ago

Sorry, it's been a long time since I was familiar with that code so I'm not sure. Also beware that anything that drops existing views is extremely destructive (views outside this repo have been built and the cascading drop removes them all). I don't think this will be an issue for you as it's a whole new concept, but something to be aware of.

@1yuv I know you've worked here recently. Can you please help Ben contribute queries to cht-couch2pg?

michaelkohn commented 3 months ago

@Benmuiruri please reach out to @1yuv directly for advice on this

Benmuiruri commented 3 months ago

On it, I will take it up with @1yuv on Slack.

1yuv commented 3 months ago

Hi @Benmuiruri, I see that these queries are just plain query. Do you know if you plan to roll out these queries as view or materialized views?

In the couch2pg, we try to create minimum useful objects that can be beneficials for users to start their analytics. If you'd like to store these queries on couch2pg and have them available for everyone, please create view or materialized view. If you plan to query from useview_telemetry like @michaelkohn suggested here, then I suggest you create view instead of mat view. Then have them stored in this folder. Have the file named according to date, so couch2pg picks up that as latest migration.

Benmuiruri commented 3 months ago

Hi @michaelkohn as I create the view for the SQL query. Do we need both queries added or just the simple_padex added to couch2pg ? detailed_apdex.txt simple_adpex.txt

michaelkohn commented 3 months ago

Honestly at this point I'd just keep it as a query (or a parameterized function) instead of creating a view. For me, this query (as it is) has been helpful for level 1/high level investigation to help us identify some high level areas to further investigate, but it is more useful as a starting point for more detailed queries to pull in additional data (dates, users, devices) for deeper investigation.

I just wanted to get the query saved somewhere so that it is version controlled, easily commented on, and easily shared.

If we create a view from it now and deploy it with couch2pg, it becomes very hard to change in the future. I don't think we have enough experience using the apdex data just yet to know exactly what we'll want to see.

Benmuiruri commented 3 months ago

Hi @1yuv , given Michael's comment above that we want to keep it simple for now. Where can I put the sql ?

1yuv commented 3 months ago

If we want to store query only and not persist in any form on the database, I don't think it's necessary/ideal to store these on couch2pg library. There might be other places in forum or documentation as well. If we want to store definition like Michael suggested as parameterized function, they can still go in couch2pg.

Benmuiruri commented 3 months ago

I welcome your opinion @1yuv you are better informed to make the call. Then I will put it were you suggest and close this ticket. What's important is to have it somewhere appropriate.

1yuv commented 3 months ago

Let's put it somewhere in the cht-doc as a reference and continue to change. This way, it's version controlled, sharable, and people using cht-sync can also go easily and find.

Once we think queries are matured, we can add them to cht-couch2pg or to cht-sync based on maturity of the cht-sync.

Benmuiruri commented 3 months ago

Hi @antonykhaemba I am exploring the CHT docs to see where is an appropriate place to put the SQL query. I wonder if you have any suggestions.

  1. Perhaps adding another page under Quick Guides > Databases > Apdex Score using SQL Queries
  2. Adding another page underQuick Guides > Telemetry > Query Apdex Telemetry using SQL
mrjones-plip commented 3 months ago

@Benmuiruri - round trip decisions like this are expensive (take a long time!). Instead, I suggest something like:

Hey all! Based on my research, Quick Guides > Databases > Apdex Score using SQL Queries is the best place I can find in the docs to put these queries. I've put together a PR to this end and it will be merged in the coming day or so cc @antonykhaemba @1yuv

That way you can move full speed, not wait for anyone, but still have great transparency if some thinks it's a terrible idea. It's also trivial to move the content from one area in the docs to another. (BTW - I'm still scared a bit about making these types of decisions myself - you're in good company if you're scared too! \o/)

latin-panda commented 3 months ago

@Benmuiruri submitted a PR to cht-docs, and @michaelkohn will review it.

Moving this ticket to next week priorities

Benmuiruri commented 3 months ago

With the CHT docs PR merged we have documentation to get users started with SQL queries for apdex scores. I will now close this.