medic / cht-user-management

GNU Affero General Public License v3.0

Get user management metrics #170

Closed ernestoteo closed 1 month ago

ernestoteo commented 4 months ago

68

ernestoteo commented 4 months ago

@kennsippell ,

I am trying to get some metrics for user management. I don't know if I am doing it the right way.

Would you please check what I have done so far?

kennsippell commented 4 months ago

I just took a quick look to see which of the metrics we want would come for free if you just use fastify-metrics:

| Metric | Available By Default |
| --- | --- |
| Page views { by METHOD and path and status code } | Yes |
| Logins { success count, failure count, p2 failure reason } | Yes (not the p2) |
| Alerts for outages | Yes |
| Users uploaded per instance over time | No |
| Count of created users { success, failure, retries, p2 time to upload } | No |

3 out of 5 ain't bad. So I'd start with just this "default stuff" and then we can do custom metrics in a separate PR.

ernestoteo commented 4 months ago

Do you have Grafana running locally? Have you tried making some dashboards using this data? Can you send some screenshots of what the dashboards look like? If you haven't done this, then you haven't tested your change :D And that is the ultimate result of whether this is "right".

I thiinnkk the output you're writing is in the prometheus format (?) ... or am I mistaken? In terms of technical approach, my big feedback is to ask you to use libraries.

I recommend you do something like:

  1. Check out https://www.npmjs.com/package/fastify-metrics. Install it using npm install --save
  2. Register metrics as a fastify plugin. Follow the documentation. Just do the default things turning routeMetrics on.
  3. You should now have a new endpoint which outputs metrics in the Prometheus format when you navigate to /prometheus (or whatever). You should not implement any custom prometheus code yourself imo - use libraries when available please.
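The registration steps above can be sketched roughly as follows. This is a minimal sketch, not the code from this repository: it assumes a recent fastify-metrics release whose default export is the plugin and which accepts `endpoint` and `routeMetrics` options, and the `/metrics` path and port 3000 are placeholders chosen for illustration.

```typescript
import Fastify from 'fastify';
import fastifyMetrics from 'fastify-metrics';

async function main(): Promise<void> {
  const app = Fastify();

  // Register the plugin with its defaults and turn route metrics on.
  // '/metrics' is an assumed path; use whatever path your scraper expects.
  await app.register(fastifyMetrics, {
    endpoint: '/metrics',
    routeMetrics: { enabled: true },
  });

  // Port 3000 is a placeholder for local testing.
  await app.listen({ port: 3000 });
}

main().catch((err) => {
  console.error(err);
});
```

With this in place, requesting the configured path should return plain-text metrics, with no hand-written Prometheus formatting in the app itself.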

Then, make some views in Grafana:

  1. Get cht-watchdog running locally
  2. Connect watchdog to your dev machine
  3. Create a new dashboard on your local watchdog
  4. Try to get as many metrics as you can from the metrics that are available. (For example you can get # of logins by looking at the number of 200 response codes from the POST /authenticate endpoint and you don't need to do anything else)
  5. Send a PR with the app code, the grafana json, and screenshots
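To make the login example in step 4 concrete, here is a small sketch that reads the raw text from the metrics endpoint and counts requests for one method, route, and status code. It is only a way to sanity-check the data by hand; in practice Grafana would do this with a query. The metric name `http_request_duration_seconds_count` and the label names `method`, `route`, and `status_code` are assumptions based on what fastify-metrics typically emits, so check them against your own endpoint output.

```typescript
type Labels = Record<string, string>;

interface Sample {
  name: string;
  labels: Labels;
  value: number;
}

// Assumed metric name; verify against the actual endpoint output.
const REQUEST_COUNT_METRIC = 'http_request_duration_seconds_count';

// Parse lines such as: name{key="value",other="value"} 12
function parseSamples(text: string): Sample[] {
  const samples: Sample[] = [];
  const linePattern = /^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)/;
  for (const rawLine of text.split('\n')) {
    const line = rawLine.trim();
    // Skip blank lines and "# HELP" / "# TYPE" comment lines.
    if (line === '' || line.charAt(0) === '#') continue;
    const match = linePattern.exec(line);
    if (match === null) continue;
    const labelText = match[2] || '';
    const labels: Labels = {};
    const labelPattern = /([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"/g;
    let pair = labelPattern.exec(labelText);
    while (pair !== null) {
      labels[pair[1]] = pair[2];
      pair = labelPattern.exec(labelText);
    }
    samples.push({ name: match[1], labels, value: Number(match[3]) });
  }
  return samples;
}

// Sum the request counter for one method, route, and status code.
function countRequests(text: string, method: string, route: string, statusCode: string): number {
  let total = 0;
  for (const sample of parseSamples(text)) {
    if (
      sample.name === REQUEST_COUNT_METRIC &&
      sample.labels['method'] === method &&
      sample.labels['route'] === route &&
      sample.labels['status_code'] === statusCode
    ) {
      total += sample.value;
    }
  }
  return total;
}
```

Calling `countRequests(metricsText, 'POST', '/authenticate', '200')` would then give the number of successful logins under these assumptions, and the same call with a different status code gives the failures.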

I think once you've done this, you'll understand the infrastructure and tools you're working with and have the full-stack setup on your machine and be ready for the next steps.

Schedule time with me if you get stuck.

Thank you @kennsippell ,

I have installed fastify-metrics and registered it as a fastify plugin, but accessing the endpoint requires user authentication. The endpoint should be open.

How can I do that?

mrjones-plip commented 3 months ago

> I have installed fastify-metrics and registered it as a fastify plugin, but accessing the endpoint requires user authentication. The endpoint should be open.
>
> How can I do that?

I suspect you need to add the /metrics path to an allow list so that the default authentication doesn't get triggered. I don't know that code base too well, but it looks like maybe we do this for /version here?

kennsippell commented 3 months ago

Endpoints are authenticated by default, see here for how to make your endpoint unauthenticated.
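As a rough illustration of the allow-list idea, the sketch below checks a request URL against a list of paths that skip authentication. The names `UNAUTHENTICATED_PATHS` and `isUnauthenticated` are hypothetical and are not taken from this code base; the linked code shows the actual mechanism.

```typescript
// Hypothetical allow list: paths that should bypass the default authentication.
const UNAUTHENTICATED_PATHS: string[] = ['/version', '/metrics'];

// Returns true when the request URL (ignoring any query string) is on the allow list.
function isUnauthenticated(url: string): boolean {
  const path = url.split('?')[0];
  return UNAUTHENTICATED_PATHS.indexOf(path) !== -1;
}

// An authentication hook would call this first and return early when it is true,
// so that only the remaining routes go through the login check.
```

An exact-match list like this keeps the exposure narrow: `/metrics` is open, but a path such as `/metrics/anything` still goes through authentication.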

ernestoteo commented 3 months ago

@kennsippell ,

Find below some screenshots and the json file.

Screenshot 2024-06-18 at 01 02 09

<img width="678" alt="Screenshot 2024-06-18 at 01 08 07" src="https://github.com/medic/cht-user-management/assets/94995568/15334de8-97ec-4897-945e-d12f3e5efa42">

Screenshot 2024-06-18 at 00 33 16

Screenshot 2024-06-18 at 00 55 22

Screenshot 2024-06-18 at 00 55 31

Screenshot 2024-06-18 at 00 55 42

Screenshot 2024-06-18 at 01 01 55

ernestoteo commented 3 months ago

@mrjones-plip,

I have moved the watchdog-config to script/deploy and addressed all the feedback you left. I have also updated the json file, and I was able to upload it to my local instance and it works.

Would you please check it?

mrjones-plip commented 2 months ago

@ernestoteo - sorry for the delay! I was on holiday until the 8th. I have a busy week, but will try and get to this by Friday. Stay tuned!