Closed ndouglas closed 1 year ago
I like it; I can't think of much else to contribute here. I think this is what Mike Chelen was essentially asking for when we went through the Staging deploy and Test epic to reduce the overall time it takes. We just leaned on Jenkins job metrics to tell us whether we were succeeding. The pitfall there is that we really can't go back and point to the data. Whereas if we implement this, it's a lot clearer and will provide historical context.
Deployment Frequency is a bit boring on its own. But if we tracked it for Staging versus Production, it might help us uncover issues with webhooks firing. If PR merges to main outnumber Staging deploys, that would indicate a problem, since the ratio should be 1:1.
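A quick sketch of the check that comparison implies (the counts here are illustrative values you'd pull from Datadog over the same time window, not real data):

```python
def missing_staging_deploys(main_merges: int, staging_deploys: int) -> int:
    """Merges to main and Staging deploys should be 1:1, so any positive
    difference suggests deploy webhooks failed to fire for some merges."""
    return max(main_merges - staging_deploys, 0)

# Hypothetical counts over one week:
gap = missing_staging_deploys(main_merges=12, staging_deploys=10)
if gap:
    print(f"{gap} merges to main did not produce a Staging deploy")
```

Tracking both counters in Datadog and alerting when the difference is nonzero would make the webhook failure visible without anyone having to eyeball the dashboards.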
@ndouglas @olivereri I'm good to move forward with this in a hypothesis manner (e.g. we think these metrics will best represent a good first slice of measurements based on what we know now). Who all has access to Datadog, and/or how can we best socialize these measurements in an ongoing manner (once we feel comfortable they are accurate and not especially negative)? Though we didn't have a chance to refine together (pretty much only story pointing left), I'm moving this into STRETCH for Nate to tear into when he's back next week.
Assuming 8 story points for the purposes of planning.
I was intending to use events for A, B, C, D, and E, then use those events to calculate metrics. I don't think that's going to work well; it would seem to require that I set a tag value on each event to the commit SHA, which would cause our custom metrics billing to scale with the number of commits we make! That would be very wasteful financially.
After some thinking, I decided to just use the commit timestamp as stored in the Git history as the start point and compute dates relative to that for each subsequent step.
I don't think this actually changes anything, but wanted to note this for future reference. If we create similar issues in the future, we should beware of this cost scaling complication.
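To illustrate the approach described above (names and structure are my sketch, not the actual implementation): rather than emitting a SHA-tagged event per commit, each pipeline step can read the commit timestamp out of Git history and report only the elapsed duration, which keeps metric cardinality flat.

```python
import subprocess
import time
from typing import Optional

def commit_epoch(sha: str = "HEAD") -> int:
    """Read the committer timestamp (Unix epoch) straight from Git history,
    so no per-commit tag ever needs to be stored in Datadog."""
    out = subprocess.check_output(["git", "show", "-s", "--format=%ct", sha])
    return int(out.strip())

def seconds_since_commit(commit_ts: int, step_ts: Optional[int] = None) -> int:
    """Elapsed seconds from the commit to a pipeline step (e.g. deploy
    finished); defaults to 'now' for use inside the step itself."""
    if step_ts is None:
        step_ts = int(time.time())
    return step_ts - commit_ts
```

Each step would then submit `seconds_since_commit(...)` as a plain gauge or distribution, so billing stays constant regardless of commit volume.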
thanks @ndouglas! To be sure I'm tracking, what is the custom metrics billing again (who owns it, how it's used, etc.)?
Custom metrics within Datadog. I believe the greater DSVA team owns Datadog, but to be honest I don't know how the billing, etc., works. That hasn't complicated our team's life in the past and I don't expect it will in the future, but I could be wrong.
I might be missing what you're asking, though 🙂
My PRs above should accomplish the latter four metrics, but not deployment lead time. I'll probably need to open a followup ticket for that.
This is all running and working; the dashboard just needs more data to look interesting and be useful.
Cool @ndouglas -- is this viewable on datadog?
Description
There are some metrics that we can and should be capturing for CMS product delivery performance, but that I don't believe we are capturing at present.
We should ensure that these metrics are being recorded and reported upstream to Datadog.
These involve changes to the BRD CD pipeline, so each of these might be nontrivial and need to be split off into a separate issue.
Events
main
-> commit, timestamp
Metrics
Acceptance Criteria