Closed ldraney closed 1 year ago
Per Lucas, there's a potential next ticket: If the acceptance criteria are not met after a correct installation, but we see the profiler showing up, then we need to implement Unified Service Tagging and make sure it works with our APM setup.
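For reference, Datadog's Unified Service Tagging is built on three reserved tags (`env`, `service`, `version`). A minimal sketch of setting them via environment variables for the Python tracer (the service/env/version values here are placeholders, not our actual configuration):

```shell
# Unified Service Tagging: the three reserved tags Datadog uses to
# correlate traces, profiles, logs, and metrics across products.
# Values below are hypothetical placeholders for our setup.
export DD_ENV=dev
export DD_SERVICE=notification-api
export DD_VERSION=1.0.0
```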
@ldraney Is my understanding correct that this wouldn't have any specific QA considerations just yet? We're still in the building-blocks stage, right? If yes, I want to go ahead and ask @k-macmillan and @mjones-oddball for a review, as we'd like to bring this ticket into our sprint.
We are still not getting data for our deliver_sms endpoint, so we are going to follow Datadog's Python Custom Instrumentation guide, which basically means adding a decorator to the functions we want traced. I'll be adding it to deliver_sms in app/celery/provider_tasks.py.
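To illustrate the pattern: in the real change we'd decorate the task with ddtrace's tracer (e.g. `@tracer.wrap(...)`), but here is a self-contained sketch of what such a wrap-style decorator does, with a hypothetical stand-in for `deliver_sms`:

```python
import functools
import time

# Recorded "spans": (name, duration_seconds). A stand-in for what the
# real Datadog tracer would ship to the agent.
SPANS = []

def wrap(name):
    """Minimal sketch of a tracer.wrap-style decorator: it times the
    wrapped call and records a span, without altering the return value."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.monotonic()
            try:
                return func(*args, **kwargs)
            finally:
                SPANS.append((name, time.monotonic() - start))
        return wrapper
    return decorator

# Hypothetical stand-in for deliver_sms in app/celery/provider_tasks.py
@wrap("deliver_sms")
def deliver_sms(notification_id):
    return f"sent {notification_id}"

deliver_sms("abc-123")
print(SPANS[0][0])  # -> deliver_sms
```

The point is that instrumentation stays out of the function body: the decorator records timing around the call, which is why adding it to deliver_sms shouldn't change the task's behavior.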
Okay, this will be the final update on this ticket before closing it. I'll create a new, related ticket: a new story to understand why the profiler is giving us information that is not what we were expecting and/or is incomplete.
Our tracing shows the deliver_sms endpoint being used:
And the decorator I added gives us new information whenever I send an SMS in the dev environment:
But the profiler has not changed:
Again, closing now; please read the comment above. I will address this in upcoming Datadog meetings and study, and in the creation of the next ticket related to PR #1024.
The primary goal for the next ticket will be either to figure out this profiler bug, to dive deeper into instrumentation, or to work on the unified tagging schema.
Value Statement
As a DevOps Engineer I want to enable code-level performance profiles and tracing So that we can observe the performance of our code base
Additional Info and Resources
_Overview - From Datadog Profiler Documentation_
The docs show a table of functions and their CPU usage (columns "FUNCTION" and "CPU USAGE"). When working on performance problems, this information is important because many programs spend a lot of time in a few places, which may not be obvious. Guessing at which parts of a program to optimize causes engineers to spend a lot of time with little result. By using a profiler, you can find exactly which parts of the code to optimize.
If you’ve used an APM tool, you might think of profiling like a “deeper” tracer that provides a fine grained view of your code without needing any instrumentation.
The Datadog Continuous Profiler can track various types of “work”, including CPU usage, amount and types of objects being allocated in memory, time spent waiting to acquire locks, amount of network or file I/O, and more. The profile types available depend on the language being profiled.
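As a sketch of how the Continuous Profiler is typically enabled for a Python service (assuming `ddtrace` is installed; the service name and entrypoint script below are hypothetical):

```shell
# Enable the Continuous Profiler alongside tracing via ddtrace-run.
# DD_PROFILING_ENABLED turns the profiler on; the service/env values
# and the entrypoint script are placeholders, not our actual setup.
DD_PROFILING_ENABLED=true \
DD_SERVICE=notification-api \
DD_ENV=dev \
ddtrace-run python run_celery.py
```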
Engineering Checklist
Acceptance Criteria
Have Profiles set up similar to what is available on vagov.ddog-gov.com (you'll have to have logged into the Okta dashboard for this link to work)
We should see something like the following (I'm not sure of the exact names of our functions/endpoints, but in general):

| FUNCTION | CPU USAGE |
| --- | --- |
| flask | 48% |
| flask.sms | 19% |
| flask.email | 13% |
GIVEN we have active endpoints (sms)
WHEN those endpoints are being used
THEN we can see the performance of those parts of our code
Identify how long an SMS request took start to finish within the `post_notification` method and compare that to what is reported in CloudWatch.