feat: SDK migration single resource scraping

hkfgo commented 2 months ago

This PR implements the Azure Monitor SDK migration. It moves away from the deprecated Azure Fluent SDK to the Azure SDK for .NET. Specifically, it uses the Azure.Monitor.Query package to implement the integration. Since the new SDK is essentially a different wrapper around the same REST API, I'd expect identical behavior in terms of:

Billing (free through ARM)
Metric querying
Meta-level metrics to track ARM throttling
Meta-level metrics to track usage

A summary of how I did it:

Abstracted Azure Monitor integration through the IAzureMonitorClient interface.
Implemented Azure Monitor integration using the new SDK, behind the IAzureMonitorClient interface.
Implemented high-level control flow to use either the new client or the legacy client, depending on a feature flag.
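
As a rough illustration of the three steps above (a sketch under stated assumptions, not the code in this PR): only the IAzureMonitorClient name comes from the description. The method signature, the two class names, the factory, and the useNewSdk parameter are hypothetical, and both stubs return a placeholder value instead of calling Azure.

```csharp
using System;
using System.Threading.Tasks;

// Interface name taken from the PR description; the member is a hypothetical signature.
public interface IAzureMonitorClient
{
    Task<double?> QueryMetricAsync(string resourceId, string metricName);
}

// Hypothetical stand-in for the client built on the deprecated Fluent SDK.
public sealed class LegacyAzureMonitorClient : IAzureMonitorClient
{
    public Task<double?> QueryMetricAsync(string resourceId, string metricName)
    {
        // Placeholder value; a real implementation would call the legacy SDK here.
        return Task.FromResult<double?>(1.0);
    }
}

// Hypothetical stand-in for the client built on Azure.Monitor.Query.
public sealed class SdkAzureMonitorClient : IAzureMonitorClient
{
    public Task<double?> QueryMetricAsync(string resourceId, string metricName)
    {
        // Placeholder value; a real implementation would call the new SDK here.
        return Task.FromResult<double?>(1.0);
    }
}

public static class AzureMonitorClientFactory
{
    // useNewSdk is an illustrative flag name, not necessarily the real setting.
    public static IAzureMonitorClient Create(bool useNewSdk)
    {
        if (useNewSdk)
        {
            return new SdkAzureMonitorClient();
        }

        return new LegacyAzureMonitorClient();
    }
}

public static class Program
{
    public static async Task Main()
    {
        // Callers depend only on the interface, so the flag decides the implementation.
        IAzureMonitorClient client = AzureMonitorClientFactory.Create(useNewSdk: true);
        double? value = await client.QueryMetricAsync("<resource-id>", "<metric-name>");
        Console.WriteLine($"{client.GetType().Name}: {value}");
    }
}
```

The point of the shape is that scrapers only see the interface, so flipping the flag swaps the implementation without touching the calling code.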

Things I need help with:

Whether this looks like a good approach to you
I need some pointers on how to test this :( Specifically, how do you think end-to-end testing should be performed for a major PR like this? I was thinking either integration tests or building a custom branch image and deploying to our test environment. I'm not quite sure how to do either of them though. Some help would be much appreciated!

Relates to https://github.com/tomkerkhove/promitor/issues/2209

github-actions[bot] commented 2 months ago

Thank you for your contribution! 🙏 We will review it as soon as possible.

tomkerkhove commented 2 months ago

I need some pointers on how to test this :( Specifically, how do you think end-to-end testing should be performed for a major PR like this? I was thinking either integration tests or building a custom branch image and deploying to our test environment. I'm not quite sure how to do either of them though. Some help would be much appreciated!

This should be covered by the testing suite so no worries!

hkfgo commented 2 months ago

Are these pipeline steps something I can repeatedly trigger on my own? I read the README more carefully and it seems like I can find my branch build under :pr{pr-id}. If I can re-trigger the CI pipeline on my own then that'd definitely cover end-to-end testing!

tomkerkhove commented 2 months ago

Hey, there are docs on how to run things locally here: https://github.com/tomkerkhove/promitor/blob/master/CONTRIBUTING.md#net-development

The CI also runs all of these integration tests automatically, in case you were wondering.

tomkerkhove commented 2 months ago

I see you are actively working on this so I'll hold off on release!

Do let me know if you need to put this on hold

hkfgo commented 2 months ago

Yes I am. Getting close

hkfgo commented 2 months ago

Good news! The most recent build appears to be working across all of our resource types. We have good coverage within Axon, with Redis, Storage, Load Balancer, Disk, Azure SQL, Power Functions, etc.

Let me clean up the code some more (I've added so many log statements!!). Meanwhile, do you suggest anything for end-to-end testing?

tomkerkhove commented 2 months ago

Good news!

The automated testing already runs and is reported below. The majority of the checks are passing, except for a few 👌

hkfgo commented 1 month ago

Looks like all tests are passing now! The failing CodeFactor check was on the giant MetricScraperFactory method that finds the matching scraper. There's not much we can do there, I think.

Do you mind taking another look? Please ignore the log statements and the modification to the GitHub Action for now. We can remove those before pressing the merge button.

hkfgo commented 1 month ago

Also, any pointers on how to do remote debugging? I've found some online articles on remote debugging with VS Code + .NET + Kubernetes. Probably should have tried that to begin with instead of doing so many print statements..

tomkerkhove commented 1 month ago

Also, any pointers on how to do remote debugging? I've found some online articles on remote debugging with VS Code + .NET + Kubernetes. Probably should have tried that to begin with instead of doing so many print statements..

No, I always use VS to run the container locally and troubleshoot. If a running instance does not provide the insights you need, then we may be missing some logs

hkfgo commented 1 month ago

Also, I believe there should be two quick PRs to:

Update documentation
Make useAzureMonitor flag available in the Promitor chart

Be on the lookout :)

hkfgo commented 1 month ago

Promitor documentation PR: https://github.com/promitor/docs/pull/62
Promitor chart PR: https://github.com/promitor/charts/pull/168

I believe they are dependencies of the next release, but not of this PR getting merged. I'm making this distinction because I'm waiting on the merge to master so I can rebase and continue the batch scraping work. No rush though. Thanks!

tomkerkhove / promitor

feat: SDK migration single resource scraping #2470