Closed nepsmaddy closed 3 months ago
@nepsmaddy - just to be clear: is the extractor creating artifacts for resources that should be skipped? Or are you just noticing references to other resource names in the logs?
@guythetechie, yes, log creation as well. In the first place, the extractor should not scan anything except what the configuration specifies. I also noticed this was working as expected in v4.7.0, but the latest version generates all these logs as well.
+1. In our environment this is quite time consuming (300+ APIs). I don't know why the extractor needs to loop through all resources when a specific resource set is defined in the extractor config. Even worse, the extractor has no proper handling of the subscription read limit, so long-running operations often result in a SubscriptionRequestsThrottled error, e.g.:
System.Net.Http.HttpRequestException: HTTP request to URI https://management.azure.com/subscriptions/***/resourceGroups/***/providers/Microsoft.ApiManagement/service/***/apiVersionSets/***?api-version=2023-09-01-preview failed with status code 429. Content is '{"error":{"code":"SubscriptionRequestsThrottled","message":"Number of 'read' requests for subscription actor '***:***' exceeded. Please try again after '1' seconds after additional tokens are available. Refer to https://aka.ms/arm-throttling for additional information."}}'.
This occasionally causes the entire extraction to fail.
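For anyone hitting the same 429, here is a minimal sketch of the kind of retry handling described above. It is not the extractor's actual code: `get_with_retry`, the `send` callable and the `(status, headers, body)` tuple are hypothetical names for illustration. It assumes the throttled response carries a `Retry-After` header with a number of seconds, and falls back to exponential backoff when it does not.

```python
import random
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape for this sketch: (status code, headers, body).
Response = Tuple[int, Mapping[str, str], str]


def get_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it stops returning 429 or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt == max_attempts:
            break
        # Assumption: the server suggests a wait in seconds via Retry-After.
        retry_after = headers.get("Retry-After")
        try:
            delay = float(retry_after) if retry_after is not None else None
        except ValueError:
            delay = None
        if delay is None:
            # No usable hint: back off exponentially (1s, 2s, 4s, ...).
            delay = base_delay * 2 ** (attempt - 1)
        # A little jitter so parallel requests do not retry in lockstep.
        sleep(delay + random.uniform(0, 0.25))
    raise RuntimeError(f"still throttled after {max_attempts} attempts")
```

Passing `sleep` as a parameter is only there so the behavior can be checked without actually waiting.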
@nepsmaddy - we've always retrieved all APIs, then filtered by API name in configuration. v4.7.0 behaves the same way. There are two major differences:
@DSpirit - could you define "quite time consuming"? How long does it take to run on 300+ APIs? Will add to our backlog for fixing, but prioritization will depend on how bad it is.
After merging my change from #612, my pipeline retries became unnecessary, so extraction dropped from 25 minutes to 3-6 minutes for a single API. Extracting all assets takes about 8-11 minutes. This is acceptable; it only became really annoying with the missing 429 handling. Sure, this could be improved for single API extraction, but for now it's completely fine, since the APIM Resource Kit hasn't been any better :)
Thanks for the quick feedback today, really appreciate it!
@guythetechie Nevertheless, this behavior in v6 is a huge problem for larger APIM instances (around 850 APIs).
In our case, the extractor runs for about 8 minutes to loop through the 1700 NamedValues alone. All the other parts (Tags, Products, Subscriptions and so on) add up to a total of more than 50 minutes to extract a single API.
The same extraction for a single API, with everything else set to [ignore], finished in under 1 minute in v5.1.4.
Thanks for the feedback, all. Will prioritize addressing this.
Same for me: in v6, my extractor takes almost 50 minutes to complete the run.
Fix pushed to main branch, should be deployed in our next release.
Release version
v6.0.1-rc1
Describe the bug
When specific resources are listed for extraction, the extractor should fetch only those resources and then complete.
This worked fine in the old release v4.7.x, but the new release tries to read through all the metadata, so the extractor takes a long time to run instead of seconds.
For example, consider the extractor config below.
apiNames:
backendNames: [ignore]
namedValueNames:
productNames: [ignore]
tagNames:
diagnosticNames: [ignore]
loggerNames: [ignore]
policyFragmentNames: [ignore]
subscriptionNames: [ignore]
Expected behavior
According to this config, the extractor should scan only the specified APIs, named values and tags. It should not process anything else and should just produce the artifacts.
Note: This is exactly the behavior in 4.7.x, but not in the current release, which scans through everything and takes a long time to do it.
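To make the expected behavior concrete, here is a rough sketch of how a config like the one above could be turned into a per-resource-type plan. This is not how the extractor is implemented. `plan` and the action names are made up for illustration, the resource names in the usage example are placeholders, and the sketch assumes that a missing or empty key means "extract everything" and that `[ignore]` means "skip this type without listing it".

```python
from typing import Dict, List, Optional, Tuple


def plan(config: Dict[str, Optional[List[str]]], key: str) -> Tuple[str, List[str]]:
    """Decide what to do for one resource type, e.g. 'namedValueNames'."""
    names = config.get(key)
    if names is None:
        # Assumption: key absent or empty means the type is unfiltered.
        return ("list_all", [])
    if names == ["ignore"]:
        # Nothing to extract, so no list call is needed at all.
        return ("skip", [])
    # Specific names given: fetch each one directly instead of
    # listing every resource and filtering afterwards.
    return ("get_by_name", list(names))


# Placeholder names, not taken from the issue.
example_config = {
    "apiNames": ["my-api"],
    "backendNames": ["ignore"],
    "namedValueNames": ["my-named-value"],
    "productNames": ["ignore"],
}

for key in ("apiNames", "backendNames", "namedValueNames", "loggerNames"):
    print(key, plan(example_config, key))
```

With a plan like this, a run against 1700 named values would issue one request per configured name rather than paging through all of them.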
Actual behavior
The extractor scans the resources listed in config.yaml as well as all other metadata of the APIM instance that is not required, logging a warning for each resource it skips. See the snippet below.
warn: extractor.ShouldExtractFactory[0] NamedValueName allegroinvoice-mugf-sappipo-prx-password is not in configuration and will be skipped.
warn: extractor.ShouldExtractFactory[0] NamedValueName allegroinvoice-mugf-sappipo-prx-username is not in configuration and will be skipped.
warn: extractor.ShouldExtractFactory[0] NamedValueName ami-ccevents-password is not in configuration and will be skipped.
warn: extractor.ShouldExtractFactory[0] NamedValueName ami-ccevents-subscription-key is not in configuration and will be skipped.
warn: extractor.ShouldExtractFactory[0] NamedValueName ami-ccevents-username is not in configuration and will be skipped.
warn: extractor.ShouldExtractFactory[0] NamedValueName ami-optout-client-id is not in configuration and will be skipped.
warn: extractor.ShouldExtractFactory[0] NamedValueName ami-optout-password is not in configuration and will be skipped.
warn: extractor.ShouldExtractFactory[0] NamedValueName ami-optout-secret is not in configuration and will be skipped.
warn: extractor.ShouldExtractFactory[0] NamedValueName ami-optout-username is not in configuration and will be skipped.
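If it helps to quantify how much unnecessary scanning happens, a small script along these lines can count the skipped resources per type from a saved log. It is only a sketch: `count_skipped` is a made-up helper, and the pattern assumes every warning follows the exact wording shown in the snippet above.

```python
import re
from collections import Counter

# Matches e.g. "NamedValueName some-name is not in configuration and will be skipped."
SKIPPED = re.compile(r"(\w+?)Name (\S+) is not in configuration and will be skipped")


def count_skipped(log_text: str) -> Counter:
    """Count 'will be skipped' warnings per resource type."""
    return Counter(kind for kind, _name in SKIPPED.findall(log_text))


sample = (
    "warn: extractor.ShouldExtractFactory[0] NamedValueName ami-optout-secret "
    "is not in configuration and will be skipped.\n"
    "warn: extractor.ShouldExtractFactory[0] NamedValueName ami-optout-username "
    "is not in configuration and will be skipped.\n"
)
print(count_skipped(sample))
```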
Reproduction Steps