Azure / apiops

APIOps applies the concepts of GitOps and DevOps to API deployment. By using practices from these two methodologies, APIOps can enable everyone involved in the lifecycle of API design, development, and deployment with self-service and automated tools to ensure the quality of the specifications and APIs that they’re building.
https://azure.github.io/apiops
MIT License

[BUG] Huge APIs (with many Operations and other artifacts) have the publisher tool fail #689

Open TechPrototyper opened 1 month ago

TechPrototyper commented 1 month ago

Release version

5.00 and onwards

Describe the bug

I tried to transfer "The Jira Service Management Public REST API" from one service to another using the publisher tool. Because the number of operations (and possibly other artifacts) is large, the Management API call that creates or updates the API returns 422 after a while. That would be fine if it were handled, but it is not, so the publisher fails.

Expected behavior

Here is what I expect publisher to do:

  1. Catch the 422 Error: Modify the Publisher Tool to explicitly catch the 422 error. The tool should recognize this specific HTTP status code and handle it appropriately rather than immediately retrying or failing.
  2. Implement a Status Check: After catching the 422 error, introduce a function to query the status of the API operation using the Azure Management API. You can use endpoints like GET /apis/{apiId} to check the current status of the ongoing API update.
  3. Retry Mechanism: If the status indicates that the API operation is still "in progress," implement a retry loop with a delay (e.g., using sleep) until the process is finished. Make sure to introduce a sensible timeout to avoid indefinite retries.
  4. Proceed or Abort Workflow: Based on the final status of the operation, either proceed with the next step of the deployment if the update was successful or abort the workflow with an appropriate error message if the operation ultimately fails.

This approach would ensure that the tool avoids unnecessary retries and handles conflicts such as a 409 error by waiting for the ongoing operation to complete before attempting further updates.
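As an illustration only, the four steps above could be sketched roughly as follows. This is not APIOps code and it calls no real Azure SDK: `put_api` and `get_status` are hypothetical callables standing in for the create/update request and the `GET /apis/{apiId}` status check, and the status strings `"in_progress"` and `"succeeded"` are assumptions made for the example.

```python
import time
from typing import Callable


class PublishError(Exception):
    """Raised when the create/update ultimately fails or times out."""


def publish_with_status_poll(
    put_api: Callable[[], int],       # hypothetical: sends the create/update, returns the HTTP status code
    get_status: Callable[[], str],    # hypothetical: wraps GET /apis/{apiId}, returns a status string
    timeout_s: float = 600.0,
    poll_interval_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    status_code = put_api()
    if 200 <= status_code < 300:
        return "succeeded"
    # Step 1: only a 422 is treated as "may still be running in the backend".
    if status_code != 422:
        raise PublishError(f"create/update failed with HTTP {status_code}")

    # Steps 2 and 3: poll the status with a delay, bounded by a timeout.
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status == "succeeded":
            return status  # Step 4: proceed with the deployment
        if status != "in_progress":
            # Step 4: abort with an explicit error
            raise PublishError(f"operation ended with status {status!r}")
        if clock() >= deadline:
            raise PublishError(f"operation still in progress after {timeout_s} s")
        sleep(poll_interval_s)
```

The `sleep` and `clock` parameters are injected only so that the loop can be exercised without real delays; a caller would normally leave them at their defaults.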

Actual behavior

After the timeout, the Management API's 422 response breaks the publisher, which aborts with an error. The requested operation actually continues to run in the backend, so in terms of business logic the call ultimately succeeds; we just don't know about it and are left with an unhandled exception.

Reproduction Steps

I'd try to have the publisher publish "The Jira Service Management Public REST API", or any other API with a huge number of operations and perhaps other artifacts.

github-actions[bot] commented 1 month ago

Thank you for opening this issue! Please be patient while we look into it and get back to you, as this is an open source project. In the meantime, make sure you take a look at the [closed issues](https://github.com/Azure/apiops/issues?q=is%3Aissue+is%3Aclosed) in case your question has already been answered. Don't forget to provide any additional information if needed (e.g. scrubbed logs, detailed feature requests, etc.).

Whenever it's feasible, please don't hesitate to send a Pull Request (PR) our way. We'd greatly appreciate it, and we'll gladly assess and incorporate your changes.

guythetechie commented 3 days ago

@TechPrototyper - status code 422 (unprocessable entity) means that there's a problem with the request. It's typically not something that will be fixed by a retry. We've successfully tested ApiOps in projects with thousands of operations.

Can you give more context on what's going on? The exception should have the failing URL.