microsoft / PubSec-Info-Assistant

Information Assistant, built with Azure OpenAI Service, Industry Accelerator
MIT License
348 stars 788 forks source link

Retrieving queue properties for storage account fails: Context Deadline exceeded #890

Closed kevball2 closed 4 weeks ago

kevball2 commented 1 month ago

Before you open an issue, please check if a similar issue already exists or has been closed before.

You can also find details on Troubleshooting Common Issues. You can use these tools to help gather additional logs and details to include in your issue.

:warning: Please DO NOT include confidential information in your issue on GitHub. :warning:

Bug Details

Describe the bug Terraform is unable to contact the storage account when attempting to redeploy resources that are stored in a state file.

Error: retrieving queue properties for Storage Account (Subscription: "" │ Resource Group Name: "" │ Storage Account Name: ""): executing request: Get "https://.queue.core.usgovcloudapi.net/?comp=properties&restype=service": context deadline exceeded │

Steps To Reproduce

  1. Deploy resource with state being stored in a storage account.
  2. Re-deploy resources, deployment will fail trying to get queue information for storage account.
  3. ...

What is the expected behavior? Re-deployments should complete successfully modifying the resources only if the terraform has changed.

Screenshots If applicable, add screenshots to help explain your problem.

Information Assistant details

Please provide the following details. You can simply include a screenshot of your Info panel as well.

GitHub branch: [e.g. main]

Version or Latest commit: [obtained by running git log -n 1 <branchname>

What region is your Azure Open AI Service in?

What ChatGPT model are you using?

model name: (i.e. gpt-3.5-turbo, gpt-4)

model version: (i.e. 0613)

What embeddings model are you using?

Additional context Add any other context about the problem here.

If the bug is confirmed, would you be willing to submit a PR?

bjakems commented 1 month ago

Which version of IA are you deploying? Is it in secure mode? Have you tried multiple attempts to redeploy?

kevball2 commented 1 month ago

Version 1.2 We are using a modified version of secure mode (targeting an existing virtual network and centrally managed private link DNS zones) Initial deploy completes but on any new pipeline run the deployment fails with that error.

bjakems commented 1 month ago

Ensure where you are running the deployment has access to the network restricted assets. After initial deployment, the Azure resources are network restricted and you must ensure network connectivity (VPN, jump box, etc) to communicate with the resources on redeployment. In rare circumstances, attempt the deployment again to see if that alleviates the Terraform to Azure issue.

Are you running the version 1.2 deployment on a Government Cloud environment?

kevball2 commented 1 month ago

This deployment is in Azure US Gov yes. Bonus complications include GitLab and the Runner pool are deployed in AWS, that traffic traverses back to our Datacenter then across our express route to Azure. Checking with the various FW teams to make sure all the services involved can talk to each other.

kevball2 commented 1 month ago

Updates: Found an internal block from our Palo Alto Firewalls. There are unique rules for US Gov storage Image The GitLab runner IP range does not have these specific rules applied. A request has been made to add them and I'll hopefully test again tomorrow

bjakems commented 1 month ago

Got it. Even when the networking is perfect, I've seen this Terraform timeout issue, but it is a rare occurrence.

Whenever you run this deployment, please check the Azure Portal for resource creation. If all assets do not create, then most likely the issue is on your networking configuration.

kevball2 commented 4 weeks ago

Closing this issue, we are able to successfully deploy all resources and update those resources after the correct firewall rules were applied.