microsoft / PubSec-Info-Assistant

Information Assistant, built with Azure OpenAI Service, Industry Accelerator
MIT License

Deployment of Web App - numberOfInstancesInProgress:0 #830

Closed Al3xDaniels closed 1 month ago

Al3xDaniels commented 2 months ago

After deployment of the infra I am seeing these 202 responses during the web app deployment. After 2 hours I aborted and was unable to resolve the issue.

Note "numberOfInstancesInProgress": 0. This value never changes while the deployment is being polled.


{
  "id": "/subscriptions/{subid}/resourceGroups/infoasst-cognitive-assistant-xvii/providers/Microsoft.Web/sites/infoasst-enrichmentweb-jdsfh/deploymentStatus/{subid}",
  "name": "{subid}",
  "type": "Microsoft.Web/sites/deploymentStatus",
  "location": "East US 2",
  "tags": {
    "BuildNumber": "local",
    "ProjectName": "Information Assistant"
  },
  "properties": {
    "deploymentId": "{deploymentid}",
    "status": "BuildInProgress",
    "numberOfInstancesInProgress": 0,
    "numberOfInstancesSuccessful": 0,
    "numberOfInstancesFailed": 0,
    "failedInstancesLogs": null,
    "errors": null
  }
}
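A minimal sketch of how a polling script could recognize this condition instead of waiting indefinitely. It assumes only the payload fields shown above; the helper names (`is_stalled`, `should_abort`) and the poll threshold are hypothetical and not part of the accelerator's deployment scripts. A single payload in this state is not an error by itself, since the counters may legitimately be zero while the build runs; only its persistence across many polls is suspicious.

```python
import json
from typing import Iterable, Mapping


def is_stalled(status: Mapping) -> bool:
    """True when the payload reports a build in progress but no instance activity."""
    props = status.get("properties", {})
    counters = (
        props.get("numberOfInstancesInProgress", 0),
        props.get("numberOfInstancesSuccessful", 0),
        props.get("numberOfInstancesFailed", 0),
    )
    return props.get("status") == "BuildInProgress" and not any(counters)


def should_abort(polls: Iterable[Mapping], max_stalled_polls: int) -> bool:
    """True once max_stalled_polls consecutive payloads look stalled."""
    stalled = 0
    for payload in polls:
        # Reset the streak as soon as any payload shows progress.
        stalled = stalled + 1 if is_stalled(payload) else 0
        if stalled >= max_stalled_polls:
            return True
    return False


# Trimmed copy of the payload from this issue.
sample = json.loads("""
{
  "type": "Microsoft.Web/sites/deploymentStatus",
  "properties": {
    "status": "BuildInProgress",
    "numberOfInstancesInProgress": 0,
    "numberOfInstancesSuccessful": 0,
    "numberOfInstancesFailed": 0,
    "failedInstancesLogs": null,
    "errors": null
  }
}
""")

print(is_stalled(sample))
```

With a threshold chosen to match the polling interval, `should_abort` would let a script give up after a bounded wait rather than after two hours.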

Looking in the deployment center I am seeing this in the logs


Copying files to destination directory '/tmp/_preCompressedDestinationDir'...
Done in 18 sec(s).
Compressing content of directory '/tmp/_preCompressedDestinationDir'...
/tmp/BuildScriptGenerator/c5341db81b8748d3a2bb3cb918b6a9f6/build.sh: line 433:   381 Killed                  tar -zcf "$DESTINATION_DIR/output.tar.gz" .
/tmp/BuildScriptGenerator/c5341db81b8748d3a2bb3cb918b6a9f6/build.sh: line 433:   381 Killed                  tar -zcf "$DESTINATION_DIR/output.tar.gz" .
/bin/bash -c "oryx build /tmp/zipdeploy/extracted -o /home/site/wwwroot --platform python --platform-version 3.10 -p virtualenv_name=antenv --log-file /tmp/build-debug.log  -i /tmp/8dcc2d3e79f76d8 --compress-destination-dir | tee /tmp/oryx-build.log ; exit $PIPESTATUS "
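Bash prints "Killed" when a child process is terminated by SIGKILL. In a build container this is often the kernel's out-of-memory killer, though that cause is an assumption here and is not confirmed by the log. The sketch below pulls such lines out of a saved build log so they are easy to spot; the function name and the regular expression are illustrative and match only the line shape shown above.

```python
import re

# Matches bash job-status lines such as:
#   .../build.sh: line 433:   381 Killed    tar -zcf ... .
KILLED_RE = re.compile(
    r"line (?P<line>\d+):\s+(?P<pid>\d+) Killed\s+(?P<command>.+)$"
)


def find_killed_commands(log_text: str) -> list[dict]:
    """Return one record per log line that reports a killed process."""
    hits = []
    for raw in log_text.splitlines():
        match = KILLED_RE.search(raw)
        if match:
            hits.append(
                {
                    "script_line": int(match["line"]),
                    "pid": int(match["pid"]),
                    "command": match["command"].strip(),
                }
            )
    return hits


build_log = (
    "Compressing content of directory '/tmp/_preCompressedDestinationDir'...\n"
    "/tmp/BuildScriptGenerator/c5341db81b8748d3a2bb3cb918b6a9f6/build.sh: "
    'line 433:   381 Killed                  tar -zcf "$DESTINATION_DIR/output.tar.gz" .\n'
)

for hit in find_killed_commands(build_log):
    print(hit)
```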
Al3xDaniels commented 2 months ago

Update: I was able to deploy but only after changing the app service sku all the way up to P3mv3.

dayland commented 2 months ago

That is unexpected. However this could be related to regional Azure capacity limitations. The errors above are occurring in the Oryx build that happens within the Azure App Service.

We will be moving away from these Oryx builds in our next release to avoid some of these inconsistencies and errors with letting the Azure App Service do the builds unattended.

neurartbcc commented 2 months ago

Had this same issue using gpt4o in .env. After reading this and issue #744, I just switched to 3.5 turbo (" " in .env) and it worked! It does take around 500 sec of retries on the Standard App tier.

dayland commented 2 months ago

For gpt-4o, the current release requires code changes along with config changes. We will be including OOTB gpt-4o support in the next release.

wtomaz808 commented 2 months ago

I too have had this issue: the deployment of the infra succeeds but the app services for web & web enhancements do not complete. It hangs until I cancel the deploy.

I did change the sku of the app pools to P3mv3, but that did not resolve the issue.

wtomaz808 commented 1 month ago

I attempted the deployment again today in East US 2. The infra deploys with no issues, but the web app build hangs for hours...

/resourceGroups/infoasst-aoai808e4/providers/Microsoft.Web/sites/infoasst-enrichmentweb-xnxyp/deploymentStatus/6cca3015-9883-4bd5-b4fd-f52af7b4da5b","name":"6cca3015-9883-4bd5-b4fd-f52af7b4da5b","type":"Microsoft.Web/sites/deploymentStatus","location":"East US 2","tags":{"BuildNumber":"local","ProjectName":"Information Assistant"},"properties":{"deploymentId":"6cca3015-9883-4bd5-b4fd-f52af7b4da5b","status":"BuildInProgress","numberOfInstancesInProgress":0,"numberOfInstancesSuccessful":0,"numberOfInstancesFailed":0,"failedInstancesLogs":null,"errors":null}}
Status: Building the app... Time: 3343(s)

wtomaz808 commented 1 month ago

Here are the trace logs from the web app startup:

2024-09-05T18:02:35.200574235Z A P P S E R V I C E O N L I N U X
2024-09-05T18:02:35.200579464Z
2024-09-05T18:02:35.200584343Z Documentation: http://aka.ms/webapp-linux
2024-09-05T18:02:35.200589363Z Python 3.10.14
2024-09-05T18:02:35.200594212Z Note: Any data outside '/home' is not persisted
2024-09-05T18:02:35.489663969Z Starting OpenBSD Secure Shell server: sshd.
2024-09-05T18:02:35.492220771Z WEBSITES_INCLUDE_CLOUD_CERTS is not set to true.
2024-09-05T18:02:35.505889428Z Site's appCommandLine: gunicorn --workers 2 --worker-class uvicorn.workers.UvicornWorker app:app --timeout 600
2024-09-05T18:02:35.506113971Z Launching oryx with: create-script -appPath /home/site/wwwroot -output /opt/startup/startup.sh -virtualEnvName antenv -defaultApp /opt/defaultsite -userStartupCommand 'gunicorn --workers 2 --worker-class uvicorn.workers.UvicornWorker app:app --timeout 600'
2024-09-05T18:02:35.517056853Z Could not find build manifest file at '/home/site/wwwroot/oryx-manifest.toml'
2024-09-05T18:02:35.517086027Z Could not find operation ID in manifest. Generating an operation id...
2024-09-05T18:02:35.517094493Z Build Operation ID: 2c99f912-e2a5-43ee-8dbe-141ce3fe47ce
2024-09-05T18:02:35.565719567Z Oryx Version: 0.2.20240619.2, Commit: cf006407a02b225f59dccd677986973c7889aa50, ReleaseTagName: 20240619.2
2024-09-05T18:02:35.573890771Z Writing output script to '/opt/startup/startup.sh'
2024-09-05T18:02:35.593139032Z WARNING: Could not find virtual environment directory /home/site/wwwroot/antenv.
2024-09-05T18:02:35.593165151Z WARNING: Could not find package directory /home/site/wwwroot/__oryx_packages.
2024-09-05T18:02:35.730630553Z
2024-09-05T18:02:35.730666480Z Error: class uri 'uvicorn.workers.UvicornWorker' invalid or not found:
2024-09-05T18:02:35.730673704Z
2024-09-05T18:02:35.730678924Z [Traceback (most recent call last):
2024-09-05T18:02:35.730684394Z   File "/opt/python/3.10.14/lib/python3.10/site-packages/gunicorn/util.py", line 111, in load_class
2024-09-05T18:02:35.730689704Z     mod = importlib.import_module('.'.join(components))
2024-09-05T18:02:35.730694623Z   File "/opt/python/3.10.14/lib/python3.10/importlib/__init__.py", line 126, in import_module
2024-09-05T18:02:35.730699682Z     return _bootstrap._gcd_import(name[level:], package, level)
2024-09-05T18:02:35.730704601Z   File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
2024-09-05T18:02:35.730711554Z   File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
2024-09-05T18:02:35.730736601Z   File "<frozen importlib._bootstrap>", line 992, in _find_and_load_unlocked
2024-09-05T18:02:35.730745889Z   File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
2024-09-05T18:02:35.730751239Z   File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
2024-09-05T18:02:35.730756208Z   File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
2024-09-05T18:02:35.730762650Z   File "<frozen importlib._bootstrap>", line 1004, in _find_and_load_unlocked
2024-09-05T18:02:35.730771557Z ModuleNotFoundError: No module named 'uvicorn'
2024-09-05T18:02:35.730780193Z ]
2024-09-05T18:02:35.730787066Z

2024-09-05T18:02:35.309Z INFO - Pulling image: mcr.microsoft.com/appsvc/middleware:stage5
2024-09-05T18:02:35.393Z INFO - stage5 Pulling from appsvc/middleware
2024-09-05T18:02:35.398Z INFO - Digest: sha256:8e9607c570710da32f77ba51a39b1ee1ddf3da19ba49f081b7508d6827cfa10a
2024-09-05T18:02:35.401Z INFO - Status: Image is up to date for mcr.microsoft.com/appsvc/middleware:stage5
2024-09-05T18:02:35.405Z INFO - Pull Image successful, Time taken: 0 Seconds
2024-09-05T18:02:35.425Z INFO - Starting container for site
2024-09-05T18:02:35.426Z INFO - docker run -d --expose=8181 --name infoasst-web-xnxyp_0_73d6d783_middleware -e WEBSITE_CORS_ALLOWED_ORIGINS=https://portal.azure.com,https://ms.portal.azure.com -e WEBSITE_CORS_SUPPORT_CREDENTIALS=False -e WEBSITE_SITE_NAME=infoasst-web-xnxyp -e WEBSITE_AUTH_ENABLED=True -e WEBSITE_ROLE_INSTANCE_ID=0 -e WEBSITE_HOSTNAME=infoasst-web-xnxyp.azurewebsites.net -e WEBSITE_INSTANCE_ID=eb4f0d80454269e4f9939b3b20419d3b71fd73c0f173f47195fd01a4c4897f31 -e HTTP_LOGGING_ENABLED=1 mcr.microsoft.com/appsvc/middleware:stage5 /Host.ListenUrl=http://0.0.0.0:8181 /Host.D
2024-09-05T18:02:35.636Z INFO - Initiating warmup request to container infoasst-web-xnxyp_0_73d6d783_msiProxy for site infoasst-web-xnxyp
2024-09-05T18:02:35.643Z INFO - Container infoasst-web-xnxyp_0_73d6d783_msiProxy for site infoasst-web-xnxyp initialized successfully and is ready to serve requests.
2024-09-05T18:02:35.644Z INFO - Initiating warmup request to container infoasst-web-xnxyp_0_73d6d783 for site infoasst-web-xnxyp
2024-09-05T18:02:36.653Z ERROR - Container infoasst-web-xnxyp_0_73d6d783 for site infoasst-web-xnxyp has exited, failing site start
2024-09-05T18:02:36.654Z INFO - Initiating warmup request to container infoasst-web-xnxyp_0_73d6d783_middleware for site infoasst-web-xnxyp
2024-09-05T18:02:36.759Z INFO - Container infoasst-web-xnxyp_0_73d6d783_middleware for site infoasst-web-xnxyp initialized successfully and is ready to serve requests.
2024-09-05T18:02:36.763Z ERROR - Container infoasst-web-xnxyp_0_73d6d783 didn't respond to HTTP pings on port: 8000, failing site start. See container logs for debugging.
2024-09-05T18:02:36.767Z INFO - Stopping site infoasst-web-xnxyp because it failed during startup.
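
One possible reading of the startup logs above: the Oryx build output (the manifest and the antenv virtual environment) never reached /home/site/wwwroot, so gunicorn ran against the image's base Python, which has no uvicorn installed, and the container exited. The sketch below scans saved log text for those signatures. The labels and the `diagnose` helper are hypothetical, and the patterns cover only the messages quoted in this thread.

```python
import re

# (label, pattern) pairs; each pattern captures the relevant detail.
SIGNATURES = [
    ("missing_build_manifest",
     re.compile(r"Could not find build manifest file at '(?P<detail>[^']+)'")),
    ("missing_virtualenv",
     re.compile(r"Could not find virtual environment directory (?P<detail>/[\w/.-]*\w)")),
    ("missing_module",
     re.compile(r"ModuleNotFoundError: No module named '(?P<detail>[^']+)'")),
    ("container_exited",
     re.compile(r"Container (?P<detail>\S+) for site \S+ has exited")),
]


def diagnose(log_text: str) -> list[tuple[str, str]]:
    """Return (label, detail) pairs for each known failure signature found."""
    findings = []
    for label, pattern in SIGNATURES:
        for match in pattern.finditer(log_text):
            pair = (label, match["detail"])
            if pair not in findings:  # skip repeated log lines
                findings.append(pair)
    return findings


startup_log = """\
2024-09-05T18:02:35.517056853Z Could not find build manifest file at '/home/site/wwwroot/oryx-manifest.toml'
2024-09-05T18:02:35.593139032Z WARNING: Could not find virtual environment directory /home/site/wwwroot/antenv.
2024-09-05T18:02:35.730771557Z ModuleNotFoundError: No module named 'uvicorn'
2024-09-05T18:02:36.653Z ERROR - Container infoasst-web-xnxyp_0_73d6d783 for site infoasst-web-xnxyp has exited, failing site start
"""

for label, detail in diagnose(startup_log):
    print(f"{label}: {detail}")
```

Seeing `missing_virtualenv` together with `missing_module` would point at the build step rather than at the application code, which is consistent with the hung Oryx build reported earlier in this issue.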