Three times today the cloud-deployed web app ceased functioning:
- the browser got an "Oh no!" error page,
- the log indicated readiness probe failures, usually looking like
    [12:49:52] ! Streamlit server consistently failed status checks
    [12:49:52] ! Please fix the errors, push an update to the git repo, or reboot the app.
  but on one occasion preceded by
    [13:43:43] ! The service has encountered an error while checking the health of the Streamlit app: Get "http://localhost:8501/healthz": read tcp 10.12.171.54:46114->10.12.171.54:8501: read: connection reset by peer
- but no indication of having hit resource limits,
- and pushing an update went through successfully (as confirmed by a log entry) but did not revive the deployment.
(Only a reboot did, which rebuilds the "VM" from the ground up, pipenv dependency installation and all.)
In other words, the web server component on port 8501 repeatedly died mid-operation, and we do not know what (if anything) happened behind it in the Streamlit Python process or in the application script.
More investigation needed.
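
As a possible starting point for that investigation, here is a minimal sketch of a watchdog that polls the same health URL that appears in the log and prints a timestamped line whenever the result changes. It assumes the failure can be reproduced somewhere with shell access (for example, a local run of the app), which may not hold for the hosted deployment. The names `probe` and `watch` and the interval and timeout values are made up for illustration; only the URL comes from the log above.

```python
import http.client
import time
import urllib.error
import urllib.request
from typing import Optional, Tuple

# URL copied from the log line above; adjust if your setup differs.
HEALTH_URL = "http://localhost:8501/healthz"

# Ignore any configured HTTP proxy, since the probe targets a local port.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def probe(url: str = HEALTH_URL, timeout: float = 5.0) -> Tuple[bool, str]:
    """Request the health endpoint once and return (ok, detail)."""
    try:
        with _opener.open(url, timeout=timeout) as resp:
            body = resp.read(200).decode("utf-8", errors="replace").strip()
            return resp.status == 200, f"HTTP {resp.status}: {body}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return False, f"HTTP {exc.code}: {exc.reason}"
    except (urllib.error.URLError, http.client.HTTPException, OSError) as exc:
        # Connection refused, reset by peer, timeout, truncated reply, etc.
        return False, f"{type(exc).__name__}: {exc}"


def watch(url: str = HEALTH_URL, interval: float = 10.0,
          rounds: Optional[int] = None) -> None:
    """Poll the endpoint and print a line whenever the result changes."""
    last = None
    done = 0
    while rounds is None or done < rounds:
        result = probe(url)
        if result != last:
            ok, detail = result
            stamp = time.strftime("%H:%M:%S")
            print(f"[{stamp}] {'OK  ' if ok else 'FAIL'} {detail}", flush=True)
            last = result
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)
```

Calling `watch()` in a second terminal while the app runs would give an independent timeline of when port 8501 stops answering, which can then be compared against the application's own log output to see what the Python process was doing at that moment.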