Closed cfregly closed 1 year ago
Have you checked this log?
CloudWatch > Log groups > /aws/lambda/sb-\
2022-10-21 05:41:37.466 7f66cf87-2ee9-4477-986f-be1892eb906a INFO TenantService - Handling Tenant Onboarding Status Changed
2022-10-21 05:41:37.466 7f66cf87-2ee9-4477-986f-be1892eb906a INFO TenantServiceDAL - TenantServiceDAL::getTenant ab59241d-c212-4543-ae6a-791f84e80d04
2022-10-21 05:41:37.549 7f66cf87-2ee9-4477-986f-be1892eb906a INFO TenantServiceDAL - TenantServiceDAL::getTenant exec 83
2022-10-21 05:41:37.549 7f66cf87-2ee9-4477-986f-be1892eb906a INFO TenantService - Updating tenant ab59241d-c212-4543-ae6a-791f84e80d04 onboarding status from deploying to deployed
2022-10-21 05:41:37.549 7f66cf87-2ee9-4477-986f-be1892eb906a INFO TenantServiceDAL - TenantServiceDAL::updateTenantOnboarding ab59241d-c212-4543-ae6a-791f84e80d04 deployed
2022-10-21 05:41:37.570 7f66cf87-2ee9-4477-986f-be1892eb906a INFO TenantServiceDAL - TenantServiceDAL::updateTenantOnboarding exec 21
END RequestId: 7f66cf87-2ee9-4477-986f-be1892eb906a
REPORT RequestId: 7f66cf87-2ee9-4477-986f-be1892eb906a Duration: 125.05 ms Billed Duration: 126 ms Memory Size: 512 MB Max Memory Used: 171 MB
Whenever a tenant's onboarding status changes, it is recorded in this log. If you haven't checked it yet, it's worth a look.
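If the log group is large, it may help to pull out just the status transitions. Here is a minimal sketch that assumes the lines look exactly like the excerpt above (timestamp, request ID, `INFO TenantService - Updating tenant <id> onboarding status from <old> to <new>`) and that each status is a single word. The function name `onboarding_transitions` is made up for illustration and is not part of SaaS Boost.

```python
import re

# Pattern modeled on the "Updating tenant ..." line in the log excerpt above.
# Assumption: status values are single words (e.g. "deploying", "deployed").
TRANSITION = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<request_id>[0-9a-f-]{36}) INFO TenantService - "
    r"Updating tenant (?P<tenant_id>[0-9a-f-]{36}) onboarding status "
    r"from (?P<old_status>\S+) to (?P<new_status>\S+)$"
)


def onboarding_transitions(lines):
    """Return one dict per status-transition line; all other lines are skipped."""
    transitions = []
    for line in lines:
        match = TRANSITION.match(line.strip())
        if match:
            transitions.append(match.groupdict())
    return transitions


if __name__ == "__main__":
    sample = [
        "2022-10-21 05:41:37.549 7f66cf87-2ee9-4477-986f-be1892eb906a INFO "
        "TenantService - Updating tenant ab59241d-c212-4543-ae6a-791f84e80d04 "
        "onboarding status from deploying to deployed",
    ]
    for t in onboarding_transitions(sample):
        print(t["timestamp"], t["tenant_id"], t["old_status"], "->", t["new_status"])
```

Running this over an export of the log group and filtering for `new_status == "failed"` should show when each tenant was moved to failed and which request did it.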
Hey @cfregly, did you get a chance to investigate why your tenants moved to failed?
I haven't found anything specific, no. I did notice that ECS is showing "In Progress..." even though it's stable. I'm kicking ECS again with a fresh Docker build. Maybe that will fix the status in the SaaS Boost UI.
When you say ECS is showing In Progress... what do you mean? That the Service(s) status is changed or that the Task status under the service has changed? This may be due to your tasks flapping. Are you sure tasks aren't shutting down and being relaunched by ECS? If you look at the tasks for a service, do you only see 2 (the original task def that CloudFormation created and the task def that replaced it when initial workload deployment happened) or do you see many? Are you pushing new images to the application's ECR repo? Does every CodePipeline succeed? If you look at the pipeline history for your tenants do you see any failures?
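One rough way to check for flapping from a script rather than the console is sketched below. It assumes you have boto3 and credentials set up, and that you pass in your own cluster and service names. `summarize_tasks` and `fetch_stopped_tasks` are hypothetical helper names; the only AWS calls used are the ECS `list_tasks` and `describe_tasks` APIs. Keep in mind that ECS only keeps stopped tasks visible for a limited time, so an empty result does not rule out flapping earlier on.

```python
from collections import Counter


def summarize_tasks(tasks):
    """Summarize task dicts shaped like the 'tasks' list from ECS DescribeTasks."""
    stopped = [t for t in tasks if t.get("lastStatus") == "STOPPED"]
    return {
        # How many tasks were launched from each task definition revision.
        "task_definitions": dict(Counter(t["taskDefinitionArn"] for t in tasks)),
        "stopped_count": len(stopped),
        # Why ECS stopped them, if a reason was recorded.
        "stopped_reasons": dict(
            Counter(t.get("stoppedReason", "unknown") for t in stopped)
        ),
    }


def fetch_stopped_tasks(cluster, service):
    """Fetch recently stopped tasks for one service (needs boto3 and credentials)."""
    import boto3  # imported here so summarize_tasks works without boto3 installed

    ecs = boto3.client("ecs")
    arns = ecs.list_tasks(
        cluster=cluster, serviceName=service, desiredStatus="STOPPED"
    )["taskArns"]
    if not arns:
        return []  # describe_tasks rejects an empty task list
    return ecs.describe_tasks(cluster=cluster, tasks=arns)["tasks"]


# Example (cluster and service names are placeholders you would replace):
# print(summarize_tasks(fetch_stopped_tasks("my-tenant-cluster", "my-service")))
```

If `stopped_count` keeps growing between runs, or `stopped_reasons` shows containers exiting or failing health checks, the tasks are probably being relaunched by ECS, which would fit the "In Progress..." status.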
Closing due to inactivity. Please reopen if the problem is reproducible.
All of a sudden, some of my active tenants appear to be Failed, even though they are active and working properly.
CloudFormation shows no issues - and the tenant has been working for 1-2 months.
This happened to me a few months ago, but I ignored it and rebuilt the environment.
I am in this state currently. I will preserve the environment to help gather more info. Just let me know what you need and I'll copy/paste here.
The state of the system seems to be messed up. Not sure how things got this way.
Thanks!