Open tdeheurles opened 10 months ago
⚠️ Here is procedure A, which fails. ⚠️
Note that:
❯ rad deploy application.bicep
Building application.bicep...
Deploying template 'application.bicep' into environment 'demo5env' from workspace 'demo5workspace'...
Deployment In Progress...
Deployment Complete
Resources:
myapp Applications.Core/applications
❯ k get ns
NAME STATUS AGE
...
demo5env-myapp Active 26s
...
❯ k get all -n demo5env-myapp
No resources found in demo5env-myapp namespace.
❯ rad deploy backend.bicep
Building backend.bicep...
Deploying template 'backend.bicep' into environment 'demo5env' from workspace 'demo5workspace'...
Deployment In Progress...
.. backend Applications.Core/containers
Error: {
"code": "DeploymentFailed",
"message": "At least one resource deployment operation failed. Please see the details for the specific operation that failed.",
"target": "/planes/radius/local/resourceGroups/demo5group/providers/Microsoft.Resources/deployments/rad-deploy-314d5b2b-5aa1-43b6-b223-728e3d437f3c",
"details": [
{
"code": "ResourceDeploymentFailure",
"message": "Failed",
"target": "/planes/radius/local/resourceGroups/demo5group/providers/Applications.Core/containers/backend",
"details": [
{
"code": "Internal",
"message": "Container state is 'Terminated' Reason: Error, Message: "
}
]
}
]
}
TraceId: 3254e16c30a69614eb65de701d793ded
--> the deployment fails because the container code is failing
❯ rad app connections myapp
Displaying application: myapp
Name: backend (Applications.Core/containers)
Connections: (none)
Resources: (none)
❯ k get all -n demo5env-myapp
NAME READY STATUS RESTARTS AGE
pod/backend-c565568bc-5s7gc 0/1 CrashLoopBackOff 4 (12s ago) 92s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/backend 0/1 1 0 92s
NAME DESIRED CURRENT READY AGE
replicaset.apps/backend-c565568bc 1 1 0 92s
❯ rad app delete myapp
Application myapp deleted
❯ k get ns
NAME STATUS AGE
default Active 5d23h
default-application Active 45h
default-rad Active 45h
demo5env-myapp Active 3m3s
demo6env Active 20h
kube-node-lease Active 5d23h
kube-public Active 5d23h
kube-system Active 5d23h
radius-system Active 4d22h
❯ k get all -n demo5env-myapp
NAME READY STATUS RESTARTS AGE
pod/backend-c565568bc-5s7gc 0/1 CrashLoopBackOff 4 (51s ago) 2m11s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/backend 0/1 1 0 2m11s
NAME DESIRED CURRENT READY AGE
replicaset.apps/backend-c565568bc 1 1 0 2m11s
✅ Here is procedure B, which succeeds. ✅
I moved the container resource into the same Bicep file as the application.
Note:
❯ k delete ns demo5env-myapp
namespace "demo5env-myapp" deleted
❯ rad app show myapp -o json
The application "myapp" was not found or has been deleted.
❯ rad deploy application.bicep
Building application.bicep...
Deploying template 'application.bicep' into environment 'demo5env' from workspace 'demo5workspace'...
Deployment In Progress...
Completed myapp Applications.Core/applications
.. backend Applications.Core/containers
Deployment Complete
Resources:
myapp Applications.Core/applications
backend Applications.Core/containers
❯ rad app connections myapp
Displaying application: myapp
Name: backend (Applications.Core/containers)
Connections: (none)
Resources:
backend (apps/Deployment)
backend (core/Service)
backend (core/ServiceAccount)
backend (rbac.authorization.k8s.io/Role)
backend (rbac.authorization.k8s.io/RoleBinding)
❯ k get all -n demo5env-myapp
NAME READY STATUS RESTARTS AGE
pod/backend-c565568bc-2fmv4 0/1 Error 3 (29s ago) 47s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/backend ClusterIP 10.100.88.16 <none> 8080/TCP 46s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/backend 0/1 1 0 47s
NAME DESIRED CURRENT READY AGE
replicaset.apps/backend-c565568bc 1 1 0 47s
❯ rad app delete myapp
Application myapp deleted
❯ k get ns
NAME STATUS AGE
...
demo5env-myapp Active 102s
...
❯ k get all -n demo5env-myapp
No resources found in demo5env-myapp namespace.
:+1: We've reviewed this issue and have agreed to add it to our backlog. Please subscribe to this issue for notifications, we'll provide updates when we pick it up.
@vishwahiremat, do you mean the "link" is not done as the "deployment of the resource" is failing when deployed separately?
Yes, and I have also seen this issue when the application and the container are deployed together. I believe it's not related to deploying the resources separately.
I deployed a Bicep file with an invalid image for the container. The deployment fails as expected.
nithya@Nithyas-MacBook-Pro ~ % rad deploy ~/Desktop/a.bicep --parameters magpieimage=ghcr.io/radius-project/magpiegoo:latest
Building /Users/nithya/Desktop/a.bicep...
Deploying template '/Users/nithya/Desktop/a.bicep' for application 'nithya' and environment 'default' from workspace 'default'...
Deployment In Progress...
Completed corerp-resources-gateway Applications.Core/applications
Completed http-gtwy-back-rte Applications.Core/httpRoutes
... http-gtwy-front-ctnr Applications.Core/containers
... http-gtwy-back-ctnr Applications.Core/containers
Error: {
"code": "DeploymentFailed",
"message": "At least one resource deployment operation failed. Please see the details for the specific operation that failed.",
"target": "/planes/radius/local/resourceGroups/default/providers/Microsoft.Resources/deployments/rad-deploy-534a0c1d-8638-41e1-94d1-5657c0bae2b8",
"details": [
{
"code": "ResourceDeploymentFailure",
"message": "Failed",
"target": "/planes/radius/local/resourceGroups/default/providers/Applications.Core/containers/http-gtwy-front-ctnr",
"details": [
{
"code": "Internal",
"message": "Container state is 'Waiting' Reason: ErrImagePull, Message: rpc error: code = Unknown desc = failed to pull and unpack image \"ghcr.io/radius-project/magpiegoo:latest\": failed to resolve reference \"ghcr.io/radius-project/magpiegoo:latest\": failed to authorize: failed to fetch anonymous token: unexpected status: 403 Forbidden"
}
]
},
{
"code": "OK",
"message": "",
"target": "/planes/radius/local/resourceGroups/default/providers/Applications.Core/applications/corerp-resources-gateway"
},
{
"code": "OK",
"message": "",
"target": "/planes/radius/local/resourceGroups/default/providers/Applications.Core/httpRoutes/http-gtwy-back-rte"
},
{
"code": "ResourceDeploymentFailure",
"message": "Failed",
"target": "/planes/radius/local/resourceGroups/default/providers/Applications.Core/containers/http-gtwy-back-ctnr",
"details": [
{
"code": "Internal",
"message": "Container state is 'Waiting' Reason: ErrImagePull, Message: rpc error: code = Unknown desc = failed to pull and unpack image \"ghcr.io/radius-project/magpiegoo:latest\": failed to resolve reference \"ghcr.io/radius-project/magpiegoo:latest\": failed to authorize: failed to fetch anonymous token: unexpected status: 403 Forbidden"
}
]
}
]
}
TraceId: 3fdfe67f6f20198aff86743f7993239f
nithya@Nithyas-MacBook-Pro ~ % k get all -n default-corerp-resources-gateway
NAME READY STATUS RESTARTS AGE
pod/http-gtwy-back-ctnr-86b6cf5b8c-fmnp2 1/1 Running 1 (4d8h ago) 13d
pod/http-gtwy-back-ctnr-c498f6457-l69wr 0/1 ImagePullBackOff 0 3m51s
pod/http-gtwy-front-ctnr-67c8bfcb56-vrwmk 0/1 ImagePullBackOff 0 3m51s
pod/http-gtwy-front-ctnr-d84df4977-srzm5 1/1 Running 1 (4d8h ago) 13d
nithya@Nithyas-MacBook-Pro ~ % rad app delete corerp-resources-gateway
Application corerp-resources-gateway deleted
nithya@Nithyas-MacBook-Pro ~ % k get all -n default-corerp-resources-gateway
No resources found in default-corerp-resources-gateway namespace.
I cannot reproduce the issue when the resources are deployed together. My CLI version is:
nithya@Nithyas-MacBook-Pro ~ % rad version
RELEASE VERSION BICEP COMMIT
0.29.0 v0.29.0 0.29.0 6abd7bfc3de0e748a2c34b721d95097afb6a2bba
I will try deploying resources separately and get back with an update.
I think this is related to the deployment failure rather than to the app definition spanning multiple files.
I am sharing the logs of interest, comparing the case where I deploy an app with a container pointing to an invalid image against the case of an app with a valid container definition.
Logs from Application RP during rad app delete on an application whose deployment failed (due to an invalid image in the container spec):
2024-01-25T15:22:12.048-0800 INFO radius.radiusasyncworker worker/worker.go:252 Start processing operation. {"serviceName": "radius", "version": "edge", "hostName": "Nithyas-MacBook-Pro.local", "resourceId": "/planes/radius/local/resourcegroups/randomgroup/providers/Applications.Core/containers/allinonecontainer", "operationId": "f4d4e41b-790b-47c0-972e-0721939b1582", "operationType": "APPLICATIONS.CORE/CONTAINERS|DELETE", "dequeueCount": 1, "traceId": "1e9e7db9db2c0263a8690f7d8be5eb76", "spanId": "742538235db79936"}
2024-01-25T15:22:12.065-0800 INFO radius.radiusasyncworker worker/worker.go:260 Operation returned {"serviceName": "radius", "version": "edge", "hostName": "Nithyas-MacBook-Pro.local", "resourceId": "/planes/radius/local/resourcegroups/randomgroup/providers/Applications.Core/containers/allinonecontainer", "operationId": "f4d4e41b-790b-47c0-972e-0721939b1582", "operationType": "APPLICATIONS.CORE/CONTAINERS|DELETE", "dequeueCount": 1, "traceId": "1e9e7db9db2c0263a8690f7d8be5eb76", "spanId": "742538235db79936", "success": true, "code": "", "provisioningState": "Succeeded", "err": null}
2024-01-25T15:22:12.070-0800 INFO radius.radiusasyncworker worker/worker.go:360 failed to update the provisioningState in resource because it no longer exists. {"serviceName": "radius", "version": "edge", "hostName": "Nithyas-MacBook-Pro.local", "resourceId": "/planes/radius/local/resourcegroups/randomgroup/providers/Applications.Core/containers/allinonecontainer", "operationId": "f4d4e41b-790b-47c0-972e-0721939b1582", "operationType": "APPLICATIONS.CORE/CONTAINERS|DELETE", "dequeueCount": 1, "traceId": "1e9e7db9db2c0263a8690f7d8be5eb76", "spanId": "742538235db79936"}
Logs from Application RP during rad app delete on an application whose deployment was successful:
2024-01-25T15:30:08.301-0800 INFO radius.radiusasyncworker worker/worker.go:252 Start processing operation. {"serviceName": "radius", "version": "edge", "hostName": "Nithyas-MacBook-Pro.local", "resourceId": "/planes/radius/local/resourcegroups/randomgroup/providers/Applications.Core/containers/allinonecontainer", "operationId": "bfe96eaf-be67-4e90-b42b-6ed303658a98", "operationType": "APPLICATIONS.CORE/CONTAINERS|DELETE", "dequeueCount": 1, "traceId": "929cdac227dad85290c14b398028ada2", "spanId": "926b421d1b3df760"}
2024-01-25T15:30:08.306-0800 INFO radius.radiusasyncworker deployment/deploymentprocessor.go:341 Deleting output resource: LocalID: Service, resource type: "Provider: kubernetes, Type: core/Service"
{"serviceName": "radius", "version": "edge", "hostName": "Nithyas-MacBook-Pro.local", "resourceId": "/planes/radius/local/resourcegroups/randomgroup/providers/Applications.Core/containers/allinonecontainer", "operationId": "bfe96eaf-be67-4e90-b42b-6ed303658a98", "operationType": "APPLICATIONS.CORE/CONTAINERS|DELETE", "dequeueCount": 1, "traceId": "929cdac227dad85290c14b398028ada2", "spanId": "926b421d1b3df760"}
2024-01-25T15:30:08.341-0800 INFO radius.radiusasyncworker deployment/deploymentprocessor.go:341 Deleting output resource: LocalID: Deployment, resource type: "Provider: kubernetes, Type: apps/Deployment"
{"serviceName": "radius", "version": "edge", "hostName": "Nithyas-MacBook-Pro.local", "resourceId": "/planes/radius/local/resourcegroups/randomgroup/providers/Applications.Core/containers/allinonecontainer", "operationId": "bfe96eaf-be67-4e90-b42b-6ed303658a98", "operationType": "APPLICATIONS.CORE/CONTAINERS|DELETE", "dequeueCount": 1, "traceId": "929cdac227dad85290c14b398028ada2", "spanId": "926b421d1b3df760"}
2024-01-25T15:30:08.357-0800 INFO radius.radiusapi rest/results.go:75 responding with status code: 200 {"serviceName": "radius", "version": "edge", "hostName": "Nithyas-MacBook-Pro.local", "resourceId": "/planes/radius/local/resourcegroups/randomgroup/providers/applications.core/containers/allinonecontainer", "traceId": "929cdac227dad85290c14b398028ada2", "spanId": "e1011fc68f9c554a", "statusCode": 200}
2024-01-25T15:30:08.376-0800 INFO radius.radiusasyncworker deployment/deploymentprocessor.go:341 Deleting output resource: LocalID: KubernetesRoleBinding, resource type: "Provider: kubernetes, Type: rbac.authorization.k8s.io/RoleBinding"
{"serviceName": "radius", "version": "edge", "hostName": "Nithyas-MacBook-Pro.local", "resourceId": "/planes/radius/local/resourcegroups/randomgroup/providers/Applications.Core/containers/allinonecontainer", "operationId": "bfe96eaf-be67-4e90-b42b-6ed303658a98", "operationType": "APPLICATIONS.CORE/CONTAINERS|DELETE", "dequeueCount": 1, "traceId": "929cdac227dad85290c14b398028ada2", "spanId": "926b421d1b3df760"}
2024-01-25T15:30:08.409-0800 INFO radius.radiusasyncworker deployment/deploymentprocessor.go:341 Deleting output resource: LocalID: KubernetesRole, resource type: "Provider: kubernetes, Type: rbac.authorization.k8s.io/Role"
{"serviceName": "radius", "version": "edge", "hostName": "Nithyas-MacBook-Pro.local", "resourceId": "/planes/radius/local/resourcegroups/randomgroup/providers/Applications.Core/containers/allinonecontainer", "operationId": "bfe96eaf-be67-4e90-b42b-6ed303658a98", "operationType": "APPLICATIONS.CORE/CONTAINERS|DELETE", "dequeueCount": 1, "traceId": "929cdac227dad85290c14b398028ada2", "spanId": "926b421d1b3df760"}
2024-01-25T15:30:08.448-0800 INFO radius.radiusasyncworker deployment/deploymentprocessor.go:341 Deleting output resource: LocalID: ServiceAccount, resource type: "Provider: kubernetes, Type: core/ServiceAccount"
{"serviceName": "radius", "version": "edge", "hostName": "Nithyas-MacBook-Pro.local", "resourceId": "/planes/radius/local/resourcegroups/randomgroup/providers/Applications.Core/containers/allinonecontainer", "operationId": "bfe96eaf-be67-4e90-b42b-6ed303658a98", "operationType": "APPLICATIONS.CORE/CONTAINERS|DELETE", "dequeueCount": 1, "traceId": "929cdac227dad85290c14b398028ada2", "spanId": "926b421d1b3df760"}
2024-01-25T15:30:08.507-0800 INFO radius.radiusasyncworker worker/worker.go:260 Operation returned {"serviceName": "radius", "version": "edge", "hostName": "Nithyas-MacBook-Pro.local", "resourceId": "/planes/radius/local/resourcegroups/randomgroup/providers/Applications.Core/containers/allinonecontainer", "operationId": "bfe96eaf-be67-4e90-b42b-6ed303658a98", "operationType": "APPLICATIONS.CORE/CONTAINERS|DELETE", "dequeueCount": 1, "traceId": "929cdac227dad85290c14b398028ada2", "spanId": "926b421d1b3df760", "success": true, "code": "", "provisioningState": "Succeeded", "err": null}
2024-01-25T15:30:08.512-0800 INFO radius.radiusasyncworker worker/worker.go:360 failed to update the provisioningState in resource because it no longer exists. {"serviceName": "radius", "version": "edge", "hostName": "Nithyas-MacBook-Pro.local", "resourceId": "/planes/radius/local/resourcegroups/randomgroup/providers/Applications.Core/containers/allinonecontainer", "operationId": "bfe96eaf-be67-4e90-b42b-6ed303658a98", "operationType": "APPLICATIONS.CORE/CONTAINERS|DELETE", "dequeueCount": 1, "traceId": "929cdac227dad85290c14b398028ada2", "spanId": "926b421d1b3df760"}
The group of "Deleting output resource" entries (the middle block) in the successful case is missing from the case where the deployment failed.
I think we populate the Radius outputResources only when the deployment has succeeded (https://github.com/project-radius/radius/blob/68c47eaac742be051a59ead00683bb48426d345f/pkg/corerp/backend/deployment/deploymentprocessor.go) and are therefore unable to clean up the outputResources when the deployment has failed. I will look further into the code.
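To make the hypothesis above concrete, here is a minimal sketch of the suspected behavior. It is a toy model, not the Radius code: `ContainerRecord`, `deploy`, `delete`, and the `record_on_failure` switch are all hypothetical names, and the resource IDs are shortened for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ContainerRecord:
    """Hypothetical stand-in for the stored container resource."""
    name: str
    provisioning_state: str = "Provisioning"
    output_resources: List[str] = field(default_factory=list)


def deploy(record: ContainerRecord, cluster: Set[str], created: List[str],
           healthy: bool, record_on_failure: bool = False) -> None:
    """Create the underlying objects, then persist their IDs on the record.

    record_on_failure=False models the suspected current behavior;
    record_on_failure=True models one possible fix (an assumption).
    """
    cluster.update(created)  # the objects exist in the cluster either way
    if healthy:
        record.provisioning_state = "Succeeded"
        record.output_resources = list(created)
    else:
        record.provisioning_state = "Failed"
        if record_on_failure:
            record.output_resources = list(created)


def delete(record: ContainerRecord, cluster: Set[str]) -> None:
    """Delete only the objects that the record knows about."""
    for resource_id in record.output_resources:
        cluster.discard(resource_id)
    record.output_resources = []


def orphans_after_delete(healthy: bool, record_on_failure: bool = False) -> List[str]:
    """Deploy, delete, and return whatever is left behind in the cluster."""
    cluster: Set[str] = set()
    record = ContainerRecord("allinonecontainer")
    created = ["apps/Deployment/allinonecontainer",
               "core/ServiceAccount/allinonecontainer"]
    deploy(record, cluster, created, healthy, record_on_failure)
    delete(record, cluster)
    return sorted(cluster)
```

In this model, `orphans_after_delete(healthy=False)` leaves both objects behind, while `orphans_after_delete(healthy=True)` and `orphans_after_delete(healthy=False, record_on_failure=True)` leave nothing. That matches the two log excerpts: the delete operation reports success in both cases, but only the successful deployment has output resources to iterate over.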
Deployment failed:
nithya@Nithyas-MacBook-Pro bug7052 % rad resource list containers -a allinoneapp -o json
[
{
"id": "/planes/radius/local/resourcegroups/randomgroup/providers/Applications.Core/containers/allinonecontainer",
"location": "global",
"name": "allinonecontainer",
"properties": {
"application": "/planes/radius/local/resourcegroups/randomgroup/providers/Applications.Core/applications/allinoneapp",
"connections": {},
"container": {
"args": [],
"command": [],
"env": {},
"image": "ghcr.io/radius-project/magpiegoo:latest",
"ports": {
"web": {
"containerPort": 3000,
"protocol": "TCP",
"provides": ""
}
},
"workingDir": ""
},
"provisioningState": "Failed",
"resourceProvisioning": "internal",
"resources": [],
"status": {}
},
"systemData": {
"createdAt": "0001-01-01T00:00:00Z",
"createdBy": "",
"createdByType": "",
"lastModifiedAt": "0001-01-01T00:00:00Z",
"lastModifiedBy": "",
"lastModifiedByType": ""
},
"tags": {},
"type": "Applications.Core/containers"
}
]
Deployment succeeded:
nithya@Nithyas-MacBook-Pro bug7052 % rad resource list containers -a allinoneapp -o json
[
{
"id": "/planes/radius/local/resourcegroups/randomgroup/providers/Applications.Core/containers/allinonecontainer",
"location": "global",
"name": "allinonecontainer",
"properties": {
"application": "/planes/radius/local/resourcegroups/randomgroup/providers/Applications.Core/applications/allinoneapp",
"connections": {},
"container": {
"args": [],
"command": [],
"env": {},
"image": "ghcr.io/radius-project/magpiego:latest",
"ports": {
"web": {
"containerPort": 3000,
"port": 3000,
"protocol": "TCP",
"provides": ""
}
},
"workingDir": ""
},
"provisioningState": "Succeeded",
"resourceProvisioning": "internal",
"resources": [],
"status": {
"outputResources": [
{
"id": "/planes/kubernetes/local/namespaces/allinoneenv-allinoneapp/providers/core/ServiceAccount/allinonecontainer",
"localId": "ServiceAccount"
},
{
"id": "/planes/kubernetes/local/namespaces/allinoneenv-allinoneapp/providers/rbac.authorization.k8s.io/Role/allinonecontainer",
"localId": "KubernetesRole"
},
{
"id": "/planes/kubernetes/local/namespaces/allinoneenv-allinoneapp/providers/rbac.authorization.k8s.io/RoleBinding/allinonecontainer",
"localId": "KubernetesRoleBinding"
},
{
"id": "/planes/kubernetes/local/namespaces/allinoneenv-allinoneapp/providers/apps/Deployment/allinonecontainer",
"localId": "Deployment"
},
{
"id": "/planes/kubernetes/local/namespaces/allinoneenv-allinoneapp/providers/core/Service/allinonecontainer",
"localId": "Service"
}
]
}
},
"systemData": {
"createdAt": "0001-01-01T00:00:00Z",
"createdBy": "",
"createdByType": "",
"lastModifiedAt": "0001-01-01T00:00:00Z",
"lastModifiedBy": "",
"lastModifiedByType": ""
},
"tags": {},
"type": "Applications.Core/containers"
}
]
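The difference between the two outputs is the empty "status" in the failed case. As a rough sketch, the JSON printed by rad resource list containers -a allinoneapp -o json could be scanned for containers in that state. The function name `containers_without_outputs` is hypothetical, and the field names are taken only from the output shown above.

```python
import json
from typing import List


def containers_without_outputs(rad_json: str) -> List[str]:
    """Return names of Failed containers that record no output resources."""
    flagged = []
    for resource in json.loads(rad_json):
        props = resource.get("properties", {})
        # "status" is {} in the failed case, so default both lookups.
        outputs = props.get("status", {}).get("outputResources", [])
        if props.get("provisioningState") == "Failed" and not outputs:
            flagged.append(resource["name"])
    return flagged
```

Any container this returns is a candidate for leaving Kubernetes objects behind when its application is deleted, assuming the hypothesis about outputResources holds.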
Steps to reproduce
ℹ️ This issue continues a Discord forum discussion with @vishwahiremat.
I created an environment, an application, and a container in 3 different files, then applied them one by one. Finally, I decided I didn't want the application and container anymore, so I ran rad application delete myapp. The application appears to be removed, but the container that was part of it is not.
Here are some of the files:
Observed behavior
The application is deleted, but the Kubernetes Deployment is not.
Desired behavior
We should never have orphan resources.
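One way to check for orphans is to compare what is live in the application namespace with the output resource IDs that Radius has recorded. The sketch below assumes the ID shape seen in the rad resource list output in this thread (/planes/kubernetes/local/namespaces/<namespace>/providers/<group>/<Kind>/<name>). `K8sRef`, `parse_output_resource_id`, and `find_orphans` are hypothetical helpers, and gathering the live list (for example from kubectl) is left out.

```python
from typing import Iterable, List, NamedTuple


class K8sRef(NamedTuple):
    namespace: str
    kind: str  # group and kind, e.g. "apps/Deployment"
    name: str


def parse_output_resource_id(resource_id: str) -> K8sRef:
    """Split a recorded output resource ID into namespace, kind, and name."""
    parts = resource_id.strip("/").split("/")
    namespace = parts[parts.index("namespaces") + 1]
    tail = parts[parts.index("providers") + 1:]
    if len(tail) != 3:
        raise ValueError(f"unexpected output resource ID: {resource_id}")
    group, kind, name = tail
    return K8sRef(namespace, f"{group}/{kind}", name)


def find_orphans(live: Iterable[K8sRef], recorded_ids: Iterable[str]) -> List[K8sRef]:
    """Return live objects that no recorded output resource ID accounts for."""
    recorded = {parse_output_resource_id(rid) for rid in recorded_ids}
    return [ref for ref in live if ref not in recorded]
```

With an empty list of recorded IDs, which is what the failed deployment shows, every live object in the namespace is reported as an orphan.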
Workaround
To me, the problem seems to appear when we deploy the application and the resource with separate commands.
However, @vishwahiremat thinks this is not the case: in his view, the issue is linked to the failing container deployment.
I'm surprised because, if I understand correctly, when the application and the container are deployed together, the container still fails, yet deleting the application succeeds in deleting the service.
@vishwahiremat, do you mean the "link" is not done as the "deployment of the resource" is failing when deployed separately?
rad Version
Operating system
WSL2
AB#10954