Open MurzNN opened 9 months ago

Describe the bug
When I start pv-migrate, it creates the deployment, but in the debug log I keep seeing "Deployment is not ready" errors, while at the same time kubectl shows the deployment as ready. The log level is debug, and no additional messages are displayed. So, any ideas on what can cause this problem? How can I enable more verbose logging to understand what's happening and why it does not detect the ready status?

Console output
Here is the output of all resources related to the process, captured while I see the "Deployment is not ready" error:
$ kubectl -n korepov get all | grep pv-migrate
pod/pv-migrate-dbddb-src-sshd-cf79c787-d2nph 1/1 Running 0 18s
service/pv-migrate-dbddb-src-sshd NodePort 10.233.18.8 <none> 22:32148/TCP 20s
deployment.apps/pv-migrate-dbddb-src-sshd 1/1 1 1 19s
replicaset.apps/pv-migrate-dbddb-src-sshd-cf79c787 1 1 1 19s
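The same resources can also be watched live while pv-migrate runs, using the chart's label selector (the label is the one pv-migrate's chart sets; plain kubectl):
$ kubectl -n korepov get deployment,pod,service -w -l app.kubernetes.io/name=pv-migrate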
This looks like a bug, I'll have a look. You can get more info with --log-level=debug --log-format=json, but I'm not sure it's going to help here.
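For example (a sketch only; substitute your usual pv-migrate arguments for the placeholder, as the exact invocation depends on your version):
$ pv-migrate --log-level=debug --log-format=json <your-usual-arguments>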
Thanks! I already have --log-level=debug, and --log-format=json just adds more garbage to the output, but no new useful information ;)
Maybe you can explain how to debug this on my side? Then I will share more debugging information with you.
I had a look and noticed that this error comes from Helm's wait logic, not from our code. So I would try passing --skip-cleanup and troubleshooting it with the helm CLI, trying to find out why it does not report as ready. You can try:
helm ls -a
helm status <name-of-the-release>
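Roughly, the sequence would look like this (namespace and release name are placeholders; pv-migrate prints the release name in its logs):
$ pv-migrate --skip-cleanup <your-usual-arguments>
$ helm ls -a -n <namespace>
$ helm status <name-of-the-release> -n <namespace>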
Also, note that for lbsvc, Helm will wait for the created Service to actually get an external IP (i.e. not stay pending). This could be the problem.
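That state is easy to spot with a plain kubectl check; if the EXTERNAL-IP column shows <pending>, Helm will keep waiting (the service name here is a placeholder):
$ kubectl -n <namespace> get svc <name-of-the-release>-sshd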
Tested: even without --skip-cleanup it shows as deployed, while in the terminal I keep seeing these lines:
Deployment is not ready: korepov/pv-migrate-dcada-src-sshd. 0 out of 1 expected pods are ready
Here is the output of helm status:
$ helm status pv-migrate-dcada-src
NAME: pv-migrate-dcada-src
LAST DEPLOYED: Wed Dec 13 15:12:42 2023
NAMESPACE: korepov
STATUS: deployed
REVISION: 1
TEST SUITE: None
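For deeper inspection, the resources Helm tracks for the release can also be listed (--show-resources needs a reasonably recent Helm 3; helm get manifest works everywhere):
$ helm status pv-migrate-dcada-src --show-resources
$ helm get manifest pv-migrate-dcada-src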
Seems this problem is related to the NodePort service type mode. I can't test it with the LoadBalancer type because no free IPs are available for it on the source cluster. But I tested on the destination cluster (just to test the copy back): with LoadBalancer it works well, but with NodePort I receive the same error.
While pv-migrate waits for readiness, I see the Service in the active state; here are the details:
$ kubectl describe service pv-migrate-bdaea-src-sshd
Name: pv-migrate-bdaea-src-sshd
Namespace: korepov-pro-dev
Labels: app.kubernetes.io/component=sshd
app.kubernetes.io/instance=pv-migrate-bdaea-src
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=pv-migrate
app.kubernetes.io/version=0.5.0
helm.sh/chart=pv-migrate-0.5.0
Annotations: meta.helm.sh/release-name: pv-migrate-bdaea-src
meta.helm.sh/release-namespace: korepov-pro-dev
Selector: app.kubernetes.io/component=sshd,app.kubernetes.io/instance=pv-migrate-bdaea-src,app.kubernetes.io/name=pv-migrate
Type: NodePort
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.233.53.90
IPs: 10.233.53.90
Port: ssh 22/TCP
TargetPort: 22/TCP
NodePort: ssh 31784/TCP
Endpoints: 10.233.74.26:22
Session Affinity: None
External Traffic Policy: Cluster
Events: <none>
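The Endpoints line above already shows the pod wired to the Service (10.233.74.26:22); the same can be double-checked with a standard endpoints query:
$ kubectl -n korepov-pro-dev get endpoints pv-migrate-bdaea-src-sshd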
And I can connect to this node port on the source cluster from the destination cluster (using the external IP of any node) via telnet:
# telnet 1.2.3.4 31784
Trying 1.2.3.4...
Connected to 1.2.3.4.
Escape character is '^]'.
SSH-2.0-OpenSSH_9.3
So, the network connection is not a problem.
Could you please describe what exactly it is waiting for? And maybe add more verbose debug logging to catch it?
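My rough understanding is that Helm's wait for a Deployment boils down to the Deployment's status fields (observed generation caught up, ready replicas matching the expected count); a shell approximation of that check, using standard Deployment status fields:
$ kubectl -n korepov get deployment pv-migrate-dcada-src-sshd \
    -o jsonpath='gen={.metadata.generation} observed={.status.observedGeneration} ready={.status.readyReplicas}/{.spec.replicas}'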
Also, specifying the source node IP address explicitly using --dest-host-override 1.2.3.4 doesn't help either.
Also, it would be good to add the Helm release status to the debug logs, at least the helm status output, but ideally the pod and service statuses too.
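In the meantime, a loop like this could capture that state while the migration runs (plain shell; release name and namespace are placeholders, the label selector matches the one shown above):
$ while true; do helm status <name-of-the-release> -n <namespace>; kubectl -n <namespace> get deploy,po,svc -l app.kubernetes.io/name=pv-migrate; sleep 2; done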