kubeflow / katib

Automated Machine Learning on Kubernetes
https://www.kubeflow.org/docs/components/katib
Apache License 2.0
1.48k stars, 439 forks

katib-db-manager is in crashloop state #2425

Open gopinath3759 opened 1 week ago

gopinath3759 commented 1 week ago

[y000435a@cnndcmldatagw01b ~]$ kubectl logs katib-db-manager-84fd984c55-6dfjb -n kubeflow
I0909 09:32:34.498659 1 db.go:32] Using MySQL
E0909 09:32:44.501617 1 connection.go:40] Ping to Katib db failed: dial tcp: lookup katib-mysql: i/o timeout
E0909 09:32:49.502564 1 connection.go:40] Ping to Katib db failed: dial tcp: lookup katib-mysql: i/o timeout
E0909 09:32:54.503454 1 connection.go:40] Ping to Katib db failed: dial tcp: lookup katib-mysql: i/o timeout
E0909 09:32:59.504052 1 connection.go:40] Ping to Katib db failed: dial tcp: lookup katib-mysql: i/o timeout
E0909 09:33:04.504873 1 connection.go:40] Ping to Katib db failed: dial tcp: lookup katib-mysql: i/o timeout
E0909 09:33:09.505948 1 connection.go:40] Ping to Katib db failed: dial tcp: lookup katib-mysql: i/o timeout
E0909 09:33:14.506898 1 connection.go:40] Ping to Katib db failed: dial tcp: lookup katib-mysql: i/o timeout
E0909 09:33:19.507794 1 connection.go:40] Ping to Katib db failed: dial tcp: lookup katib-mysql: i/o timeout
E0909 09:33:24.508862 1 connection.go:40] Ping to Katib db failed: dial tcp: lookup katib-mysql: i/o timeout
E0909 09:33:29.509754 1 connection.go:40] Ping to Katib db failed: dial tcp: lookup katib-mysql: i/o timeout
E0909 09:33:34.510712 1 connection.go:40] Ping to Katib db failed: dial tcp: lookup katib-mysql: i/o timeout
E0909 09:33:39.511441 1 connection.go:40] Ping to Katib db failed: dial tcp: lookup katib-mysql: i/o timeout
E0909 09:33:44.512477 1 connection.go:40] Ping to Katib db failed: dial tcp: lookup katib-mysql: i/o timeout
F0909 09:33:44.512594 1 main.go:104] Failed to open db connection: DB open failed: Timeout waiting for DB conn successfully opened.
[y000435a@cnndcmldatagw01b ~]$ kubectl get pods -n kubeflow |grep -i mysql

[y000435a@cnndcmldatagw01b ~]$ kubectl logs katib-mysql-6975d6c6c4-sdntq -n kubedlow
Error from server (NotFound): namespaces "kubedlow" not found
[y000435a@cnndcmldatagw01b ~]$ kubectl logs katib-mysql-6975d6c6c4-sdntq -n kubeflow
2024-09-09 08:07:11+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 8.0.29-1.el8 started.
2024-09-09 08:07:12+00:00 [Note] [Entrypoint]: Switching to dedicated user 'mysql'
2024-09-09 08:07:12+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 8.0.29-1.el8 started.
'/var/lib/mysql/mysql.sock' -> '/var/run/mysqld/mysqld.sock'
2024-09-09T08:07:12.868596Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.0.29) starting as process 1
2024-09-09T08:07:12.882504Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2024-09-09T08:07:13.179649Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
2024-09-09T08:07:13.432224Z 0 [Warning] [MY-010068] [Server] CA certificate ca.pem is self signed.
2024-09-09T08:07:13.432279Z 0 [System] [MY-013602] [Server] Channel mysql_main configured to support TLS. Encrypted connections are now supported for this channel.
2024-09-09T08:07:13.435428Z 0 [Warning] [MY-011810] [Server] Insecure configuration for --pid-file: Location '/var/run/mysqld' in the path is accessible to all OS users. Consider choosing a different directory.
2024-09-09T08:07:13.472973Z 0 [System] [MY-011323] [Server] X Plugin ready for connections. Bind-address: '::' port: 33060, socket: /var/run/mysqld/mysqlx.sock
2024-09-09T08:07:13.473151Z 0 [System] [MY-010931] [Server] /usr/sbin/mysqld: ready for connections. Version: '8.0.29' socket: '/var/run/mysqld/mysqld.sock' port: 3306 MySQL Community Server - GPL.
[y000435a@cnndcmldatagw01b ~]$ ^C
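
Note on the two logs above: the db-manager error `dial tcp: lookup katib-mysql: i/o timeout` is raised while resolving the name `katib-mysql`, before any TCP connection to MySQL is attempted, and the MySQL log reports the server ready on port 3306. A minimal sketch for telling the two failure modes apart is given here. It assumes it is run from a pod in the `kubeflow` namespace with Python available; the function name `diagnose` and its return strings are hypothetical and not part of Katib.

```python
import socket


def diagnose(host, port, timeout=5.0):
    """Report whether reaching host:port fails at DNS resolution or at TCP connect.

    Hypothetical helper for illustration only; not part of Katib.
    """
    # Step 1: name resolution. The `timeout` argument does not apply here;
    # resolution time is governed by the system resolver configuration.
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return f"dns-failure: {exc}"

    # Step 2: TCP connect to the first resolved address (IP and port only).
    sockaddr = infos[0][4]
    try:
        with socket.create_connection(sockaddr[:2], timeout=timeout):
            return "ok"
    except OSError as exc:
        return f"tcp-failure: {exc}"
```

Calling `diagnose("katib-mysql", 3306)` from inside the cluster and getting a `dns-failure` result would be consistent with the db-manager log and would point at cluster DNS or the `katib-mysql` Service rather than at MySQL itself. A `tcp-failure` result would instead suggest the name resolves but the port is unreachable.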

Electronic-Waste commented 1 week ago

Hi @gopinath3759, thank you for raising this issue! Could you share more information, such as your Kubernetes and Kubeflow versions, so that we can help you better? FYI, you can create a new issue or update the current one using the bug report template:

[Image: Screenshot 2024-09-09 18 44 39]