Open ghost opened 4 years ago
@lordcf there's a more recent version of KubeCF, in case you want to give it a try and see whether your problem has been fixed there.
Hi @lordcf
Is it possible for you to retest with a newer release of KubeCF? There is KubeCF 2.2.3 in CAP 2.0.1.
Further, do I understand "deploy KubeCF on the cluster deployed on BOSH ( aws env)" correctly as
- You are setting up a BOSH system on AWS, and then deploying KubeCF into that BOSH system?
Or are you talking about
- Setting up a Kubernetes cluster on AWS, and deploying KubeCF into that cluster?
As KubeCF is Kubernetes-based, I am very confused by the reference to BOSH in your description.
If you have the same issue with 2.2.3 as with 2.2.2, it would be very helpful if you could provide the exact session and steps you used to deploy the cluster and KubeCF for us to review.
@gak @satadruroy FYI
Hello @gak @satadruroy @fargozhu
We are using the following steps to install KubeCF:
1) Install BOSH
2) Deploy a cluster on BOSH
3) Deploy cf-operator and KubeCF on the cluster
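For reference, step 3 usually looks roughly like the sketch below. This is a hypothetical outline, not the reporter's exact commands: the chart URLs are placeholders and the `--set` flag name varies between KubeCF releases, so the exact values must come from the release notes of the version being deployed.

```shell
# Hypothetical sketch -- chart URLs are placeholders and the cf-operator
# flag name differs across KubeCF releases; check the matching release docs.
kubectl create namespace cf-operator

# cf-operator (quarks) is installed first and pointed at the namespace
# that KubeCF will be deployed into.
helm install cf-operator \
  --namespace cf-operator \
  --set "global.singleNamespace.name=kubecf" \
  "<cf-operator chart URL for your KubeCF release>"

# Then KubeCF itself, with the cluster-specific values file.
helm install kubecf \
  --namespace kubecf \
  --values values.yaml \
  "<kubecf chart URL, e.g. the v2.3.0 release asset>"
```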
We tried the KubeCF deployment with v2.2.3 and with v2.3.0; the diego-api pod still goes into the "CrashLoopBackOff" state.
We get the error below:
$ kubectl logs diego-api-0 -n kubecf -c bbs-bbs
{"timestamp":"2020-10-15T08:45:28.039378053Z","level":"info","source":"bbs","message":"bbs.starting","data":{}}
Failed 'curl --fail --silent http://0.0.0.0:8890/ping' on attempt 1
{"timestamp":"2020-10-15T08:45:29.086871916Z","level":"fatal","source":"bbs","message":"bbs.sql-failed-to-connect","data":{"error":"dial tcp 10.100.200.113:3306: connect: connection refused","trace":"goroutine 1 [running]:\ncode.cloudfoundry.org/lager.(*logger).Fatal(0xc00025e180, 0xe8cd3f, 0x15, 0xff7180, 0xc0003e8000, 0x0, 0x0, 0x0)\n\t/var/vcap/source/bbs/src/code.cloudfoundry.org/lager/logger.go:138 +0xc6\nmain.main()\n\t/var/vcap/source/bbs/src/code.cloudfoundry.org/bbs/cmd/bbs/main.go:140 +0x49f6\n"}}
Failed 'curl --fail --silent http://0.0.0.0:8890/ping' on attempt 2
panic: dial tcp 10.100.200.113:3306: connect: connection refused
goroutine 1 [running]:
code.cloudfoundry.org/lager.(*logger).Fatal(0xc00025e180, 0xe8cd3f, 0x15, 0xff7180, 0xc0003e8000, 0x0, 0x0, 0x0)
	/var/vcap/source/bbs/src/code.cloudfoundry.org/lager/logger.go:162 +0x582
main.main()
	/var/vcap/source/bbs/src/code.cloudfoundry.org/bbs/cmd/bbs/main.go:140 +0x49f6
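The fatal error itself is a plain TCP "connection refused" against the database service on port 3306. A quick way to confirm this outside the bbs process is to probe the port from a pod in the same namespace; the IP and port below are taken from the log above and would need adjusting per cluster:

```shell
# Probe the database port the bbs container fails to reach.
# "Connection refused" means the address routes but nothing is listening,
# i.e. the database pod itself is down, and diego-api crashing is downstream.
DB_HOST=10.100.200.113
DB_PORT=3306

if timeout 2 bash -c "exec 3<>/dev/tcp/${DB_HOST}/${DB_PORT}" 2>/dev/null; then
  echo "database reachable"
else
  echo "database not reachable"
fi
```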
Getting logs for database pod:
$ kubectl logs database-0 -n kubecf -c database
I AM database-0 - 10.200.67.13
Warning: resolveip is deprecated and will be removed in a future version.
resolveip: Unable to find hostid for 'database-repl': host not found
I am the Primary Node
Removing pending files in /var/lib/mysql, because sentinel was not reached
Running --initialize-insecure on /var/lib/mysql
total 8.0K
drwxrws--x 2 mysql mysql 6.0K Oct 15 09:04 .
drwxr-xr-x 1 root root 4.0K Feb 25 2020 ..
Finished --initialize-insecure
MySQL init process in progress...
MySQL init process in progress...
MySQL init process in progress...
MySQL init process failed.
@lordcf it doesn't seem like something we can debug on our end, because you're using a unique setup. The error you're getting points to DNS issues; have you checked that?
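A minimal check for the suspected DNS problem, run from inside a pod in the kubecf namespace. The name `database-repl` is the one `resolveip` fails on in the database-0 log; the fully-qualified service name is an assumption about the namespace layout and may need adjusting:

```shell
# Check whether cluster DNS (kube-dns/CoreDNS) can resolve the database names.
# 'database-repl' comes from the failing resolveip call in the database log;
# the .svc.cluster.local name below is an assumed example, not confirmed.
check_dns() {
  if getent hosts "$1" >/dev/null; then
    echo "$1 resolves"
  else
    echo "$1 does NOT resolve"
  fi
}

check_dns database-repl
check_dns database.kubecf.svc.cluster.local
```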
Describe the bug
I am trying to deploy KubeCF (v2.2.2) and I am seeing issues in the diego-cell pod. The garden container keeps failing. Below are the logs for the garden container:
{"timestamp":"1594644900.966236353","source":"grootfs","message":"grootfs.init-store.store-manager-init-store.existing-backing-store-could-not-be-mounted: Mounting filesystem: exit status 32: mount: /var/vcap/data/grootfs/store/unprivileged: wrong fs type, bad option, bad superblock on /dev/loop0, missing codepage or helper program, or other error.\n","log_level":1,"data":{"session":"1.1","spec":{"UIDMappings":[{"HostID":4294967294,"NamespaceID":0,"Size":1},{"HostID":1,"NamespaceID":1,"Size":4294967293}],"GIDMappings":[{"HostID":4294967294,"NamespaceID":0,"Size":1},{"HostID":1,"NamespaceID":1,"Size":4294967293}],"StoreSizeBytes":9223372020747599872},"storePath":"/var/vcap/data/grootfs/store/unprivileged"}} {"timestamp":"1594644900.969948530","source":"grootfs","message":"grootfs.init-store.store-manager-init-store.truncating-backing-store-file-failed","log_level":2,"data":{"backingstoreFile":"/var/vcap/data/grootfs/store/unprivileged.backing-store","error":"truncate /var/vcap/data/grootfs/store/unprivileged.backing-store: file too large","session":"1.1","size":9223372020747599872,"spec":{"UIDMappings":[{"HostID":4294967294,"NamespaceID":0,"Size":1},{"HostID":1,"NamespaceID":1,"Size":4294967293}],"GIDMappings":[{"HostID":4294967294,"NamespaceID":0,"Size":1},{"HostID":1,"NamespaceID":1,"Size":4294967293}],"StoreSizeBytes":9223372020747599872},"storePath":"/var/vcap/data/grootfs/store/unprivileged"}} {"timestamp":"1594644900.970129967","source":"grootfs","message":"grootfs.init-store.init-store-failed","log_level":2,"data":{"error":"truncating backing store file: truncate /var/vcap/data/grootfs/store/unprivileged.backing-store: file too large","session":"1"}} truncate /var/vcap/data/grootfs/store/unprivileged.backing-store: file too large
To Reproduce
Trying to deploy KubeCF on a cluster deployed on BOSH ( aws env). The cluster has container privileges enabled.
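One detail worth noting in the grootfs log above: the requested `StoreSizeBytes` (9223372020747599872) sits within roughly 15 GiB of the signed 64-bit maximum. That suggests grootfs computed an effectively unbounded backing-store size, likely from a misreported disk size, and `truncate` then rejected it with "file too large"; this is an interpretation of the numbers, not a confirmed diagnosis.

```shell
# Compare the StoreSizeBytes from the grootfs log with the int64 maximum.
STORE_SIZE=9223372020747599872
INT64_MAX=9223372036854775807

# The gap is tiny relative to the ~8 EiB request, so the store size is
# effectively "as large as a signed 64-bit value allows".
echo $(( INT64_MAX - STORE_SIZE ))   # prints 16107175935 (~15 GiB)
```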