minishift / minishift-centos-iso

CentOS based ISO as an alternative for boot2docker ISO
GNU Lesser General Public License v3.0

Failing tests on CentOS-CI + KVM #121

Open gbraad opened 7 years ago

gbraad commented 7 years ago

This is a general issue, and I have seen this happen from time to time with the CentOS image:

Error: E0720 04:12:09.674403   23446 start.go:278] Error starting the VM: Error configuring authorization on host: Maximum number of retries (60) exceeded. Retrying.
E0720 04:15:49.210763   23446 start.go:278] Error starting the VM: Error configuring authorization on host: Maximum number of retries (60) exceeded. Retrying.
E0720 04:19:29.286027   23446 start.go:278] Error starting the VM: Error configuring authorization on host: Maximum number of retries (60) exceeded. Retrying.
Error starting the VM: Error configuring authorization on host: Maximum number of retries (60) exceeded
Error configuring authorization on host: Maximum number of retries (60) exceeded
Error configuring authorization on host: Maximum number of retries (60) exceeded

It is unclear why this happens...

gbraad commented 7 years ago

Just a few ...

gbraad commented 7 years ago

Added a sleep statement for the minishift start, as the error in this case is consistently happening at the swapspace test

gbraad commented 7 years ago

Added sleep (as in #122)... but still failing:

gbraad commented 7 years ago

Issue still occasionally occurs as can be seen in the following build for different PRs:

LalatenduMohanty commented 7 years ago

@gbraad Just to confirm, have you seen this kind of error in local runs too?

gbraad commented 7 years ago

Less often, but yes. I was in a hurry to test, though, so I didn't pay a lot of attention.

coolbrg commented 7 years ago

I am able to reproduce the error locally as well for even starting the VM:

Executing command : /home/budhram/redhat/minishift-centos-iso/tests/../build/bin/minishift start --vm-driver kvm --iso-url file:///home/budhram/redhat/minishift-centos-iso/tests/../build/minishift-centos7.iso
*******
2017-08-22 19:33:10,018 test             L0119 DEBUG| Error: E0822 19:26:46.940451   12265 start.go:342] Error starting the VM: Error creating the VM. Error creating machine: Error detecting OS: Too many retries waiting for SSH to be available.  Last error: Maximum number of retries (60) exceeded. Retrying.
E0822 19:29:58.276836   12265 start.go:342] Error starting the VM: Error configuring authorization on host: Too many retries waiting for SSH to be available.  Last error: Maximum number of retries (60) exceeded. Retrying.
E0822 19:33:10.016776   12265 start.go:342] Error starting the VM: Error configuring authorization on host: Too many retries waiting for SSH to be available.  Last error: Maximum number of retries (60) exceeded. Retrying.
Error starting the VM: Error creating the VM. Error creating machine: Error detecting OS: Too many retries waiting for SSH to be available.  Last error: Maximum number of retries (60) exceeded
Error configuring authorization on host: Too many retries waiting for SSH to be available.  Last error: Maximum number of retries (60) exceeded
Error configuring authorization on host: Too many retries waiting for SSH to be available.  Last error: Maximum number of retries (60) exceeded
...

gbraad commented 7 years ago

Add --show-libmachine-logs ;-) and let's see what happens. This error occurs when provisioning is unable to start, which can be related to the IP address.

Is it a lease issue?
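
One way to check the lease theory is to look up the VM's MAC address in the dnsmasq leases file on the host. A minimal Python sketch, assuming the classic dnsmasq lease line layout (expiry, MAC, IP, hostname, client ID). `find_lease_ip` and the sample line are hypothetical, and libvirt may store leases in a different format depending on its version:

```python
def find_lease_ip(leases_text, mac):
    """Return the IP leased to `mac`, or None if no lease line matches."""
    wanted = mac.lower()
    for line in leases_text.splitlines():
        fields = line.split()
        # Assumed layout: <expiry> <mac> <ip> <hostname> <client-id>
        if len(fields) >= 3 and fields[1].lower() == wanted:
            return fields[2]
    return None


# Hypothetical lease line, for illustration only.
sample = "1503410000 52:54:00:aa:bb:cc 192.168.42.10 minishift *\n"
print(find_lease_ip(sample, "52:54:00:AA:BB:CC"))
```

If the lookup returns nothing while the VM is running, the lease never made it into the file being read, which would fit a lease-related cause.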

coolbrg commented 7 years ago

@gbraad could you add those debug lines here https://github.com/gbraad/minishift-centos-iso/blob/opt/tests/test.py#L90 so that we can see why it is failing in CI?

I am also trying locally.

The latest ISO released upstream passed locally.
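
For reference, a rough sketch of how a test helper could append the debug flag when building the start command. `build_start_command` and the paths are hypothetical and not taken from test.py; only the flags themselves appear in this thread:

```python
def build_start_command(minishift_bin, iso_path, show_libmachine_logs=False):
    """Build the argument list for `minishift start` with the KVM driver."""
    cmd = [
        minishift_bin, "start",
        "--vm-driver", "kvm",
        "--iso-url", "file://" + iso_path,
    ]
    if show_libmachine_logs:
        # Extra libmachine debug output, as suggested earlier in the thread.
        cmd.append("--show-libmachine-logs")
    return cmd


# Hypothetical paths, for illustration only.
print(" ".join(build_start_command("build/bin/minishift",
                                   "/tmp/minishift-centos7.iso",
                                   show_libmachine_logs=True)))
```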

coolbrg commented 7 years ago

Logs

Logs with the latest CentOS ISO released upstream

(minishift) DBG | Starting VM minishift
(minishift) DBG | GetIP called for minishift
(minishift) DBG | Failed to retrieve dnsmasq leases from /var/lib/libvirt/dnsmasq/docker-machines.leases
(minishift) DBG | Unable to locate IP address for MAC 52:54:00:4f:77:aa
(minishift) DBG | Waiting for the VM to come up... 0
[...]
(minishift) DBG | GetIP called for minishift
(minishift) DBG | Failed to retrieve dnsmasq leases from /var/lib/libvirt/dnsmasq/docker-machines.leases
(minishift) DBG | Unable to locate IP address for MAC 52:54:00:4f:77:aa
(minishift) DBG | Waiting for the VM to come up... 14
(minishift) DBG | GetIP called for minishift
(minishift) DBG | Failed to retrieve dnsmasq leases from /var/lib/libvirt/dnsmasq/docker-machines.leases
(minishift) DBG | IP address: 192.168.42.87
(minishift) DBG | Unable to locate IP address for MAC 52:54:00:4f:77:aa
(minishift) Calling .GetConfigRaw
(minishift) Calling .DriverName
(minishift) Calling .DriverName
(minishift) Calling .GetState
(minishift) DBG | Getting current state...
Getting to WaitForSSH function...
(minishift) Calling .GetSSHHostname
(minishift) DBG | GetIP called for minishift
(minishift) DBG | Failed to retrieve dnsmasq leases from /var/lib/libvirt/dnsmasq/docker-machines.leases
(minishift) DBG | IP address: 192.168.42.87
(minishift) DBG | Unable to locate IP address for MAC 52:54:00:4f:77:aa
(minishift) Calling .GetSSHPort
(minishift) Calling .GetSSHKeyPath
(minishift) Calling .GetSSHKeyPath
(minishift) DBG | AK: resolvestorepath: /home/budhram/.minishift
(minishift) DBG | AK: resolvestorepath: /home/budhram/.minishift
(minishift) Calling .GetSSHUsername
Using SSH client type: external
Using SSH private key: /home/budhram/.minishift/machines/minishift/id_rsa (-rw-------)
&{[-F /dev/null -o PasswordAuthentication=no -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -o ConnectionAttempts=3 -o ConnectTimeout=10 -o ControlMaster=no -o ControlPath=none docker@192.168.42.87 -o IdentitiesOnly=yes -i /home/budhram/.minishift/machines/minishift/id_rsa -p 22] /usr/bin/ssh <nil>}
About to run SSH command:
exit 0
SSH cmd err, output: exit status 255: 
Error getting ssh command 'exit 0' : ssh command error:
command : exit 0
err     : exit status 255
output  : 
Getting to WaitForSSH function...
(minishift) Calling .GetSSHHostname
(minishift) DBG | GetIP called for minishift
(minishift) DBG | Failed to retrieve dnsmasq leases from /var/lib/libvirt/dnsmasq/docker-machines.leases
(minishift) DBG | IP address: 192.168.42.87
(minishift) Calling .GetSSHPort
(minishift) DBG | Unable to locate IP address for MAC 52:54:00:4f:77:aa
(minishift) Calling .GetSSHKeyPath
(minishift) DBG | AK: resolvestorepath: /home/budhram/.minishift
(minishift) Calling .GetSSHKeyPath
(minishift) Calling .GetSSHUsername
(minishift) DBG | AK: resolvestorepath: /home/budhram/.minishift
Using SSH client type: external
Using SSH private key: /home/budhram/.minishift/machines/minishift/id_rsa (-rw-------)
&{[-F /dev/null -o PasswordAuthentication=no -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -o ConnectionAttempts=3 -o ConnectTimeout=10 -o ControlMaster=no -o ControlPath=none docker@192.168.42.87 -o IdentitiesOnly=yes -i /home/budhram/.minishift/machines/minishift/id_rsa -p 22] /usr/bin/ssh <nil>}
About to run SSH command:
exit 0
SSH cmd err, output: <nil>: 
(minishift) Calling .GetSSHHostname
(minishift) DBG | GetIP called for minishift
(minishift) DBG | Failed to retrieve dnsmasq leases from /var/lib/libvirt/dnsmasq/docker-machines.leases
(minishift) DBG | IP address: 192.168.42.87
(minishift) DBG | Unable to locate IP address for MAC 52:54:00:4f:77:aa
(minishift) Calling .GetSSHPort
(minishift) Calling .GetSSHKeyPath
(minishift) Calling .GetSSHKeyPath
(minishift) DBG | AK: resolvestorepath: /home/budhram/.minishift
(minishift) Calling .GetSSHUsername
(minishift) DBG | AK: resolvestorepath: /home/budhram/.minishift
Using SSH client type: external
Using SSH private key: /home/budhram/.minishift/machines/minishift/id_rsa (-rw-------)
&{[-F /dev/null -o PasswordAuthentication=no -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -o ConnectionAttempts=3 -o ConnectTimeout=10 -o ControlMaster=no -o ControlPath=none docker@192.168.42.87 -o IdentitiesOnly=yes -i /home/budhram/.minishift/machines/minishift/id_rsa -p 22] /usr/bin/ssh <nil>}
About to run SSH command:
cat /etc/os-release
SSH cmd err, output: <nil>: NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

VARIANT="minishift"
VARIANT_VERSION="1.1.0"
BUILD_ID="ecb5ada-25072017180632-local"
  ...

Logs with the locally built (affected) ISO

(minishift) DBG | Starting VM minishift
(minishift) DBG | GetIP called for minishift
(minishift) DBG | Failed to retrieve dnsmasq leases from /var/lib/libvirt/dnsmasq/docker-machines.leases
(minishift) DBG | Unable to locate IP address for MAC 52:54:00:6b:3e:7e
(minishift) DBG | Waiting for the VM to come up... 0
[...]
(minishift) DBG | Failed to retrieve dnsmasq leases from /var/lib/libvirt/dnsmasq/docker-machines.leases
(minishift) DBG | Unable to locate IP address for MAC 52:54:00:6b:3e:7e
(minishift) DBG | Waiting for the VM to come up... 14
(minishift) DBG | GetIP called for minishift
(minishift) DBG | Failed to retrieve dnsmasq leases from /var/lib/libvirt/dnsmasq/docker-machines.leases
(minishift) DBG | IP address: 192.168.42.53
(minishift) DBG | Unable to locate IP address for MAC 52:54:00:6b:3e:7e
(minishift) Calling .GetConfigRaw
(minishift) Calling .DriverName
(minishift) Calling .DriverName
(minishift) Calling .GetState
(minishift) DBG | Getting current state...
Getting to WaitForSSH function...
(minishift) Calling .GetSSHHostname
(minishift) DBG | GetIP called for minishift
(minishift) DBG | Failed to retrieve dnsmasq leases from /var/lib/libvirt/dnsmasq/docker-machines.leases
(minishift) DBG | IP address: 192.168.42.53
(minishift) DBG | Unable to locate IP address for MAC 52:54:00:6b:3e:7e
(minishift) Calling .GetSSHPort
(minishift) Calling .GetSSHKeyPath
(minishift) Calling .GetSSHKeyPath
(minishift) DBG | AK: resolvestorepath: /home/budhram/.minishift
(minishift) DBG | AK: resolvestorepath: /home/budhram/.minishift
(minishift) Calling .GetSSHUsername
Using SSH client type: external
Using SSH private key: /home/budhram/.minishift/machines/minishift/id_rsa (-rw-------)
&{[-F /dev/null -o PasswordAuthentication=no -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -o ConnectionAttempts=3 -o ConnectTimeout=10 -o ControlMaster=no -o ControlPath=none docker@192.168.42.53 -o IdentitiesOnly=yes -i /home/budhram/.minishift/machines/minishift/id_rsa -p 22] /usr/bin/ssh <nil>}
About to run SSH command:
exit 0
SSH cmd err, output: exit status 255: 
Error getting ssh command 'exit 0' : ssh command error:
command : exit 0
err     : exit status 255
output  : 
Getting to WaitForSSH function...
(minishift) Calling .GetSSHHostname
(minishift) DBG | GetIP called for minishift
(minishift) DBG | Failed to retrieve dnsmasq leases from /var/lib/libvirt/dnsmasq/docker-machines.leases
(minishift) DBG | IP address: 192.168.42.53
(minishift) DBG | Unable to locate IP address for MAC 52:54:00:6b:3e:7e
(minishift) Calling .GetSSHPort
(minishift) Calling .GetSSHKeyPath
(minishift) Calling .GetSSHKeyPath
(minishift) DBG | AK: resolvestorepath: /home/budhram/.minishift
(minishift) DBG | AK: resolvestorepath: /home/budhram/.minishift
(minishift) Calling .GetSSHUsername
Using SSH client type: external
Using SSH private key: /home/budhram/.minishift/machines/minishift/id_rsa (-rw-------)
&{[-F /dev/null -o PasswordAuthentication=no -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -o ConnectionAttempts=3 -o ConnectTimeout=10 -o ControlMaster=no -o ControlPath=none docker@192.168.42.53 -o IdentitiesOnly=yes -i /home/budhram/.minishift/machines/minishift/id_rsa -p 22] /usr/bin/ssh <nil>}
About to run SSH command:
exit 0
SSH cmd err, output: exit status 1: /bin/bash: Permission denied

Error getting ssh command 'exit 0' : ssh command error:
command : exit 0
err     : exit status 1
output  : /bin/bash: Permission denied

Getting to WaitForSSH function...
(minishift) Calling .GetSSHHostname
(minishift) DBG | GetIP called for minishift
(minishift) DBG | Failed to retrieve dnsmasq leases from /var/lib/libvirt/dnsmasq/docker-machines.leases
(minishift) DBG | IP address: 192.168.42.53
(minishift) DBG | Unable to locate IP address for MAC 52:54:00:6b:3e:7e
(minishift) Calling .GetSSHPort
(minishift) Calling .GetSSHKeyPath
(minishift) Calling .GetSSHKeyPath
(minishift) DBG | AK: resolvestorepath: /home/budhram/.minishift
(minishift) Calling .GetSSHUsername
(minishift) DBG | AK: resolvestorepath: /home/budhram/.minishift
Using SSH client type: external
Using SSH private key: /home/budhram/.minishift/machines/minishift/id_rsa (-rw-------)
&{[-F /dev/null -o PasswordAuthentication=no -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -o ConnectionAttempts=3 -o ConnectTimeout=10 -o ControlMaster=no -o ControlPath=none docker@192.168.42.53 -o IdentitiesOnly=yes -i /home/budhram/.minishift/machines/minishift/id_rsa -p 22] /usr/bin/ssh <nil>}
About to run SSH command:
exit 0
************************************** This gets repeated *******************************************
SSH cmd err, output: exit status 1: /bin/bash: Permission denied

Error getting ssh command 'exit 0' : ssh command error:
command : exit 0
err     : exit status 1
output  : /bin/bash: Permission denied
************************************** This gets repeated *******************************************
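
The two logs differ in how the `exit 0` probe fails. ssh exits with 255 when ssh itself hits an error (for example, the connection is refused), and otherwise with the exit status of the remote command. So "exit status 1: /bin/bash: Permission denied" suggests the connection was made but the shell could not run on the guest. A small Python sketch for sorting log lines into those cases; `classify_ssh_attempt` and its labels are made up for illustration:

```python
import re

# Matches lines such as "SSH cmd err, output: exit status 255: "
SSH_RESULT = re.compile(r"SSH cmd err, output: (?:exit status (\d+)|<nil>):\s*(.*)")


def classify_ssh_attempt(line):
    """Return 'ok', 'transport', 'remote', or None for unrelated lines."""
    match = SSH_RESULT.search(line)
    if match is None:
        return None
    status = match.group(1)
    if status is None:
        return "ok"         # <nil>: the command succeeded
    if status == "255":
        return "transport"  # ssh itself failed (not connected yet, refused, ...)
    return "remote"         # connected, but the remote side returned an error


for line in (
    "SSH cmd err, output: exit status 255: ",
    "SSH cmd err, output: exit status 1: /bin/bash: Permission denied",
    "SSH cmd err, output: <nil>: ",
):
    print(classify_ssh_attempt(line))
```

In the affected run the attempts move from the transport case to the remote case and stay there, which points at the guest rather than at the network.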

gbraad commented 7 years ago

Created https://github.com/minishift/minishift/issues/1301

coolbrg commented 7 years ago

This issue is not reproducible in my local centos box.

hferentschik commented 7 years ago

So any outcome so far? Maybe Brian has some insight? Are we exceeding any resource limits or violating any other constraints placed on the build slave?

coolbrg commented 7 years ago

So any outcome so far? Maybe Brian has some insight?

I will follow up this with Brian now as we are really running out of options here.

On my local CentOS box, it is not reproducible at all, even if I run it 10 times. In CI, it mostly fails, passing only once or twice in ten runs. See here https://ci.centos.org/job/minishift-centos-iso-pr/
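
To put a number on "once or twice in ten", the test command can be run in a loop and the passes counted. A minimal Python sketch; `count_passes` is a hypothetical helper, and the stand-in command below would be replaced with something like `["make", "test"]`:

```python
import subprocess
import sys


def count_passes(command, runs):
    """Run `command` `runs` times and return how many runs exited with 0."""
    passes = 0
    for _ in range(runs):
        result = subprocess.run(command,
                                stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
        if result.returncode == 0:
            passes += 1
    return passes


# Stand-in command that always succeeds, so the sketch runs anywhere.
always_ok = [sys.executable, "-c", "raise SystemExit(0)"]
print(count_passes(always_ok, 3), "of 3 runs passed")
```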

gbraad commented 7 years ago

Same here. Not reproducible for the same problem.

coolbrg commented 7 years ago

More updates on debugging:

  1. Tried with the released CentOS ISO 1.1.0: make test fails inside the build machine.
  2. Tried with a lower version of minishift, 1.4.1: make test still fails inside the build machine.
  3. Tried removing a few test cases and adding a sleep of 5 sec after stopping the container: make test passed the first time; retrying again.

gbraad commented 7 years ago

I did add a sleep before and it did not change the results... hope this is different for you

coolbrg commented 7 years ago

I did add a sleep before and did not change the results... hope this is different for you

Not sure how it passed, but I lost control of the machine while trying a second time.

coolbrg commented 7 years ago

Logs of successful restart - https://gist.github.com/budhrg/26d534e029affa04a15f06e4b8b876a4

coolbrg commented 7 years ago

Logs of unsuccessful restart - https://gist.github.com/budhrg/1bdc7444d9796082c212f0b165286bfd

LalatenduMohanty commented 6 years ago

A sleep of 90s was added after the minishift start to work around the issue [1]. In case we choose to close this issue, we will create a separate issue to find out why we need the sleep and how we can remove it.

[1] https://github.com/minishift/minishift-centos-iso/blob/master/tests/test.sh#L129
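
As a possible direction for removing the fixed sleep later, a test could poll for readiness with a timeout instead. This is only a sketch of that idea in Python, not what test.sh does; `wait_until` and the readiness check are hypothetical:

```python
import time


def wait_until(check, timeout, interval=1.0, sleep=time.sleep, clock=time.monotonic):
    """Call `check` until it returns True or `timeout` seconds have elapsed."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Stand-in readiness check that succeeds on the third attempt.
attempts = []


def ready_on_third_try():
    attempts.append(1)
    return len(attempts) >= 3


print(wait_until(ready_on_third_try, timeout=5, interval=0.01))
```

A real check would need to probe whatever the 90s sleep is actually waiting for, which is still unknown at this point.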