coolbrg closed this issue 7 years ago
I am able to reproduce this issue but am still not sure why it happens. To check the sccli script's behavior, I added a flag that returns the exit value, and that looked as expected. Behind the scenes, sccli runs the system command using the subprocess
module with communicate(),
and per the docs that is supposed to wait until the process completes and return a tuple: https://docs.python.org/2/library/subprocess.html#subprocess.Popen.communicate .
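As an illustration of that documented behavior (this is not the actual sccli source), communicate() does block until the child process exits and then the return code is available:

```python
import subprocess

# Minimal sketch (not sccli itself): communicate() blocks until the
# child exits and returns a (stdout, stderr) tuple.
proc = subprocess.Popen(
    ["sh", "-c", "sleep 1; echo done"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
stdout, stderr = proc.communicate()  # waits for the child to finish
print(stdout.decode().strip())       # "done"
print(proc.returncode)               # 0
```

Note that this only waits for the child command to exit. If that command itself starts a service asynchronously (as systemctl and sccli do), a return code of 0 does not mean the service is ready.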
==> default: Running provisioner: shell...
default: Running: inline script
==> default: 0
==> default: Running provisioner: shell...
default: Running: inline script
Here we can see the return value is 0
from the inline script. Is this a blocker?
@praveenkumar Why is this not the case with CDK 2.1? What are the changes in CDK 2.2 around it?
What are the changes in CDK 2.2 around it?
The only change is that we use sccli
instead of systemctl
for the openshift service in the Vagrantfile. We did this because we now need to pass shell variables before starting the service, such as proxy settings and the OSE-specific version.
@budhrg BTW, how do you folks do the same testing with ADB? For that, it is always sccli in the Vagrantfile.
@praveenkumar We have now added a sleep of 10 seconds to make our CI pass. See here: https://github.com/projectatomic/vagrant-service-manager/blob/master/features/adb-openshift.feature#L36
Locally I sometimes need to allow 20 seconds.
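A fixed sleep is fragile, as the different timings needed locally versus in CI show. A more robust pattern is to poll the service state with a bounded timeout. A sketch, where `is_active` stands in for any readiness check (e.g. a hypothetical wrapper around `systemctl is-active openshift`; this helper is not part of vagrant-service-manager):

```python
import time

def wait_for_active(is_active, timeout=30.0, interval=1.0):
    """Poll is_active() until it returns True or the timeout expires.

    is_active is any zero-argument callable, e.g. a wrapper around
    `systemctl is-active openshift` (hypothetical, for illustration).
    Returns True if the check passed within the timeout, else False.
    """
    deadline = time.time() + timeout
    while time.time() < deadline:
        if is_active():
            return True
        time.sleep(interval)
    return False

# Example with a stub that reports "active" on the third check:
calls = {"n": 0}
def stub():
    calls["n"] += 1
    return calls["n"] >= 3

print(wait_for_active(stub, timeout=5.0, interval=0.01))  # True
```

This waits only as long as actually needed, instead of guessing at 10 or 20 seconds.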
The only change is that we use sccli instead of systemctl for the openshift service in the Vagrantfile. We did this because we now need to pass shell variables before starting the service, such as proxy settings and the OSE-specific version.
Then it is an sccli
issue, I guess; it is not behaving similarly to systemctl, as you mentioned in your comment above.
Another finding during the meeting is that the service goes into the activating
state for some time before it runs.
[root@rhel-cdk vagrant]# sccli openshift start
[root@rhel-cdk vagrant]# systemctl is-active openshift
activating
[root@rhel-cdk vagrant]# echo $?
3 => activating state return code.
[root@rhel-cdk vagrant]# systemctl status openshift
● openshift.service - Docker Application Container for OpenShift
Loaded: loaded (/usr/lib/systemd/system/openshift.service; disabled; vendor preset: disabled)
Active: activating (start-post) since Wed 2016-10-26 05:47:32 EDT; 2s ago
Docs: https://docs.openshift.org/
Process: 16966 ExecStop=/usr/bin/sh -c /opt/adb/openshift/openshift_stop (code=exited, status=0/SUCCESS)
Process: 17099 ExecStartPre=/usr/bin/docker rm openshift (code=exited, status=0/SUCCESS)
Process: 17092 ExecStartPre=/usr/bin/docker stop openshift (code=exited, status=0/SUCCESS)
Main PID: 17106 (sh); : 17107 (sh)
Memory: 8.5M
CGroup: /system.slice/openshift.service
├─17106 /usr/bin/sh /opt/adb/openshift/openshift
├─17143 /usr/bin/docker-current run --name openshift --privileged --net=host --pid=host -w /var/lib/openshift -e KUBECONFIG=/var/lib/openshift/openshift.local....
└─control
├─17107 /usr/bin/sh /opt/adb/openshift/openshift_provision
└─17182 sleep 1
[root@rhel-cdk vagrant]# systemctl is-active openshift
active
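The transcript above shows that `systemctl is-active` exits 0 only once the unit is actually active, and exits 3 while it is still activating. That exit code can be checked programmatically; a sketch (assuming that convention, with the runner injectable so it can be exercised outside the VM):

```python
import subprocess

def unit_is_active(unit, runner=subprocess.run):
    """Return True only when `systemctl is-active <unit>` exits 0.

    Per the transcript above, the activating state exits with code 3,
    so this distinguishes a unit that is still starting from one that
    is ready. `runner` is injectable for testing without systemd.
    """
    result = runner(
        ["systemctl", "is-active", unit],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return result.returncode == 0
```

On the ADB/CDK box, `unit_is_active("openshift")` would return False while the unit is in the activating state shown above, and True once it reaches active.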
With CDK 2.2:
Using CDK 2.2, the Vagrant configuration from adb-atomic-developer-bundle, and running:
I get:
The OpenShift service is reported to be
stopped
. Then, running
a few seconds later: So the OpenShift provisioning seems to return too early. This is consistent with the behavior we see in the tests.
With CDK 2.1:
Also observed: CDK 2.2 with a
systemctl start openshift
provisioner works fine: