hferentschik opened this issue 8 years ago
@hferentschik I have updated the test so that
`bundle exec vagrant service-manager install-cli openshift --cli-version 1.3.0 --path #{ENV['VAGRANT_HOME']}/oc`
runs successfully, as I found the failure is related to the minimum memory requirement for OpenShift (#420).
However, there is one test case which still needs to be investigated: it occurs when I run `bundle exec vagrant service-manager install-cli openshift`.
We can keep this issue open until we find the reason for the above test case failure. The rest is fine and genuinely working in CI, as far as my investigation goes.
We also found that some wait time (~20s on a dev machine) is required between `vagrant up` and the first OpenShift-related operation, such as `vagrant service-manager status openshift`.
We need to investigate the root cause of this required delay.
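Until that root cause is found, the tests could poll the status instead of hard-coding a sleep. A minimal sketch, assuming a hypothetical helper (the name, timeout and interval are mine, not the project's) and assuming the per-service status output contains "running" once the service is up:

```ruby
# Hypothetical test helper: poll `vagrant service-manager status openshift`
# instead of sleeping a fixed ~20s after `vagrant up`.
require 'timeout'

def wait_for_openshift(timeout_seconds: 60, interval: 5)
  Timeout.timeout(timeout_seconds) do
    loop do
      output = `bundle exec vagrant service-manager status openshift 2>&1`
      # assumption: the status output contains "running" once the service is up
      return true if output.include?('running')
      sleep interval
    end
  end
rescue Timeout::Error
  false
end
```

A fixed sleep would also work, but polling keeps the tests independent of how long the delay actually is on a given machine.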
The CDK OpenShift test also fails after upgrading to the latest CDK box.
For the record, the initial problem was that the tests were not configured to run against CDK. Hence, it looked like they were running and passing, but in reality they got skipped. This was hidden by issue #419: we were using the default pretty formatter without coloring, which did not indicate which tests got run and which got skipped.
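For reference, a minimal sketch of forcing colored cucumber output from the rake task, so skipped scenarios stand out from passed ones; the project's actual Rakefile may be wired differently:

```ruby
# Illustration only: run cucumber with color forced on, so skipped scenarios
# are visually distinct from passed ones in the pretty formatter output.
require 'cucumber/rake/task'

Cucumber::Rake::Task.new(:features) do |t|
  t.cucumber_opts = '--color --format pretty'
end
```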
It seems there is a regression in the OpenShift service startup: the OpenShift status is not immediately "running" after a successful `vagrant up`.
So, with CDK 2.2 and the Vagrant configuration from adb-atomic-developer-bundle, running
`vagrant up; vagrant service-manager status`
I get:
==> default: Docker service configured successfully...
==> default: OpenShift service configured successfully...
==> default: Mounting SSHFS shared folder...
==> default: Mounting folder via SSHFS: /Users/hardy => /Users/hardy
==> default: Checking Mount..
==> default: Folder Successfully Mounted!
==> default: Running provisioner: shell...
default: Running: inline script
==> default: Running provisioner: shell...
default: Running: inline script
==> default:
==> default: Successfully started and provisioned VM with 2 cores and 3072 MB of memory.
==> default: To modify the number of cores and/or available memory set the environment variables
==> default: VM_CPU and/or VM_MEMORY respectively.
==> default:
==> default: You can now access the OpenShift console on: https://10.1.2.2:8443/console
==> default: To use OpenShift CLI, run:
==> default: $ vagrant ssh
==> default: $ oc login
==> default:
==> default: Configured users are (<username>/<password>):
==> default: openshift-dev/devel
==> default: admin/admin
==> default:
==> default: If you have the oc client library on your host, you can also login from your host.
Configured services:
docker - running
openshift - stopped
kubernetes - stopped
Note that the OpenShift service is reported as stopped. Running the status command again a few seconds later:
$ vagrant service-manager status
Configured services:
docker - running
openshift - running
kubernetes - stopped
So the OpenShift provisioning seems to return too early. This is consistent with the behavior we see in the tests.
It does not matter whether I use:
config.vm.provision "shell", run: "always", inline: <<-SHELL
PROXY=#{PROXY} PROXY_USER=#{PROXY_USER} PROXY_PASSWORD=#{PROXY_PASSWORD} /usr/bin/sccli openshift
SHELL
or just
config.servicemanager.services = "openshift"
> So the OpenShift provisioning seems to return too early. This is consistent with the behavior we see in the tests.

And this is only seen with CDK 2.2 but not CDK 2.1?
AFAICT yes
CDK 2.1 behavior:
$ vagrant up; vagrant service-manager status
Bringing machine 'default' up with 'virtualbox' provider...
==> default: Importing base box 'cdkv2.1'...
...
==> default: Copying TLS certificates to /home/budhram/redhat/vagrant-service-manager/.vagrant/machines/default/virtualbox/docker
==> default: Docker service configured successfully...
==> default: OpenShift service configured successfully...
Configured services:
docker - running
openshift - running
kubernetes - stopped
CDK 2.2 with `systemctl start openshift` provisioner:
$ vagrant up; vagrant service-manager status
Bringing machine 'default' up with 'virtualbox' provider...
==> default: Importing base box 'cdkv2'...
.......
==> default: Copying TLS certificates to /home/budhram/redhat/vagrant-service-manager/.vagrant/machines/default/virtualbox/docker
==> default: Docker service configured successfully...
==> default: OpenShift service configured successfully...
==> default: Running provisioner: shell...
default: Running: inline script
==> default: OpenShift started...
Configured services:
docker - running
openshift - running
kubernetes - stopped
## Vagrantfile
Vagrant.configure(2) do |config|
config.vm.box = 'cdkv2'
config.vm.network "private_network", ip: "10.10.10.42"
config.registration.skip = true
config.vm.provider('libvirt') { |v| v.memory = 3072 }
config.vm.provider('virtualbox') { |v| v.memory = 3072 }
config.vm.synced_folder '.', '/vagrant', disabled: true
# explicitly enable and start OpenShift
config.vm.provision "shell", run: "always", inline: <<-SHELL
systemctl start openshift
echo "OpenShift started..."
SHELL
end
> CDK 2.2 with `systemctl start openshift` provisioner:
@budhrg so you are saying that with calling systemctl directly it works? In this case we are dealing with a sccli bug, right?
@budhrg If it works with systemctl we can change the Vagrant config in the respective feature files to use systemctl. At least this is better than a "random" sleep. We can add a comment to the issue in developer-bundle and update the tests once we have a fix there. WDYT?
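A sketch of what that feature-file provisioner could look like, replacing the shell provisioner in the Vagrantfile above; the wait loop is my addition (not from adb-atomic-developer-bundle), using `systemctl is-active` so the provisioner only returns once the unit is actually up:

```ruby
# Sketch only: start OpenShift explicitly and block until systemd reports the
# unit active, so the tests never need an arbitrary sleep.
config.vm.provision "shell", run: "always", inline: <<-SHELL
  systemctl start openshift
  # `systemctl is-active --quiet` exits 0 once the unit is active
  until systemctl is-active --quiet openshift; do
    sleep 2
  done
  echo "OpenShift started..."
SHELL
```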
@budhrg nice digging ;-)
Blocked on projectatomic/adb-utils#194
We don't have to be blocked, right? See https://github.com/projectatomic/vagrant-service-manager/issues/415#issuecomment-256027909
> @budhrg If it works with systemctl we can change the Vagrant config in the respective feature files to use systemctl. At least this is better than a "random" sleep. We can add a comment to the issue in developer-bundle and update the tests once we have a fix there. WDYT?
But don't you think this is diverging from the actual behavior? I feel like we are adding a hack to our tests just to make them pass :smile:
WDYT? @LalatenduMohanty
> But don't you think this is diverging from the actual behavior? I feel like we are adding a hack to our tests just to make them pass
This is for sure better than a sleep. Also, the tests are about service-manager, not about the VM. For our purposes we need a properly provisioned OpenShift; if we can get this via systemctl, so be it. I would also rather do this and have the CDK tests running, as opposed to skipping them completely atm.
@hferentschik Somehow I am now not able to get the `systemctl start openshift` shell provisioner running.
The same is reported by CI too: https://ci.centos.org/job/vagrant-service-manager-budh/20/console. The required changes I made are in https://github.com/budhrg/vagrant-service-manager/commit/5e6a3721ed2ed2cb7c0f071511b0c022ba6497f1.
Locally it is sometimes passing now.
The tests are passing locally as well:
➜ vagrant-service-manager-openshift-investigate git:(adb-openshift-investigate) ✗ be rake features FEATURE=features/cdk-openshift.feature PROVIDER=libvirt BOX=cdk
/Using existing public releaase CDK box (version v. 2.2.0 for x86_64) in /home/budhram/redhat/vagrant-service-manager-openshift-investigate/.boxes
|/home/budhram/.rvm/rubies/ruby-2.1.2/bin/ruby -S bundle exec cucumber features/cdk-openshift.feature
Using the default and html profiles...
.---------------------------------..................................
2 scenarios (1 skipped, 1 passed)
68 steps (33 skipped, 35 passed)
4m49.516s
Don't know what's happening in CI. :confused:
It seems CI does not fail in cases where the tests are wrong, e.g. _When I evaluate and run `bundle exec vagrant service-manager install-cli openshift --cli-version 1.3.0 --path #{ENV['VAGRANT_HOME']}/oc`_ should fail, but did not. We can temporarily change the CI job to build one of our forks, on which we can introduce some obvious test errors. We need to verify that this actually results in test failures.
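One cheap way to do that verification, sketched as a hypothetical step definition added only on the fork (the step name is made up): a step that always raises, so the fork's CI build has to go red if the scenarios are really executed and reported.

```ruby
# Hypothetical, for the fork only: a step that always fails. If the CI job
# stays green with this in place, failures are not being run or reported.
Then(/^the CI failure reporting is verified$/) do
  raise 'intentional failure: this step must turn the CI job red'
end
```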