SUSE / DeepSea

A collection of Salt files for deploying, managing and automating Ceph.
GNU General Public License v3.0

Enable openstack integration tests #1322

Closed jschmid1 closed 6 years ago

jschmid1 commented 6 years ago

Description of Issue/Question

There's an effort to enable the openstack tests in #1316. This is the result:

Openstack functests fail with:

Error ERANGE:  pg_num 128 size 2 would mean 840 total pgs, which exceeds max 800 (mon_max_pg_per_osd 200 * num_in_osds 4)
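The arithmetic behind this error can be reproduced with a few lines of Python. This is a minimal sketch, assuming the monitor adds `pg_num * size` for the new pool to the PG replicas already in the cluster and compares the sum against `mon_max_pg_per_osd * num_in_osds`. The function names are hypothetical, and the figure of 584 existing PG replicas is inferred from the message (840 - 128 * 2), not reported by the CI.

```python
def max_total_pgs(mon_max_pg_per_osd: int, num_in_osds: int) -> int:
    """Cluster-wide ceiling quoted at the end of the error message."""
    return mon_max_pg_per_osd * num_in_osds


def projected_total_pgs(existing_pg_replicas: int, pg_num: int, size: int) -> int:
    """PG replicas after adding one pool with `pg_num` PGs and replica count `size`."""
    return existing_pg_replicas + pg_num * size


# Values taken from the error message above.
limit = max_total_pgs(mon_max_pg_per_osd=200, num_in_osds=4)

# Inferred, not measured: what must already exist for the total to reach 840.
existing = 840 - 128 * 2

projected = projected_total_pgs(existing, pg_num=128, size=2)

print(f"limit={limit} projected={projected} exceeds={projected > limit}")
```

Under these assumptions, the new pool alone accounts for 256 of the 800 available PG replicas on a four-OSD cluster.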

We need to tweak something in the tests or the environment to make this work out of the box.

jschmid1 commented 6 years ago

Enable openstack integration tests

smithfarm commented 6 years ago

@tserong This issue arises because of

smithfarm@wilbur:~/src/DeepSea/srv/salt/ceph/openstack> ag 'pool create'
cinder-backup/pool/default.sls
4:    - name: "ceph osd pool create {{ prefix }}cloud-backups 128"

cinder/pool/default.sls
4:    - name: "ceph osd pool create {{ prefix }}cloud-volumes 128"

glance/pool/default.sls
4:    - name: "ceph osd pool create {{ prefix }}cloud-images 128"

nova/pool/default.sls
4:    - name: "ceph osd pool create {{ prefix }}cloud-vms 128"

In other words, the openstack orchestration is creating pools with a hard-coded number of PGs (128) which is inappropriate for the (small) cluster deployed by the CI.

The qa scripting already has a "pool pre-creation" mechanism that calculates the "right" number of PGs per pool from the number of pools and the number of OSDs in the cluster. I can easily add an option to the qa deployment script to pre-create these four pools. However, this alone is not sufficient to fix the issue, because the openstack functest orchestration deletes these pools at the beginning of the test: https://github.com/SUSE/DeepSea/blob/SES5/srv/salt/ceph/functests/1node/openstack/default.sls#L3-L6

(It deletes the pools at the end of the test, too, but that has little or no bearing on this issue since the CI always runs the tests in a fresh cluster which is destroyed at the end of the test.)
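To illustrate the kind of calculation a "pool pre-creation" step could perform, here is a rough sketch. It is not the formula used by the qa scripts, which is not shown in this thread; it simply picks the largest power-of-two `pg_num` that lets all pools fit under the monitor limit. The function name, the `reserved` parameter, and the default of 200 PGs per OSD (taken from the error message) are assumptions.

```python
def pgs_per_pool(num_osds: int, num_pools: int, size: int,
                 max_pg_per_osd: int = 200, reserved: int = 0) -> int:
    """Largest power-of-two pg_num such that `num_pools` pools with replica
    count `size` stay within max_pg_per_osd * num_osds.

    `reserved` is the number of PG replicas already used by other pools.
    """
    if min(num_osds, num_pools, size) < 1:
        raise ValueError("num_osds, num_pools and size must all be >= 1")
    budget = (max_pg_per_osd * num_osds - reserved) // (num_pools * size)
    if budget < 1:
        raise ValueError("no room left for the requested pools")
    # Round down to a power of two, the conventional shape for pg_num.
    return 1 << (budget.bit_length() - 1)


# Four OSDs, the four openstack pools, replica size 2, nothing else in the cluster.
print(pgs_per_pool(num_osds=4, num_pools=4, size=2))

# Same cluster with 584 PG replicas already taken by other pools.
print(pgs_per_pool(num_osds=4, num_pools=4, size=2, reserved=584))
```

With these inputs the sketch yields 64 and 16 respectively, both well below the hard-coded 128.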

So, how do you suggest we address this issue? The only thing I can think of right now is to drop the "clean environment at start" bit. Combined with my tweak on the qa side, this would make the functest work in the CI, but folks running the functest manually would have to delete these pools themselves first, or risk running the tests on a "dirty" environment.

tserong commented 6 years ago

Ugh. I wonder if we should optionally parameterize the number of PGs instead? That'd possibly help real deployments too (where the admin wanted to have control over the number of PGs).

smithfarm commented 6 years ago

What does "parameterize" mean, though? Like have the SLS call a runner that reads a YAML file where the number of PGs could be optionally set?

smithfarm commented 6 years ago

All the other bits of DeepSea that create pools will refrain from doing so if the pool is already created (i.e. with the right number of PGs).

The only problem here is that the functest deletes the pool and re-creates it with the wrong number of PGs.
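The "refrain from creating a pool that already exists" behaviour described above can be sketched as follows. This is not DeepSea's code; `ensure_pool` and `run_ceph` are hypothetical helpers that shell out to `ceph osd pool ls` and `ceph osd pool create`, and the command runner is injectable so the logic can be exercised without a cluster.

```python
import subprocess
from typing import Callable, List


def run_ceph(args: List[str]) -> str:
    """Run `ceph <args>` and return its stdout (requires a reachable cluster)."""
    result = subprocess.run(["ceph", *args], check=True,
                            capture_output=True, text=True)
    return result.stdout


def ensure_pool(name: str, pg_num: int,
                run: Callable[[List[str]], str] = run_ceph) -> bool:
    """Create `name` with `pg_num` PGs unless it already exists.

    Returns True if the pool was created, False if it was left untouched.
    """
    existing = run(["osd", "pool", "ls"]).splitlines()
    if name in existing:
        # A pre-created pool keeps whatever pg_num it was given.
        return False
    run(["osd", "pool", "create", name, str(pg_num)])
    return True


# Example (needs a running cluster):
#   ensure_pool("cloud-images", 128)
```

With a guard like this, a pool pre-created by the qa scripts with a smaller `pg_num` would survive the orchestration, provided nothing deletes it first.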

tserong commented 6 years ago

> What does "parameterize" mean, though? Like have the SLS call a runner that reads a YAML file where the number of PGs could be optionally set?

Nope, I was just thinking of letting the number of PGs be passed in optionally as another command-line parameter (the last thing I want to do is add more YAML :-)). But let's not do that either; I'll just get rid of the initial pool deletion.