SUSE / DeepSea

A collection of Salt files for deploying, managing and automating Ceph.
GNU General Public License v3.0

osd: forcefully mark an osd as down after stopping the service #1838

Closed jschmid1 closed 4 years ago

jschmid1 commented 4 years ago

Signed-off-by: Joshua Schmid <jschmid@suse.de>

Fixes bsc#1171451

Description:

We seem to have an issue with Ceph 14.2.8+ during OSD replacement/removal: the step in osd.remove/replace that is supposed to destroy/purge the OSD from the CRUSH map bails out when the OSD is not in the expected 'down' state. The OSD should be marked down automatically once its daemon has stopped, but apparently there is a timing issue somewhere. An easy and non-intrusive workaround is to force the OSD down manually after the checks have passed.
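
A rough sketch of the workaround described above, not the code from this PR: once the OSD service has been stopped, look at the osdmap and, if the OSD is still reported as up, issue `ceph osd down` and re-check a few times. The helper names (`run_ceph`, `osd_is_down`, `force_osd_down`) and the retry/delay defaults are hypothetical; only `ceph osd dump` and `ceph osd down` are actual CLI commands.

```python
import json
import subprocess
import time


def run_ceph(args):
    """Run a ceph CLI command and return its stdout (hypothetical helper)."""
    result = subprocess.run(
        ["ceph", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def osd_is_down(osd_id, run=run_ceph):
    """Return True if the osdmap reports the OSD as down (or no longer lists it)."""
    dump = json.loads(run(["osd", "dump", "--format", "json"]))
    for osd in dump.get("osds", []):
        if osd.get("osd") == osd_id:
            return osd.get("up") == 0
    # Not present in the osdmap: nothing left to mark down.
    return True


def force_osd_down(osd_id, run=run_ceph, retries=5, delay=1.0, sleep=time.sleep):
    """Mark the OSD down explicitly until the osdmap agrees, or give up.

    Assumes the OSD service has already been stopped; otherwise the
    daemon would simply report itself up again.
    """
    for _ in range(retries):
        if osd_is_down(osd_id, run):
            return True
        run(["osd", "down", str(osd_id)])
        sleep(delay)
    return osd_is_down(osd_id, run)
```

Under these assumptions, `force_osd_down(3)` returns `True` once osd.3 is reported down and `False` if it is still up after all retries, in which case the caller should stop before the destroy/purge step.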


jschmid1 commented 4 years ago

@susebot run teuthology

jschmid1 commented 4 years ago

@shyukri the make-test seems to fail with

15:00:47 Step 12/15 : RUN echo %_topdir $HOME/rpmbuild | tee $HOME/.rpmmacros
15:00:47  ---> Running in d9925191f871
15:00:49 tee: /home/jenkins/.rpmmacros: No space left on device
susebot commented 4 years ago

Commit 9943d9f9ee7593e8a52f83407b484bb578da6385 is NOT OK for suite deepsea:tier2. Check tests results in the Jenkins job: http://storage-ci.suse.de/job/pr-deepsea/445/

kshtsk commented 4 years ago

retest this please

kshtsk commented 4 years ago

@shyukri the make-test seems to fail with

15:00:47 Step 12/15 : RUN echo %_topdir $HOME/rpmbuild | tee $HOME/.rpmmacros
15:00:47  ---> Running in d9925191f871
15:00:49 tee: /home/jenkins/.rpmmacros: No space left on device

Someone or something enabled worker-0, which is not supposed to be enabled at all, or added an rpm builder label to it...

smithfarm commented 4 years ago

Commit 9943d9f is NOT OK for suite deepsea:tier2. Check tests results in the Jenkins job: http://storage-ci.suse.de/job/pr-deepsea/445/

This is a testing infrastructure issue:

2020-05-11T13:00:44.954 INFO:teuthology.orchestra.run.target-ses-015.stderr:Repository 'development_tools_update' is invalid.
2020-05-11T13:00:44.954 INFO:teuthology.orchestra.run.target-ses-015.stderr:[development_tools_update|http://10.86.0.120/artifacts/ci/snapshot/202004281539/735743cb130e8186e1f8ee215ec164e4fcbc4a212b52cc112d14a83c6be91d7d/SLE-15-SP1-Development-Tools-update] Valid metadata not found at specified URL
2020-05-11T13:00:44.954 INFO:teuthology.orchestra.run.target-ses-015.stderr:History:
2020-05-11T13:00:44.955 INFO:teuthology.orchestra.run.target-ses-015.stderr: - [development_tools_update|http://10.86.0.120/artifacts/ci/snapshot/202004281539/735743cb130e8186e1f8ee215ec164e4fcbc4a212b52cc112d14a83c6be91d7d/SLE-15-SP1-Development-Tools-update] Repository type can't be determined.
2020-05-11T13:00:44.955 INFO:teuthology.orchestra.run.target-ses-015.stderr:
2020-05-11T13:00:44.956 INFO:teuthology.orchestra.run.target-ses-015.stderr:Please check if the URIs defined for this repository are pointing to a valid repository.
...
2020-05-11T13:00:45.701 INFO:teuthology.orchestra.run.target-ses-015.stderr:Repository 'server_update' is invalid.
2020-05-11T13:00:45.702 INFO:teuthology.orchestra.run.target-ses-015.stderr:[server_update|http://10.86.0.120/artifacts/ci/snapshot/202004281539/5e324ec9127e9f090f369bd3a9cfcb68272aafa2268498fd4999069e823b872a/SLE-15-SP1-Server-Applications-update] Valid metadata not found at specified URL
2020-05-11T13:00:45.702 INFO:teuthology.orchestra.run.target-ses-015.stderr:History:
2020-05-11T13:00:45.702 INFO:teuthology.orchestra.run.target-ses-015.stderr: - [server_update|http://10.86.0.120/artifacts/ci/snapshot/202004281539/5e324ec9127e9f090f369bd3a9cfcb68272aafa2268498fd4999069e823b872a/SLE-15-SP1-Server-Applications-update] Repository type can't be determined.
2020-05-11T13:00:45.703 INFO:teuthology.orchestra.run.target-ses-015.stderr:
2020-05-11T13:00:45.703 INFO:teuthology.orchestra.run.target-ses-015.stderr:Please check if the URIs defined for this repository are pointing to a valid repository.
kshtsk commented 4 years ago

The published artifacts need to be updated regularly, since they now get cleaned up.