SUSE / DeepSea

A collection of Salt files for deploying, managing and automating Ceph.
GNU General Public License v3.0
160 stars 75 forks source link

[SES5] remove.osd functest fails #1769

Closed smithfarm closed 4 years ago

smithfarm commented 4 years ago

This functest (remove.osd) is failing reproducibly in the SES5 CI (DeepSea 0.8.12 in SLE-12-SP3).

Relevant portion of the log DeepSea

2019-09-21T06:53:31.182 INFO:teuthology.orchestra.run.target-ses-017:> sudo salt-run deepsea.version 2>/dev/null
2019-09-21T06:53:32.415 INFO:teuthology.orchestra.run.target-ses-017.stdout:0.8.12+git.0.e885d513e

The functests fail like so:

2019-09-21T07:10:07.611 INFO:teuthology.orchestra.run.target-ses-017.stdout:  Name: refresh_pillar2 - Function: salt.state - Result: Changed Started: - 07:05:03.224302 Duration: 417.756 ms
2019-09-21T07:10:07.611 INFO:teuthology.orchestra.run.target-ses-017.stdout:  Name: Update Destroyed for 0 - Function: salt.state - Result: Changed Started: - 07:05:03.642521 Duration: 691.93 ms
2019-09-21T07:10:07.611 INFO:teuthology.orchestra.run.target-ses-017.stdout:  Name: Update Destroyed for 1 - Function: salt.state - Result: Changed Started: - 07:05:04.334859 Duration: 473.937 ms
2019-09-21T07:10:07.612 INFO:teuthology.orchestra.run.target-ses-017.stdout:  Name: disengage.safety - Function: salt.runner - Result: Changed Started: - 07:05:04.809127 Duration: 288.994 ms
2019-09-21T07:10:07.612 INFO:teuthology.orchestra.run.target-ses-017.stdout:  Name: remove.osd - Function: salt.runner - Result: Clean Started: - 07:05:05.098437 Duration: 80419.222 ms
2019-09-21T07:10:07.612 INFO:teuthology.orchestra.run.target-ses-017.stdout:----------
2019-09-21T07:10:07.612 INFO:teuthology.orchestra.run.target-ses-017.stdout:          ID: Check OSDs multiple
2019-09-21T07:10:07.613 INFO:teuthology.orchestra.run.target-ses-017.stdout:    Function: salt.state
2019-09-21T07:10:07.613 INFO:teuthology.orchestra.run.target-ses-017.stdout:      Result: False
2019-09-21T07:10:07.613 INFO:teuthology.orchestra.run.target-ses-017.stdout:     Comment: Run failed on minions: target-ses-017.ecp.suse.de
2019-09-21T07:10:07.613 INFO:teuthology.orchestra.run.target-ses-017.stdout:              Failures:
2019-09-21T07:10:07.614 INFO:teuthology.orchestra.run.target-ses-017.stdout:                  target-ses-017.ecp.suse.de:
2019-09-21T07:10:07.614 INFO:teuthology.orchestra.run.target-ses-017.stdout:                  ----------
2019-09-21T07:10:07.614 INFO:teuthology.orchestra.run.target-ses-017.stdout:                            ID: absent OSDs
2019-09-21T07:10:07.614 INFO:teuthology.orchestra.run.target-ses-017.stdout:                      Function: cmd.run
2019-09-21T07:10:07.615 INFO:teuthology.orchestra.run.target-ses-017.stdout:                          Name: /bin/false
2019-09-21T07:10:07.615 INFO:teuthology.orchestra.run.target-ses-017.stdout:                        Result: False
2019-09-21T07:10:07.615 INFO:teuthology.orchestra.run.target-ses-017.stdout:                       Comment: Command "/bin/false" run
2019-09-21T07:10:07.616 INFO:teuthology.orchestra.run.target-ses-017.stdout:                       Started: 07:06:25.823049
2019-09-21T07:10:07.616 INFO:teuthology.orchestra.run.target-ses-017.stdout:                      Duration: 355.913 ms
2019-09-21T07:10:07.616 INFO:teuthology.orchestra.run.target-ses-017.stdout:                       Changes:
2019-09-21T07:10:07.616 INFO:teuthology.orchestra.run.target-ses-017.stdout:                                ----------
2019-09-21T07:10:07.616 INFO:teuthology.orchestra.run.target-ses-017.stdout:                                pid:
2019-09-21T07:10:07.617 INFO:teuthology.orchestra.run.target-ses-017.stdout:                                    27311
2019-09-21T07:10:07.617 INFO:teuthology.orchestra.run.target-ses-017.stdout:                                retcode:
2019-09-21T07:10:07.617 INFO:teuthology.orchestra.run.target-ses-017.stdout:                                    1
2019-09-21T07:10:07.617 INFO:teuthology.orchestra.run.target-ses-017.stdout:                                stderr:
2019-09-21T07:10:07.618 INFO:teuthology.orchestra.run.target-ses-017.stdout:                                stdout:
2019-09-21T07:10:07.618 INFO:teuthology.orchestra.run.target-ses-017.stdout:
2019-09-21T07:10:07.618 INFO:teuthology.orchestra.run.target-ses-017.stdout:                  Summary for target-ses-017.ecp.suse.de
2019-09-21T07:10:07.618 INFO:teuthology.orchestra.run.target-ses-017.stdout:                  ------------
2019-09-21T07:10:07.618 INFO:teuthology.orchestra.run.target-ses-017.stdout:                  Succeeded: 0 (changed=1)
2019-09-21T07:10:07.619 INFO:teuthology.orchestra.run.target-ses-017.stdout:                  Failed:    1
2019-09-21T07:10:07.619 INFO:teuthology.orchestra.run.target-ses-017.stdout:                  ------------
2019-09-21T07:10:07.619 INFO:teuthology.orchestra.run.target-ses-017.stdout:                  Total states run:     1
2019-09-21T07:10:07.619 INFO:teuthology.orchestra.run.target-ses-017.stdout:                  Total run time: 355.913 ms
2019-09-21T07:10:07.619 INFO:teuthology.orchestra.run.target-ses-017.stdout:     Started: 07:06:25.518005
2019-09-21T07:10:07.620 INFO:teuthology.orchestra.run.target-ses-017.stdout:    Duration: 681.442 ms
2019-09-21T07:10:07.620 INFO:teuthology.orchestra.run.target-ses-017.stdout:     Changes:
2019-09-21T07:10:07.620 INFO:teuthology.orchestra.run.target-ses-017.stdout:  Name: Restore OSDs multiple - Function: salt.state - Result: Changed Started: - 07:06:26.199758 Duration: 26644.487 ms
2019-09-21T07:10:07.620 INFO:teuthology.orchestra.run.target-ses-017.stdout:  Name: Wait for Ceph multiple - Function: salt.state - Result: Changed Started: - 07:06:52.844667 Duration: 6671.577 ms
2019-09-21T07:10:07.620 INFO:teuthology.orchestra.run.target-ses-017.stdout:  Name: Update Destroyed - Function: salt.state - Result: Changed Started: - 07:06:59.516560 Duration: 453.233 ms
2019-09-21T07:10:07.621 INFO:teuthology.orchestra.run.target-ses-017.stdout:  Name: disengage.safety - Function: salt.runner - Result: Changed Started: - 07:06:59.970312 Duration: 313.289 ms
2019-09-21T07:10:07.621 INFO:teuthology.orchestra.run.target-ses-017.stdout:  Name: remove.osd - Function: salt.runner - Result: Clean Started: - 07:07:00.283939 Duration: 21627.422 ms
2019-09-21T07:10:07.621 INFO:teuthology.orchestra.run.target-ses-017.stdout:----------
2019-09-21T07:10:07.621 INFO:teuthology.orchestra.run.target-ses-017.stdout:          ID: Check OSDs delay
2019-09-21T07:10:07.621 INFO:teuthology.orchestra.run.target-ses-017.stdout:    Function: salt.state
2019-09-21T07:10:07.621 INFO:teuthology.orchestra.run.target-ses-017.stdout:      Result: False
2019-09-21T07:10:07.622 INFO:teuthology.orchestra.run.target-ses-017.stdout:     Comment: Run failed on minions: target-ses-017.ecp.suse.de
2019-09-21T07:10:07.622 INFO:teuthology.orchestra.run.target-ses-017.stdout:              Failures:
2019-09-21T07:10:07.622 INFO:teuthology.orchestra.run.target-ses-017.stdout:                  target-ses-017.ecp.suse.de:
2019-09-21T07:10:07.622 INFO:teuthology.orchestra.run.target-ses-017.stdout:                  ----------
2019-09-21T07:10:07.622 INFO:teuthology.orchestra.run.target-ses-017.stdout:                            ID: absent OSDs
2019-09-21T07:10:07.623 INFO:teuthology.orchestra.run.target-ses-017.stdout:                      Function: cmd.run
2019-09-21T07:10:07.623 INFO:teuthology.orchestra.run.target-ses-017.stdout:                          Name: /bin/false
2019-09-21T07:10:07.623 INFO:teuthology.orchestra.run.target-ses-017.stdout:                        Result: False
2019-09-21T07:10:07.623 INFO:teuthology.orchestra.run.target-ses-017.stdout:                       Comment: Command "/bin/false" run
2019-09-21T07:10:07.624 INFO:teuthology.orchestra.run.target-ses-017.stdout:                       Started: 07:07:22.188560
2019-09-21T07:10:07.624 INFO:teuthology.orchestra.run.target-ses-017.stdout:                      Duration: 321.858 ms
2019-09-21T07:10:07.624 INFO:teuthology.orchestra.run.target-ses-017.stdout:                       Changes:
2019-09-21T07:10:07.624 INFO:teuthology.orchestra.run.target-ses-017.stdout:                                ----------
2019-09-21T07:10:07.625 INFO:teuthology.orchestra.run.target-ses-017.stdout:                                pid:
2019-09-21T07:10:07.625 INFO:teuthology.orchestra.run.target-ses-017.stdout:                                    30242
2019-09-21T07:10:07.625 INFO:teuthology.orchestra.run.target-ses-017.stdout:                                retcode:
2019-09-21T07:10:07.625 INFO:teuthology.orchestra.run.target-ses-017.stdout:                                    1
2019-09-21T07:10:07.625 INFO:teuthology.orchestra.run.target-ses-017.stdout:                                stderr:
2019-09-21T07:10:07.626 INFO:teuthology.orchestra.run.target-ses-017.stdout:                                stdout:
2019-09-21T07:10:07.626 INFO:teuthology.orchestra.run.target-ses-017.stdout:
2019-09-21T07:10:07.626 INFO:teuthology.orchestra.run.target-ses-017.stdout:                  Summary for target-ses-017.ecp.suse.de
2019-09-21T07:10:07.626 INFO:teuthology.orchestra.run.target-ses-017.stdout:                  ------------
2019-09-21T07:10:07.626 INFO:teuthology.orchestra.run.target-ses-017.stdout:                  Succeeded: 0 (changed=1)
2019-09-21T07:10:07.626 INFO:teuthology.orchestra.run.target-ses-017.stdout:                  Failed:    1
2019-09-21T07:10:07.627 INFO:teuthology.orchestra.run.target-ses-017.stdout:                  ------------
2019-09-21T07:10:07.627 INFO:teuthology.orchestra.run.target-ses-017.stdout:                  Total states run:     1
2019-09-21T07:10:07.627 INFO:teuthology.orchestra.run.target-ses-017.stdout:                  Total run time: 321.858 ms
2019-09-21T07:10:07.627 INFO:teuthology.orchestra.run.target-ses-017.stdout:     Started: 07:07:21.911656
2019-09-21T07:10:07.627 INFO:teuthology.orchestra.run.target-ses-017.stdout:    Duration: 618.061 ms
2019-09-21T07:10:07.628 INFO:teuthology.orchestra.run.target-ses-017.stdout:     Changes:
2019-09-21T07:10:07.628 INFO:teuthology.orchestra.run.target-ses-017.stdout:  Name: Restore OSDs delay - Function: salt.state - Result: Changed Started: - 07:07:22.530000 Duration: 13199.535 ms
2019-09-21T07:10:07.628 INFO:teuthology.orchestra.run.target-ses-017.stdout:  Name: Wait for Ceph delay - Function: salt.state - Result: Changed Started: - 07:07:35.729948 Duration: 6651.363 ms
2019-09-21T07:10:07.628 INFO:teuthology.orchestra.run.target-ses-017.stdout:  Name: Update Destroyed force - Function: salt.state - Result: Changed Started: - 07:07:42.381635 Duration: 822.637 ms
2019-09-21T07:10:07.628 INFO:teuthology.orchestra.run.target-ses-017.stdout:  Name: disengage.safety - Function: salt.runner - Result: Changed Started: - 07:07:43.204702 Duration: 281.988 ms
2019-09-21T07:10:07.629 INFO:teuthology.orchestra.run.target-ses-017.stdout:  Name: remove.osd - Function: salt.runner - Result: Clean Started: - 07:07:43.487011 Duration: 20890.244 ms
2019-09-21T07:10:07.629 INFO:teuthology.orchestra.run.target-ses-017.stdout:----------
2019-09-21T07:10:07.629 INFO:teuthology.orchestra.run.target-ses-017.stdout:          ID: Check OSDs force
2019-09-21T07:10:07.629 INFO:teuthology.orchestra.run.target-ses-017.stdout:    Function: salt.state
2019-09-21T07:10:07.629 INFO:teuthology.orchestra.run.target-ses-017.stdout:      Result: False
2019-09-21T07:10:07.630 INFO:teuthology.orchestra.run.target-ses-017.stdout:     Comment: Run failed on minions: target-ses-017.ecp.suse.de
2019-09-21T07:10:07.630 INFO:teuthology.orchestra.run.target-ses-017.stdout:              Failures:
2019-09-21T07:10:07.630 INFO:teuthology.orchestra.run.target-ses-017.stdout:                  target-ses-017.ecp.suse.de:
2019-09-21T07:10:07.630 INFO:teuthology.orchestra.run.target-ses-017.stdout:                  ----------
2019-09-21T07:10:07.630 INFO:teuthology.orchestra.run.target-ses-017.stdout:                            ID: absent OSDs
2019-09-21T07:10:07.631 INFO:teuthology.orchestra.run.target-ses-017.stdout:                      Function: cmd.run
2019-09-21T07:10:07.631 INFO:teuthology.orchestra.run.target-ses-017.stdout:                          Name: /bin/false
2019-09-21T07:10:07.631 INFO:teuthology.orchestra.run.target-ses-017.stdout:                        Result: False
2019-09-21T07:10:07.631 INFO:teuthology.orchestra.run.target-ses-017.stdout:                       Comment: Command "/bin/false" run
2019-09-21T07:10:07.631 INFO:teuthology.orchestra.run.target-ses-017.stdout:                       Started: 07:08:05.027911
2019-09-21T07:10:07.632 INFO:teuthology.orchestra.run.target-ses-017.stdout:                      Duration: 382.099 ms
2019-09-21T07:10:07.632 INFO:teuthology.orchestra.run.target-ses-017.stdout:                       Changes:
2019-09-21T07:10:07.632 INFO:teuthology.orchestra.run.target-ses-017.stdout:                                ----------
2019-09-21T07:10:07.632 INFO:teuthology.orchestra.run.target-ses-017.stdout:                                pid:
2019-09-21T07:10:07.632 INFO:teuthology.orchestra.run.target-ses-017.stdout:                                    31992
2019-09-21T07:10:07.632 INFO:teuthology.orchestra.run.target-ses-017.stdout:                                retcode:
2019-09-21T07:10:07.633 INFO:teuthology.orchestra.run.target-ses-017.stdout:                                    1
2019-09-21T07:10:07.633 INFO:teuthology.orchestra.run.target-ses-017.stdout:                                stderr:
2019-09-21T07:10:07.633 INFO:teuthology.orchestra.run.target-ses-017.stdout:                                stdout:
2019-09-21T07:10:07.633 INFO:teuthology.orchestra.run.target-ses-017.stdout:
2019-09-21T07:10:07.633 INFO:teuthology.orchestra.run.target-ses-017.stdout:                  Summary for target-ses-017.ecp.suse.de
2019-09-21T07:10:07.634 INFO:teuthology.orchestra.run.target-ses-017.stdout:                  ------------
2019-09-21T07:10:07.634 INFO:teuthology.orchestra.run.target-ses-017.stdout:                  Succeeded: 0 (changed=1)
2019-09-21T07:10:07.634 INFO:teuthology.orchestra.run.target-ses-017.stdout:                  Failed:    1
2019-09-21T07:10:07.634 INFO:teuthology.orchestra.run.target-ses-017.stdout:                  ------------
2019-09-21T07:10:07.634 INFO:teuthology.orchestra.run.target-ses-017.stdout:                  Total states run:     1
2019-09-21T07:10:07.635 INFO:teuthology.orchestra.run.target-ses-017.stdout:                  Total run time: 382.099 ms
2019-09-21T07:10:07.635 INFO:teuthology.orchestra.run.target-ses-017.stdout:     Started: 07:08:04.377568
2019-09-21T07:10:07.635 INFO:teuthology.orchestra.run.target-ses-017.stdout:    Duration: 1056.056 ms
2019-09-21T07:10:07.635 INFO:teuthology.orchestra.run.target-ses-017.stdout:     Changes:
2019-09-21T07:10:07.635 INFO:teuthology.orchestra.run.target-ses-017.stdout:  Name: Restore OSDs force - Function: salt.state - Result: Changed Started: - 07:08:05.433919 Duration: 14011.729 ms
2019-09-21T07:10:07.635 INFO:teuthology.orchestra.run.target-ses-017.stdout:  Name: Wait for Ceph force - Function: salt.state - Result: Changed Started: - 07:08:19.445950 Duration: 6676.177 ms
2019-09-21T07:10:07.636 INFO:teuthology.orchestra.run.target-ses-017.stdout:  Name: verify iperf3 - Function: salt.state - Result: Changed Started: - 07:08:26.122559 Duration: 857.143 ms
2019-09-21T07:10:07.636 INFO:teuthology.orchestra.run.target-ses-017.stdout:
2019-09-21T07:10:07.636 INFO:teuthology.orchestra.run.target-ses-017.stdout:Summary for target-ses-017.ecp.suse.de_master
2019-09-21T07:10:07.636 INFO:teuthology.orchestra.run.target-ses-017.stdout:--------------
2019-09-21T07:10:07.636 INFO:teuthology.orchestra.run.target-ses-017.stdout:Succeeded: 114 (changed=107)
2019-09-21T07:10:07.637 INFO:teuthology.orchestra.run.target-ses-017.stdout:Failed:      3
2019-09-21T07:10:07.637 INFO:teuthology.orchestra.run.target-ses-017.stdout:--------------
2019-09-21T07:10:07.637 INFO:teuthology.orchestra.run.target-ses-017.stdout:Total states run:     117
2019-09-21T07:10:07.637 INFO:teuthology.orchestra.run.target-ses-017.stdout:Total run time:   438.491 s
2019-09-21T07:10:07.648 DEBUG:teuthology.orchestra.run:got remote process result: 1

Full log: http://ci.ses.suse.de:8080/job/storage-ci-teuthology-suite-devel-5.0/292/testReport/teuthology/Devel/suse_tier1_functional__0_salt_yaml_1_deepsea_yaml_2_services_and_test_yaml_ceph_cm_salt_yaml_cluster__4disks_yaml_roles_yaml__deepsea_cli_off_yaml_distros_sle_12_3_yaml_/attachments/teuthology-suse-1311.log

smithfarm commented 4 years ago

@kshtsk tells me that this failure has been keeping the devel-storage-5.0 CI red for a month now. Can we raise the priority?

swiftgist commented 4 years ago

I just ran salt-run state.orch ceph.functests.1node.remove successfully. Is this environment with the failure accessible?

smithfarm commented 4 years ago

@swiftgist The CI always destroys the environment. Reproducing would involve using the same command (salt-run state.orch ceph.functests.1node) in the same environment (SLE-15-SP1, deepsea 0.8.12, single-node cluster). Is that what you did?

swiftgist commented 4 years ago

I ran salt-run state.orch ceph.functests.1node.remove. I do have a multinode cluster. How many OSDs do you have in your single node cluster? The multiple OSD removal test removes both OSD 0 and 1.

kshtsk commented 4 years ago

we actually try and deploy the system

smithfarm commented 4 years ago

@swiftgist Possibly you are not only running on a multinode cluster, but you might also be running on a newer version of DeepSea - I see the same test succeeds in the CI when run on the tip of the SES5 branch (v0.8.12-12-g162e1139).

@jschmid1 Can we get an incremental release pushed to Devel:Storage:5.0 to fix the breakage there?

swiftgist commented 4 years ago

found the fix after experimenting on the teuthology VM. It's already in #1724.

smithfarm commented 4 years ago

1724 merge commit is 6dff6ca

$ git describe 6dff6ca
v0.8.12-2-g6dff6ca9

@swiftgist Right. Devel:Storage:5.0 still has DeepSea version v0.8.12-0-e885d513e, hence the CI breakage and this bug report. Thanks for fixing it and looking into it.

smithfarm commented 4 years ago

The moral of this story, I guess, is to make sure the CI is passing on the tip of the release branch before cutting a release.

Also, after pushing to Devel:Storage:* I do not submit to maintenance immediately, but rather wait for a CI run in the devel CI and only submit once the CI is green with the new release.