canonical / charm-microceph

Charm to deploy/manage microceph
Apache License 2.0

Intermittent CI failures #85

Open hemanthnakkina opened 1 month ago

hemanthnakkina commented 1 month ago

I have observed intermittent CI failures multiple times during the run of PR#83. However, rerunning the failed jobs multiple times ultimately led to success. This bug is to analyse those errors and fix them.

  1. Juju cluster test, Juju Upgrade test failed in Install Microceph charm step

    https://github.com/canonical/charm-microceph/actions/runs/9415519920/job/25936921291
    https://github.com/canonical/charm-microceph/actions/runs/9425209140/job/25967589299
    The problem seems to be microceph failing to join the node. The token string seems legitimate with respect to the IPs.
    Collected logs: microceph_juju_upgrade_test_logs.zip

  2. Juju Upgrade test failed in Test successfull upgrade step

    https://github.com/canonical/charm-microceph/actions/runs/9425209140/job/25970001521
    Upgrade failed with error: Upgrade on microceph/0 to reef/candidate failed: HEALTH_WARN, {'MON_DOWN': {'severity': 'HEALTH_WARN', 'su...
    Collected logs: microceph_juju_upgrade_test_logs.zip
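
For the second failure, one possible mitigation is to wait for a transient HEALTH_WARN (such as MON_DOWN) to clear before treating the upgrade as failed. The sketch below is only an illustration and is not taken from the charm's code: `wait_for_health_ok` and the `get_health` callable are hypothetical names, and the `(status, checks)` shape is assumed from the error message above.

```python
import time


def wait_for_health_ok(get_health, timeout=300.0, interval=5.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll get_health() until it reports HEALTH_OK or the timeout expires.

    get_health is a hypothetical callable returning (status, checks), e.g.
    ("HEALTH_WARN", {"MON_DOWN": {"severity": "HEALTH_WARN"}}).
    Returns (ok, checks), where checks is the last set of health checks seen.
    """
    deadline = clock() + timeout
    while True:
        status, checks = get_health()
        if status == "HEALTH_OK":
            return True, checks
        if clock() >= deadline:
            # Still unhealthy after the grace period: report the last checks.
            return False, checks
        sleep(interval)
```

A test could call this before asserting on the upgrade result, so that a monitor that is briefly down does not fail the job, while a persistent MON_DOWN still does.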

UtkarshBhatthere commented 1 month ago

Thank you for the detailed bug report @hemanthnakkina.

hemanthnakkina commented 2 weeks ago

The first issue seems to be resolved by https://github.com/canonical/charm-microceph/pull/88. I no longer see CI failures when joining microceph nodes.