canonical / charmed-openstack-upgrader

Automatic upgrade tool for Charmed Openstack
Apache License 2.0

ceph-mon is failing to show the right version of osd and this can be problematic #401

Open gabrielcocenza opened 5 months ago

gabrielcocenza commented 5 months ago

For some unknown reason, after upgrading all ceph-osd units to pacific on a partner cloud, ceph-mon still reports that the OSDs are on octopus.

E.g.:

{
    "mon": {
        "ceph version 16.2.14 (238ba602515df21ea7ffc75c88db29f9e5ef12c9) pacific (stable)": 3
    },
    "mgr": {
        "ceph version 16.2.14 (238ba602515df21ea7ffc75c88db29f9e5ef12c9) pacific (stable)": 3
    },
    "osd": {
        "ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)": 18
    },
    "mds": {},
    "rgw": {
        "ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)": 3
    },
    "overall": {
        "ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)": 21,
        "ceph version 16.2.14 (238ba602515df21ea7ffc75c88db29f9e5ef12c9) pacific (stable)": 6
    }
}

This is problematic because the set_require_osd_release_option step passes, since require-osd-release is also still on octopus, and this lets the cloud continue upgrading. I guess the right thing is to halt the execution if we see more than one ceph version in the cloud.
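A minimal sketch of such a check, assuming the JSON above is the output of `ceph versions` and has already been captured as a string. The function names `releases_in_use` and `ensure_single_release` are hypothetical and are not part of the upgrader's code. The sketch compares release names (octopus, pacific) rather than full version strings, on the assumption that differing point releases such as 16.2.14 and 16.2.15 should not halt the upgrade.

```python
import json
import re

# Matches strings such as:
# "ceph version 16.2.14 (238ba60...) pacific (stable)"
# and captures the release name ("pacific").
VERSION_RE = re.compile(r"^ceph version \S+ \(\w+\) (\w+) \(\w+\)$")


def releases_in_use(versions_json):
    """Return the set of release names listed under the "overall" key."""
    report = json.loads(versions_json)
    releases = set()
    for version_string in report.get("overall", {}):
        match = VERSION_RE.match(version_string)
        if match is None:
            raise ValueError(f"Unrecognised version string: {version_string!r}")
        releases.add(match.group(1))
    return releases


def ensure_single_release(versions_json):
    """Raise RuntimeError if daemons report more than one Ceph release."""
    releases = releases_in_use(versions_json)
    if len(releases) > 1:
        found = ", ".join(sorted(releases))
        raise RuntimeError(f"Mixed Ceph releases reported: {found}")
```

With the payload above, `releases_in_use` would return both octopus and pacific, so `ensure_single_release` would raise. Such a check only makes sense at a point where all daemons are expected to be on the same release, for example after the ceph-osd upgrade step.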

gabrielcocenza commented 5 months ago

After rebooting a ceph-osd machine (not necessary) and restarting the services with sudo systemctl restart ceph.target, the ceph-osd versions were updated, as shown in this payload:

{
    "mon": {
        "ceph version 16.2.14 (238ba602515df21ea7ffc75c88db29f9e5ef12c9) pacific (stable)": 2,
        "ceph version 16.2.15 (618f440892089921c3e944a991122ddc44e60516) pacific (stable)": 1
    },
    "mgr": {
        "ceph version 16.2.14 (238ba602515df21ea7ffc75c88db29f9e5ef12c9) pacific (stable)": 2,
        "ceph version 16.2.15 (618f440892089921c3e944a991122ddc44e60516) pacific (stable)": 1
    },
    "osd": {
        "ceph version 16.2.15 (618f440892089921c3e944a991122ddc44e60516) pacific (stable)": 18
    },
    "mds": {},
    "rgw": {
        "ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)": 2,
        "ceph version 16.2.14 (238ba602515df21ea7ffc75c88db29f9e5ef12c9) pacific (stable)": 1
    },
    "overall": {
        "ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)": 2,
        "ceph version 16.2.14 (238ba602515df21ea7ffc75c88db29f9e5ef12c9) pacific (stable)": 5,
        "ceph version 16.2.15 (618f440892089921c3e944a991122ddc44e60516) pacific (stable)": 20
    }
}

rgw had the same issue as osd
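To see which daemon types still lag behind, the same kind of payload can be broken down per daemon type. This is only a sketch: `lagging_daemons` is a hypothetical helper, and it assumes the payload has the shape shown in this issue (one object per daemon type, mapping version strings to daemon counts).

```python
import json
import re

# Captures the release name from strings such as
# "ceph version 15.2.17 (8a82819...) octopus (stable)".
VERSION_RE = re.compile(r"^ceph version \S+ \(\w+\) (\w+) \(\w+\)$")


def lagging_daemons(versions_json, expected_release):
    """Map each daemon type to the number of daemons not on expected_release."""
    report = json.loads(versions_json)
    lagging = {}
    for daemon_type, versions in report.items():
        if daemon_type == "overall":
            continue  # "overall" only aggregates the other sections
        count = 0
        for version_string, daemons in versions.items():
            match = VERSION_RE.match(version_string)
            if match is None:
                raise ValueError(f"Unrecognised version string: {version_string!r}")
            if match.group(1) != expected_release:
                count += daemons
        if count:
            lagging[daemon_type] = count
    return lagging
```

For the payload after the restart, calling this with "pacific" would report only the two rgw daemons still on octopus.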

valexby commented 4 months ago

Hi, I was also affected by this during the Octopus -> Pacific upgrade of ceph-osd, in the last step of a Victoria -> Wallaby upgrade. I also reported bug LP2068151 against the ceph-osd charm, as it is not clear why the charm did not restart the services.