canonical / microceph

MicroCeph is snap-deployed Ceph with built-in clustering
https://snapcraft.io/microceph
GNU Affero General Public License v3.0

ceph pool is not created when running sunbeam enable observability #393

Open marosg42 opened 2 months ago

marosg42 commented 2 months ago

Issue report

What version of MicroCeph are you using ?

reef/beta with sunbeam 2024.1/beta

What are the steps to reproduce this issue ?

Enable several features in sunbeam; enabling observability usually causes an error.

https://bugs.launchpad.net/snap-openstack/+bug/2073734

What happens (observed behaviour) ?

sunbeam enable observability fails with a timeout while waiting for the openstack model. The ceilometer and gnocchi units remain in waiting status:

ceilometer/0 waiting idle 10.1.166.33 (workload) Leader not ready
ceilometer/1 waiting idle 10.1.1.97 (workload) Leader not ready
ceilometer/2* waiting idle 10.1.97.24 (workload) Not all relations are ready
gnocchi/0 waiting idle 10.1.166.37 (workload) Leader not ready
gnocchi/1* waiting idle 10.1.237.18 (workload) Not all relations are ready
gnocchi/2 waiting idle 10.1.189.155 (workload) Leader not ready

The reason is that the gnocchi Ceph pool is not created:

unit-microceph-0: 16:33:43 WARNING unit.microceph/0.ceph-relation-changed Error ERANGE: pg_num 64 size 3 for this pool would result in 272 cumulative PGs per OSD (1635 total PG replicas on 6 'in' root OSDs by crush rule) which exceeds the mon_max_pg_per_osd value of 250
unit-microceph-0: 16:33:43 ERROR unit.microceph/0.juju-log ceph:14: Command '['ceph', '--id', 'admin', 'osd', 'pool', 'create', '--pg-num-min=32', 'gnocchi', '64']' returned non-zero exit status 34.
unit-microceph-0: 16:33:43 ERROR unit.microceph/0.juju-log ceph:14: Unexpected error occurred while processing requests: {'api-version': 1, 'ops': [{'op': 'create-pool', 'name': 'gnocchi', 'replicas': 3, 'pg_num': None, 'crush-profile': None, 'app-name': 'rbd', 'compression-algorithm': None, 'compression-mode': None, 'compression-required-ratio': None, 'compression-min-blob-size': None, 'compression-min-blob-size-hdd': None, 'compression-min-blob-size-ssd': None, 'compression-max-blob-size': None, 'compression-max-blob-size-hdd': None, 'compression-max-blob-size-ssd': None, 'group': None, 'max-bytes': None, 'max-objects': None, 'group-namespace': None, 'rbd-mirroring-mode': 'pool', 'weight': 40}], 'request-id': '441182d1e20a95f745487994768a3e33a000eba0'}
unit-microceph-0: 16:33:43 INFO unit.microceph/0.juju-log ceph:14: {"exit-code": 1, "stderr": "Unexpected error occurred while processing requests: {'api-version': 1, 'ops': [{'op': 'create-pool', 'name': 'gnocchi', 'replicas': 3, 'pg_num': None, 'crush-profile': None, 'app-name': 'rbd', 'compression-algorithm': None, 'compression-mode': None, 'compression-required-ratio': None, 'compression-min-blob-size': None, 'compression-min-blob-size-hdd': None, 'compression-min-blob-size-ssd': None, 'compression-max-blob-size': None, 'compression-max-blob-size-hdd': None, 'compression-max-blob-size-ssd': None, 'group': None, 'max-bytes': None, 'max-objects': None, 'group-namespace': None, 'rbd-mirroring-mode': 'pool', 'weight': 40}], 'request-id': '441182d1e20a95f745487994768a3e33a000eba0'}"}

What were you expecting to happen ?

gnocchi pool is created and sunbeam is happy

Additional comments.

As a workaround I use sudo ceph tell mon.* config set mon_max_pg_per_osd 350

sabaini commented 2 months ago

@javacruft I believe the issue here is that the gnocchi charm requests too large a pg_num. It should probably adapt its pg_num request to the number of available OSDs. What do you think?

UtkarshBhatthere commented 2 months ago

Sorry for nitpicking but this should be a charm-microceph issue.

marosg42 commented 2 months ago

After digging into this more with @UtkarshBhatthere: the problem occurs when the cumulative number of PGs per OSD that would result from creating a pool exceeds mon_max_pg_per_osd. That number grows by 32 with each pool that is added. Here is an example:

ubuntu@cractus:~$ sudo ceph --id admin osd pool create --pg-num-min=32 mg6 64
pool 'mg6' created
ubuntu@cractus:~$ sudo ceph --id admin osd pool create --pg-num-min=32 mg7 64
Error ERANGE:  pg_num 64 size 3 for this pool would result in 368 cumulative PGs per OSD (2211 total PG replicas on 6 'in' root OSDs by crush rule) which exceeds the mon_max_pg_per_osd value of 336
ubuntu@cractus:~$ sudo ceph tell mon.* config set mon_max_pg_per_osd 368
ubuntu@cractus:~$ sudo ceph --id admin osd pool create --pg-num-min=32 mg7 64
pool 'mg7' created
ubuntu@cractus:~$ sudo ceph --id admin osd pool create --pg-num-min=32 mg8 64
Error ERANGE:  pg_num 64 size 3 for this pool would result in 400 cumulative PGs per OSD (2403 total PG replicas on 6 'in' root OSDs by crush rule) which exceeds the mon_max_pg_per_osd value of 368
ubuntu@cractus:~$ sudo ceph tell mon.* config set mon_max_pg_per_osd 400
ubuntu@cractus:~$ sudo ceph --id admin osd pool create --pg-num-min=32 mg8 64
pool 'mg8' created
ubuntu@cractus:~$ sudo ceph --id admin osd pool create --pg-num-min=32 mg9 64
Error ERANGE:  pg_num 64 size 3 for this pool would result in 432 cumulative PGs per OSD (2595 total PG replicas on 6 'in' root OSDs by crush rule) which exceeds the mon_max_pg_per_osd value of 400
ubuntu@cractus:~$ sudo ceph tell mon.* config set mon_max_pg_per_osd 432
ubuntu@cractus:~$ sudo ceph --id admin osd pool create --pg-num-min=32 mg9 64
pool 'mg9' created
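
The step of 32 in the session above is consistent with simple arithmetic: a pool with pg_num 64 and size 3 adds 64 * 3 = 192 PG replicas, which comes to 32 per OSD when spread over the 6 'in' OSDs. A minimal Python sketch, using only numbers from the error messages (the function name is made up for illustration):

```python
def added_pgs_per_osd(pg_num: int, size: int, num_osds: int) -> int:
    """PG replicas that one new pool adds, averaged over the OSDs."""
    return pg_num * size // num_osds


# Total PG replicas quoted in the three ERANGE messages (mg7, mg8, mg9).
reported_totals = [2211, 2403, 2595]
# Difference between consecutive totals, i.e. what each extra pool adds.
steps = [b - a for a, b in zip(reported_totals, reported_totals[1:])]

print(added_pgs_per_osd(64, 3, 6))  # 32
print(steps)                        # [192, 192]
```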

Not sure how or whether this is related, but it is certainly confusing:

ubuntu@cractus:~$ sudo ceph tell mon.0 config get mon_max_pg_per_osd
{
    "mon_max_pg_per_osd": "464"
}
ubuntu@cractus:~$ sudo ceph status
  cluster:
    id:     1467f16f-8bbd-45af-b67c-0e0fd2db12dd
    health: HEALTH_WARN
            Reduced data availability: 7 pgs inactive
            too many PGs per OSD (464 > max 250)
ubuntu@cractus:~$ sudo ceph osd df                                                                                  
ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE   DATA     OMAP  META     AVAIL    %USE  VAR   PGS  STATUS
 1         0.27280   1.00000  279 GiB   514 MiB  417 MiB   0 B   97 MiB  279 GiB  0.18  0.74  280      up
 6         0.36389   1.00000  373 GiB  1013 MiB  951 MiB   0 B   62 MiB  372 GiB  0.27  1.09  358      up
 5         0.90970   1.00000  932 GiB   2.2 GiB  2.1 GiB   0 B  144 MiB  929 GiB  0.24  0.97  760      up
 4         0.43669   1.00000  447 GiB   1.3 GiB  1.2 GiB   0 B   74 MiB  446 GiB  0.28  1.17  428      up          
 3         0.21829   1.00000  224 GiB   617 MiB  526 MiB   0 B   91 MiB  223 GiB  0.27  1.11  228      up          
 2         0.87129   1.00000  892 GiB   2.1 GiB  2.0 GiB   0 B  105 MiB  890 GiB  0.23  0.96  733      up
                       TOTAL  3.1 TiB   7.6 GiB  7.1 GiB   0 B  572 MiB  3.1 TiB  0.24                   
MIN/MAX VAR: 0.74/1.17  STDDEV: 0.03
ubuntu@cractus:~$ sudo ceph osd pool autoscale-status
POOL                   SIZE  TARGET SIZE  RATE  RAW CAPACITY   RATIO  TARGET RATIO  EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE  BULK   
.rgw.root             1386                 3.0         3146G  0.0000                                  1.0      32              on         False  
default.rgw.log       3702                 3.0         3146G  0.0000                                  1.0      32              on         False  
.mgr                 576.5k                3.0         3146G  0.0000                                  1.0       1              on         False  
default.rgw.control      0                 3.0         3146G  0.0000                                  1.0      32              on         False  
default.rgw.meta         0                 3.0         3146G  0.0000                                  4.0      32              on         False  
cinder-ceph              0                 3.0         3146G  0.3333        0.4000           0.3333   1.0     128              on         False  
glance                2399M                3.0         3146G  0.3333        0.4000           0.3333   1.0      32              on         False  
gnocchi                 14                 3.0         3146G  0.3333        0.4000           0.3333   1.0      64              on         False  
mg                       0                 3.0         3146G  0.0000                                  1.0      64              on         False  
mg1                      0                 3.0         3146G  0.0000                                  1.0      64              on         False  
mg2                      0                 3.0         3146G  0.0000                                  1.0      64              on         False  
mg4                      0                 3.0         3146G  0.0000                                  1.0      64              on         False  
mg6                      0                 3.0         3146G  0.0000                                  1.0      64              on         False  
mg7                      0                 3.0         3146G  0.0000                                  1.0      64              on         False  
mg8                      0                 3.0         3146G  0.0000                                  1.0      64              on         False  
mg9                      0                 3.0         3146G  0.0000                                  1.0      64              on         False  
mg10                     0                 3.0         3146G  0.0000                                  1.0      64              on         False  

sabaini commented 2 months ago

For reference, this is how crossing the mon_max_pg_per_osd threshold is calculated: the algorithm projects how many PGs each pool could have, sums the pools' target PG numbers, and then checks whether the result exceeds mon_max_pg_per_osd.
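
As a rough illustration of that check (not the actual Ceph code), the numbers in the ERANGE messages fit a model in which the total projected PG replicas are divided by the number of 'in' OSDs, rounded down, and compared with mon_max_pg_per_osd. The rounding and the strict "greater than" comparison are assumptions inferred from the values in this thread, and the function names are hypothetical:

```python
def projected_pgs_per_osd(total_pg_replicas: int, num_in_osds: int) -> int:
    # Assumption: rounded down, which matches every value logged in this issue.
    return total_pg_replicas // num_in_osds


def exceeds_limit(total_pg_replicas: int, num_in_osds: int, max_pg_per_osd: int) -> bool:
    # Assumption: strictly greater, since a limit of 368 allowed a pool
    # whose projected value was exactly 368.
    return projected_pgs_per_osd(total_pg_replicas, num_in_osds) > max_pg_per_osd


# Values from the original report: 1635 PG replicas on 6 OSDs, limit 250.
print(projected_pgs_per_osd(1635, 6))  # 272
print(exceeds_limit(1635, 6, 250))     # True  (pool creation rejected)
print(exceeds_limit(1635, 6, 350))     # False (with the workaround value)
```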

marosg42 commented 2 months ago

A similar test, to confirm that this has nothing to do with the differing disk sizes present in the original issue. Here the disks are all the same size, and the behavior is the same.

ubuntu@solqa-lab1-server-10:~$ sudo ceph osd df
ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA      OMAP  META     AVAIL    %USE  VAR   PGS  STATUS
 1         0.21829   1.00000  224 GiB  1.2 GiB   907 MiB   0 B  344 MiB  222 GiB  0.55  0.96  149      up
 2         0.21829   1.00000  224 GiB  1.4 GiB  1017 MiB   0 B  404 MiB  222 GiB  0.62  1.09  158      up
 3         0.21829   1.00000  224 GiB  1.4 GiB   958 MiB   0 B  510 MiB  222 GiB  0.64  1.13  173      up
 4         0.21829   1.00000  224 GiB  1.2 GiB   861 MiB   0 B  356 MiB  222 GiB  0.53  0.94  154      up
 5         0.21829   1.00000  224 GiB  1.2 GiB   845 MiB   0 B  379 MiB  222 GiB  0.53  0.94  159      up
 6         0.21829   1.00000  224 GiB  1.2 GiB   762 MiB   0 B  449 MiB  222 GiB  0.53  0.93  170      up
                       TOTAL  1.3 TiB  7.6 GiB   5.2 GiB   0 B  2.4 GiB  1.3 TiB  0.57                   
MIN/MAX VAR: 0.93/1.13  STDDEV: 0.05

ubuntu@solqa-lab1-server-10:~$ sudo ceph --id admin osd pool create --pg-num-min=32 mg1 64
pool 'mg1' created
ubuntu@solqa-lab1-server-10:~$ sudo ceph --id admin osd pool create --pg-num-min=32 mg2 64
pool 'mg2' created
ubuntu@solqa-lab1-server-10:~$ sudo ceph --id admin osd pool create --pg-num-min=32 mg3 64
Error ERANGE:  pg_num 64 size 3 for this pool would result in 256 cumulative PGs per OSD (1539 total PG replicas on 6 'in' root OSDs by crush rule) which exceeds the mon_max_pg_per_osd value of 250
ubuntu@solqa-lab1-server-10:~$ sudo ceph tell mon.* config set mon_max_pg_per_osd 256
mon.solqa-lab1-server-10: {
    "success": "mon_max_pg_per_osd = '256' (not observed, change may require restart) "
}
mon.solqa-lab1-server-12: {
    "success": "mon_max_pg_per_osd = '256' (not observed, change may require restart) "
}
mon.solqa-lab1-server-13: {
    "success": "mon_max_pg_per_osd = '256' (not observed, change may require restart) "
}
ubuntu@solqa-lab1-server-10:~$ sudo ceph --id admin osd pool create --pg-num-min=32 mg3 64
pool 'mg3' created
ubuntu@solqa-lab1-server-10:~$ sudo ceph --id admin osd pool create --pg-num-min=32 mg4 64
Error ERANGE:  pg_num 64 size 3 for this pool would result in 288 cumulative PGs per OSD (1731 total PG replicas on 6 'in' root OSDs by crush rule) which exceeds the mon_max_pg_per_osd value of 256
ubuntu@solqa-lab1-server-10:~$ sudo ceph osd df
ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA      OMAP  META     AVAIL    %USE  VAR   PGS  STATUS
 1         0.21829   1.00000  224 GiB  1.2 GiB   907 MiB   0 B  344 MiB  222 GiB  0.55  0.96  262      up
 2         0.21829   1.00000  224 GiB  1.4 GiB  1017 MiB   0 B  404 MiB  222 GiB  0.62  1.09  252      up
 3         0.21829   1.00000  224 GiB  1.4 GiB   958 MiB   0 B  510 MiB  222 GiB  0.64  1.13  265      up
 4         0.21829   1.00000  224 GiB  1.2 GiB   861 MiB   0 B  356 MiB  222 GiB  0.53  0.94  249      up
 5         0.21829   1.00000  224 GiB  1.2 GiB   845 MiB   0 B  383 MiB  222 GiB  0.54  0.94  254      up
 6         0.21829   1.00000  224 GiB  1.2 GiB   762 MiB   0 B  449 MiB  222 GiB  0.53  0.93  257      up
                       TOTAL  1.3 TiB  7.6 GiB   5.2 GiB   0 B  2.4 GiB  1.3 TiB  0.57                   
MIN/MAX VAR: 0.93/1.13  STDDEV: 0.05
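
The PGS column of the last ceph osd df output is consistent with the error messages: the six values sum to 1539, the total quoted when mg3 was first rejected, and one more pool with pg_num 64 and size 3 would bring it to 1731, or 288 per OSD. A short Python check using only figures shown above (rounding down is an assumption that matches the logged values):

```python
# PGS column from the final `ceph osd df` output, OSDs 1 to 6.
pgs_per_osd = [262, 252, 265, 249, 254, 257]

current_total = sum(pgs_per_osd)
# One more pool with pg_num 64 and size 3 adds 64 * 3 PG replicas.
next_total = current_total + 64 * 3
next_per_osd = next_total // len(pgs_per_osd)

print(current_total)       # 1539
print(next_total)          # 1731
print(next_per_osd)        # 288
print(next_per_osd > 256)  # True, so creating mg4 is rejected
```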