marosg42 opened 2 months ago
@javacruft I believe the issue here is that the gnocchi charm requests too large a pg_num; it should probably adapt its pg_num request based on the number of available OSDs. What do you think?
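For context, a rough sketch of the kind of sizing logic being suggested (illustrative only, not the charm's actual code), assuming the common rule of thumb of roughly 100 PGs per OSD shared across the pools a deployment creates:

# Illustrative sketch, not the charm's current logic. Assumes the common
# target of ~100 PGs per OSD, split across the pools being created.
OSDS=$(sudo ceph osd stat -f json | jq .num_up_osds)
REPLICAS=3    # pool replica size
POOLS=10      # number of pools sharing the budget (assumption)
echo $(( OSDS * 100 / REPLICAS / POOLS ))   # per-pool pg_num; round down to a power of two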
Sorry for nitpicking, but this should be a charm-microceph issue.
After digging into this more with @UtkarshBhatthere: the problem happens when creating a pool would push the cumulative number of PGs per OSD above mon_max_pg_per_osd, and that cumulative number grows by 32 with each pool added. Here is an example:
ubuntu@cractus:~$ sudo ceph --id admin osd pool create --pg-num-min=32 mg6 64
pool 'mg6' created
ubuntu@cractus:~$ sudo ceph --id admin osd pool create --pg-num-min=32 mg7 64
Error ERANGE: pg_num 64 size 3 for this pool would result in 368 cumulative PGs per OSD (2211 total PG replicas on 6 'in' root OSDs by crush rule) which exceeds the mon_max_pg_per_osd value of 336
ubuntu@cractus:~$ sudo ceph tell mon.* config set mon_max_pg_per_osd 368
ubuntu@cractus:~$ sudo ceph --id admin osd pool create --pg-num-min=32 mg7 64
pool 'mg7' created
ubuntu@cractus:~$ sudo ceph --id admin osd pool create --pg-num-min=32 mg8 64
Error ERANGE: pg_num 64 size 3 for this pool would result in 400 cumulative PGs per OSD (2403 total PG replicas on 6 'in' root OSDs by crush rule) which exceeds the mon_max_pg_per_osd value of 368
ubuntu@cractus:~$ sudo ceph tell mon.* config set mon_max_pg_per_osd 400
ubuntu@cractus:~$ sudo ceph --id admin osd pool create --pg-num-min=32 mg8 64
pool 'mg8' created
ubuntu@cractus:~$ sudo ceph --id admin osd pool create --pg-num-min=32 mg9 64
Error ERANGE: pg_num 64 size 3 for this pool would result in 432 cumulative PGs per OSD (2595 total PG replicas on 6 'in' root OSDs by crush rule) which exceeds the mon_max_pg_per_osd value of 400
ubuntu@cractus:~$ sudo ceph tell mon.* config set mon_max_pg_per_osd 432
ubuntu@cractus:~$ sudo ceph --id admin osd pool create --pg-num-min=32 mg9 64
pool 'mg9' created
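The 32-PG step between the errors above follows from the pool parameters: each pool requests 64 PGs at size 3, spread over 6 'in' OSDs.

# Each new pool adds pg_num * size / num_in_osds PG replicas per OSD:
echo $(( 64 * 3 / 6 ))   # -> 32, matching the 368 -> 400 -> 432 progression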
Not sure how or whether this is related, but it is certainly confusing:
ubuntu@cractus:~$ sudo ceph tell mon.0 config get mon_max_pg_per_osd
{
"mon_max_pg_per_osd": "464"
}
ubuntu@cractus:~$ sudo ceph status
cluster:
id: 1467f16f-8bbd-45af-b67c-0e0fd2db12dd
health: HEALTH_WARN
Reduced data availability: 7 pgs inactive
too many PGs per OSD (464 > max 250)
ubuntu@cractus:~$ sudo ceph osd df
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
1 0.27280 1.00000 279 GiB 514 MiB 417 MiB 0 B 97 MiB 279 GiB 0.18 0.74 280 up
6 0.36389 1.00000 373 GiB 1013 MiB 951 MiB 0 B 62 MiB 372 GiB 0.27 1.09 358 up
5 0.90970 1.00000 932 GiB 2.2 GiB 2.1 GiB 0 B 144 MiB 929 GiB 0.24 0.97 760 up
4 0.43669 1.00000 447 GiB 1.3 GiB 1.2 GiB 0 B 74 MiB 446 GiB 0.28 1.17 428 up
3 0.21829 1.00000 224 GiB 617 MiB 526 MiB 0 B 91 MiB 223 GiB 0.27 1.11 228 up
2 0.87129 1.00000 892 GiB 2.1 GiB 2.0 GiB 0 B 105 MiB 890 GiB 0.23 0.96 733 up
TOTAL 3.1 TiB 7.6 GiB 7.1 GiB 0 B 572 MiB 3.1 TiB 0.24
MIN/MAX VAR: 0.74/1.17 STDDEV: 0.03
ubuntu@cractus:~$ sudo ceph osd pool autoscale-status
POOL SIZE TARGET SIZE RATE RAW CAPACITY RATIO TARGET RATIO EFFECTIVE RATIO BIAS PG_NUM NEW PG_NUM AUTOSCALE BULK
.rgw.root 1386 3.0 3146G 0.0000 1.0 32 on False
default.rgw.log 3702 3.0 3146G 0.0000 1.0 32 on False
.mgr 576.5k 3.0 3146G 0.0000 1.0 1 on False
default.rgw.control 0 3.0 3146G 0.0000 1.0 32 on False
default.rgw.meta 0 3.0 3146G 0.0000 4.0 32 on False
cinder-ceph 0 3.0 3146G 0.3333 0.4000 0.3333 1.0 128 on False
glance 2399M 3.0 3146G 0.3333 0.4000 0.3333 1.0 32 on False
gnocchi 14 3.0 3146G 0.3333 0.4000 0.3333 1.0 64 on False
mg 0 3.0 3146G 0.0000 1.0 64 on False
mg1 0 3.0 3146G 0.0000 1.0 64 on False
mg2 0 3.0 3146G 0.0000 1.0 64 on False
mg4 0 3.0 3146G 0.0000 1.0 64 on False
mg6 0 3.0 3146G 0.0000 1.0 64 on False
mg7 0 3.0 3146G 0.0000 1.0 64 on False
mg8 0 3.0 3146G 0.0000 1.0 64 on False
mg9 0 3.0 3146G 0.0000 1.0 64 on False
mg10 0 3.0 3146G 0.0000 1.0 64 on False
For reference, here is how crossing the mon_max_pg_per_osd threshold is calculated: the algorithm projects how many PGs each pool could grow to, sums up the pools' target PG numbers, and checks whether that total crosses the mon_max_pg_per_osd threshold.
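A minimal sketch of that projection, reconstructed from the error messages above (field names assumed from the JSON output of ceph osd pool ls detail and ceph osd stat; an approximation, not Ceph's actual source):

# Sum pg_num_target * size over existing pools, add the new pool's
# contribution, divide by the 'in' OSD count, and compare the result
# against mon_max_pg_per_osd (pool creation fails with ERANGE if larger).
TOTAL=0
while read -r PGS SIZE; do
    TOTAL=$(( TOTAL + PGS * SIZE ))
done < <(sudo ceph osd pool ls detail -f json | jq -r '.[] | "\(.pg_num_target) \(.size)"')
TOTAL=$(( TOTAL + 64 * 3 ))                             # the pool being created
IN_OSDS=$(sudo ceph osd stat -f json | jq .num_in_osds)
echo $(( TOTAL / IN_OSDS ))                             # e.g. 2211 / 6 = 368 above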
A similar test, just to confirm this has nothing to do with the differing disk sizes present in the original issue. With disks of the same size, the behavior is the same:
ubuntu@solqa-lab1-server-10:~$ sudo ceph osd df
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
1 0.21829 1.00000 224 GiB 1.2 GiB 907 MiB 0 B 344 MiB 222 GiB 0.55 0.96 149 up
2 0.21829 1.00000 224 GiB 1.4 GiB 1017 MiB 0 B 404 MiB 222 GiB 0.62 1.09 158 up
3 0.21829 1.00000 224 GiB 1.4 GiB 958 MiB 0 B 510 MiB 222 GiB 0.64 1.13 173 up
4 0.21829 1.00000 224 GiB 1.2 GiB 861 MiB 0 B 356 MiB 222 GiB 0.53 0.94 154 up
5 0.21829 1.00000 224 GiB 1.2 GiB 845 MiB 0 B 379 MiB 222 GiB 0.53 0.94 159 up
6 0.21829 1.00000 224 GiB 1.2 GiB 762 MiB 0 B 449 MiB 222 GiB 0.53 0.93 170 up
TOTAL 1.3 TiB 7.6 GiB 5.2 GiB 0 B 2.4 GiB 1.3 TiB 0.57
MIN/MAX VAR: 0.93/1.13 STDDEV: 0.05
ubuntu@solqa-lab1-server-10:~$ sudo ceph --id admin osd pool create --pg-num-min=32 mg1 64
pool 'mg1' created
ubuntu@solqa-lab1-server-10:~$ sudo ceph --id admin osd pool create --pg-num-min=32 mg2 64
pool 'mg2' created
ubuntu@solqa-lab1-server-10:~$ sudo ceph --id admin osd pool create --pg-num-min=32 mg3 64
Error ERANGE: pg_num 64 size 3 for this pool would result in 256 cumulative PGs per OSD (1539 total PG replicas on 6 'in' root OSDs by crush rule) which exceeds the mon_max_pg_per_osd value of 250
ubuntu@solqa-lab1-server-10:~$ sudo ceph tell mon.* config set mon_max_pg_per_osd 256
mon.solqa-lab1-server-10: {
"success": "mon_max_pg_per_osd = '256' (not observed, change may require restart) "
}
mon.solqa-lab1-server-12: {
"success": "mon_max_pg_per_osd = '256' (not observed, change may require restart) "
}
mon.solqa-lab1-server-13: {
"success": "mon_max_pg_per_osd = '256' (not observed, change may require restart) "
}
ubuntu@solqa-lab1-server-10:~$ sudo ceph --id admin osd pool create --pg-num-min=32 mg3 64
pool 'mg3' created
ubuntu@solqa-lab1-server-10:~$ sudo ceph --id admin osd pool create --pg-num-min=32 mg4 64
Error ERANGE: pg_num 64 size 3 for this pool would result in 288 cumulative PGs per OSD (1731 total PG replicas on 6 'in' root OSDs by crush rule) which exceeds the mon_max_pg_per_osd value of 256
ubuntu@solqa-lab1-server-10:~$ sudo ceph osd df
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
1 0.21829 1.00000 224 GiB 1.2 GiB 907 MiB 0 B 344 MiB 222 GiB 0.55 0.96 262 up
2 0.21829 1.00000 224 GiB 1.4 GiB 1017 MiB 0 B 404 MiB 222 GiB 0.62 1.09 252 up
3 0.21829 1.00000 224 GiB 1.4 GiB 958 MiB 0 B 510 MiB 222 GiB 0.64 1.13 265 up
4 0.21829 1.00000 224 GiB 1.2 GiB 861 MiB 0 B 356 MiB 222 GiB 0.53 0.94 249 up
5 0.21829 1.00000 224 GiB 1.2 GiB 845 MiB 0 B 383 MiB 222 GiB 0.54 0.94 254 up
6 0.21829 1.00000 224 GiB 1.2 GiB 762 MiB 0 B 449 MiB 222 GiB 0.53 0.93 257 up
TOTAL 1.3 TiB 7.6 GiB 5.2 GiB 0 B 2.4 GiB 1.3 TiB 0.57
MIN/MAX VAR: 0.93/1.13 STDDEV: 0.05
Issue report
What version of MicroCeph are you using?
reef/beta with sunbeam 2024.1/beta
What are the steps to reproduce this issue?
Enable several features in sunbeam; usually, enabling observability causes the error:
https://bugs.launchpad.net/snap-openstack/+bug/2073734
What happens (observed behaviour)?
sunbeam enable observability
fails with a timeout waiting for the openstack model. There are ceilometer and gnocchi units in waiting status. The reason is that the ceph pool is not created.
What were you expecting to happen?
The gnocchi pool is created and sunbeam is happy.
Additional comments.
As a workaround, I use:
sudo ceph tell mon.* config set mon_max_pg_per_osd 350
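Note that ceph tell applies the change only to the running daemons (and the "not observed, change may require restart" responses above suggest it may not fully take effect); to persist the value across restarts via the monitor config database instead:

sudo ceph config set mon mon_max_pg_per_osd 350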
…