ceph / ceph-medic

find common issues in ceph clusters
MIT License

checks: warn on non-standard OSD reweights with `ceph osd tree` #135

Open alfredodeza opened 5 years ago

alfredodeza commented 5 years ago

Todo: Figure out what a "non-standard reweight" is

jeanchlopez commented 5 years ago

See my other comment on https://github.com/ceph/ceph-medic/issues/136

Here, what we'll be checking is not the CRUSH weight (harder to track, because it has to be established from the drive type and capacity) but the OSD reweight factor, which is modified with the `ceph osd reweight` command and easily seen in the `ceph osd tree` output (second-to-last column).

jeanchlopez commented 5 years ago

The following is what you want to walk through for the OSD reweight factor:

ceph osd tree -f json | jq '.nodes[] | {id, reweight}'
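For context, here is a minimal Python sketch of what that jq filter does, run against a hand-written sample in the shape of `ceph osd tree -f json` output. The values (and the `id_and_reweight` helper name) are illustrative, not taken from a real cluster or from ceph-medic itself:

```python
import json

# Illustrative sample shaped like `ceph osd tree -f json` output.
# Non-OSD nodes (roots, hosts) carry no "reweight" field.
sample = json.loads("""
{
  "nodes": [
    {"id": -1, "name": "default", "type": "root"},
    {"id": 0, "name": "osd.0", "type": "osd", "reweight": 1.0},
    {"id": 1, "name": "osd.1", "type": "osd", "reweight": 0.85004}
  ]
}
""")

def id_and_reweight(tree):
    """Python equivalent of: jq '.nodes[] | {id, reweight}'"""
    return [{"id": n["id"], "reweight": n.get("reweight")}
            for n in tree["nodes"]]

for entry in id_and_reweight(sample):
    print(entry)
```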

alfredodeza commented 5 years ago

I see that in a vanilla Ceph cluster the OSDs have a reweight of 1. What should I be looking for? Something that is different from the others? What would a "non-standard reweight" be?

I re-read the comments on #136, but I'm not clear on how to apply that to this issue.

jeanchlopez commented 5 years ago

Any value different from 1. That will indicate some manual rebalancing has been attempted.
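Putting that together, the warn check might look something like the sketch below. The `nonstandard_reweights` function name, the sample tree, and the tolerance are all assumptions for illustration, not ceph-medic's actual API; the tolerance guards against fixed-point rounding, since Ceph stores the reweight internally as a 16-bit fixed-point value:

```python
def nonstandard_reweights(tree, tolerance=1e-4):
    """Return (name, reweight) for every OSD whose reweight factor
    differs from 1 by more than `tolerance`."""
    flagged = []
    for node in tree.get("nodes", []):
        if node.get("type") != "osd":
            continue  # roots and hosts have no reweight factor
        reweight = node.get("reweight")
        if reweight is not None and abs(reweight - 1.0) > tolerance:
            flagged.append((node["name"], reweight))
    return flagged

# Illustrative sample shaped like `ceph osd tree -f json` output.
sample_tree = {
    "nodes": [
        {"id": -1, "name": "default", "type": "root"},
        {"id": 0, "name": "osd.0", "type": "osd", "reweight": 1.0},
        {"id": 1, "name": "osd.1", "type": "osd", "reweight": 0.85004},
    ]
}

for name, reweight in nonstandard_reweights(sample_tree):
    print(f"warning: {name} has a non-standard reweight of {reweight}")
```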

JC
