Mdstat - Githubissues

StarvingMarvin commented 8 months ago

My stab at parsing mdstat file in relation to #278 . Code seems to be working as intended, but is a bit messy. I'm opening a pull request to gather feedback before sinking too much time cleaning things up "the wrong way".

I abstained from adding regex dependency. It would definitely make some things easier, but the format is so full of nested optional bits that it wouldn't clear up all the messiness of parsing it. I see from support.md that there are many unsupported files. I'm not sure if it's because of lack of interest for them, or they are also tricky to parse? If later, maybe going for something like logos would be even more useful than just regex.

Another thing is that there are many output states that seem possible from looking at the code generating it, but I'm not sure how to produce them, so there are lot of places where the parser is coded from my understanding of the generating code, rather than any example I've found, or managed to reproduce.

eminence commented 7 months ago

Thanks for starting work on this!

I see from support.md that there are many unsupported files. I'm not sure if it's because of lack of interest for them, or they are also tricky to parse?

Sometimes it's a mix of both. I personally don't have a need for mdstat parsing (my machines don't have MD disks), but I'd like procfs to eventually handle everything, so I welcome community contributions like this.

I'm not familiar with the /proc/mdstat format, so I won't comment on the parsing code for that today. But I will say that, in general, code straight-forwardness is a priority over runtime parsing speed. I'm not against adding a dependency like regex or logos, but if the parsing code is simple enough to do ourselves, that would be my first preference.

Another thing is that there are many output states that seem possible from looking at the code generating it, but I'm not sure how to produce them, so there are lot of places where the parser is coded from my understanding of the generating code, rather than any example I've found, or managed to reproduce.

Understood, this is a challenge. If you know what code is generating the mdstat output, including references to that would be very useful

StarvingMarvin commented 7 months ago

I'm not familiar with the /proc/mdstat format, so I won't comment on the parsing code for that today. But I will say that, in general, code straight-forwardness is a priority over runtime parsing speed. I'm not against adding a dependency like regex or logos, but if the parsing code is simple enough to do ourselves, that would be my first preference.

That makes sense. I can rework the code with that in mind. Another question I have is: to what extent should parsing be strict? Should the code bail-out at the first sign of trouble, or return info on the best-effort basis or something in between?

Understood, this is a challenge. If you know what code is generating the mdstat output, including references to that would be very useful

In general, method referenced in an original issue is en entry point for generating a mdstat file. Just going through all the if branches that can produce some output, we can see that there can apparently be a md device with no "personality" aka a raid type, or one without disks. The other thing I'm not sure how to do is non-persistant or external super block. Finally, at some point part of the output is delegated to the personality itself, which is for the specific raid types included as a part of an md driver easy to check, but I'm not sure if there are other stuff or even off-tree modules that plug themselves there and output whatever info they want.

For testing purposes, this can create a raid to play arround with:

N_DISKS=5
for i in $(seq $N_DISKS);
do
    fallocate -l 1G loop$i.img # create an empty 1G file
    losetup -f loop$i.img  # create a loopback device from it (-f uses the first unallocated loopback number)
done

mdadm --create /dev/md0 --level=LEVEL --raid-devices=N_DISKS /dev/loop{F..F+N}`

eminence commented 6 months ago

Another question I have is: to what extent should parsing be strict? Should the code bail-out at the first sign of trouble, or return info on the best-effort basis or something in between?

Good question. This is mostly a judgement call, and it takes into account the complicatedness of the format, how the data would be represented in procfs, how the data would be used, etc.

A reasonable example to look at might be the CpuInfo struct, where most of the data is stored in a HashMap. Except for some well-known fields (like model name), no attempt is made to try to handle every possible value, and instead everything is just stored into a HashMap and anyone using a CpuInfo needs to do their own work to figure out what values it should care about.

My general approach has been: if a procfs file that we are parsing has an unexpected format, then that is a bug in procfs, to be handled with InternalError. If we think it's too hard to parse all of the expected values (like if we're not sure we can enumerate all the known raid types), then I would vote that those get stored as some generic string field (instead of an enum of known values). I'm not sure that was the best answer to your question. If you have any specific cases you want to bring up, we can also talk about them

eminence / procfs

Mdstat #293