flux-framework / flux-coral2

Plugins and services for Flux on CORAL2 systems
GNU Lesser General Public License v3.0

Allow more MDT flexibility in Rabbit lustre allocations #171

Open jameshcorbett opened 2 months ago

jameshcorbett commented 2 months ago

Problem: by default, Flux creates one MDT per rabbit for Lustre file systems. However, directivebreakdown resources include information that may indicate that fewer (or more) MDTs should be created.

@behlendorf said:

We really haven't done much testing with multiple rabbits but being able to control the number of MDTs and OSTs is going to be important at scale.

Further, creating one MDT per rabbit

[Is] going to be an issue at scale. Lustre has issues beyond 50'ish MDTs in a filesystem. It should work, but performance will get worse as MDTs are added.

Flux should look at directivebreakdowns to see if they offer hints on how many MDTs to create.
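
As a rough illustration of the idea, here is a minimal sketch of how an MDT count might be derived from a directivebreakdown. It is not the flux-coral2 implementation. The function name `mdt_count`, the `max_mdts` cap, and the fallback rules are hypothetical, and the field names (`label`, `allocationStrategy`, `constraints.count`) are assumptions modeled on the allocation sets described in the directive breakdown documentation, so they should be checked against the actual resource before use.

```python
def mdt_count(allocation_sets, num_rabbits, max_mdts=50):
    """Choose how many MDTs to create for a Lustre file system.

    allocation_sets: list of dicts, assumed to mirror the allocation
        sets of a directivebreakdown (field names are assumptions).
    num_rabbits: number of rabbits in the allocation.
    max_mdts: hypothetical cap, reflecting the comment that Lustre
        degrades beyond roughly 50 MDTs per file system.
    """
    for alloc in allocation_sets:
        # Only metadata allocation sets say anything about MDTs.
        if alloc.get("label") not in ("mdt", "mgtmdt"):
            continue
        constraints = alloc.get("constraints") or {}
        if "count" in constraints:
            # An explicit count is the strongest hint.
            wanted = constraints["count"]
        elif alloc.get("allocationStrategy") == "AllocateSingleServer":
            # A single-server strategy implies a single MDT.
            wanted = 1
        else:
            # No hint: keep the current one-MDT-per-rabbit default.
            wanted = num_rabbits
        return max(1, min(wanted, max_mdts))
    # No metadata allocation set found: fall back to the default.
    return max(1, min(num_rabbits, max_mdts))
```

With this sketch, a breakdown whose `mdt` set uses `AllocateSingleServer` would yield one MDT regardless of how many rabbits are allocated, while a breakdown with no hints on 100 rabbits would be capped at `max_mdts` instead of creating 100 MDTs.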

jameshcorbett commented 2 months ago

Some documentation is here: https://nearnodeflash.github.io/dev/guides/directive-breakdown/readme/