open-mpi / ompi

Open MPI main development repository
https://www.open-mpi.org

Improve documentation of the `coll tuned` component #12641

Open burlen opened 1 week ago

burlen commented 1 week ago

Is your feature request related to a problem? Please describe. The `coll tuned` component has sophisticated tuning capabilities, yet these are scarcely documented. The thresholds used to select a collective algorithm are based on experience with existing systems, so users often need to tune when running on new systems. Tuning may also be beneficial in general, since the thresholds reflect average experience across a variety of systems and may not be the best for any individual system. Understanding how to use the tuning features required a lot of digging through the source code and the issue tracker, plus trial and error.
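To illustrate what "thresholds used to select a collective algorithm" means, here is a minimal Python sketch of threshold-based selection. It is not Open MPI code: the `Rule` type, the `select_algorithm` function, the threshold values, and the way the algorithm names are assigned to size ranges are all hypothetical and chosen only to show the idea.

```python
from dataclasses import dataclass

INF = float("inf")


@dataclass(frozen=True)
class Rule:
    """One hypothetical decision rule: applies when both limits are satisfied."""
    max_comm_size: float  # largest communicator size (ranks) the rule covers
    max_msg_bytes: float  # largest message size (bytes) the rule covers
    algorithm: str        # illustrative algorithm label, not a real setting


# Hypothetical built-in thresholds, standing in for values chosen from
# "average experience" across many systems. Rules are checked in order.
DEFAULT_RULES = [
    Rule(8, 4096, "recursive_doubling"),
    Rule(8, INF, "ring"),
    Rule(INF, 65536, "recursive_doubling"),
    Rule(INF, INF, "segmented_ring"),
]

# Hypothetical site-specific rules, as a user might derive from benchmarks
# on a new system where a ring algorithm wins at smaller message sizes.
TUNED_RULES = [
    Rule(INF, 1024, "recursive_doubling"),
    Rule(INF, INF, "ring"),
]


def select_algorithm(comm_size, msg_bytes, rules=DEFAULT_RULES):
    """Return the algorithm of the first rule whose thresholds cover the call."""
    for rule in rules:
        if comm_size <= rule.max_comm_size and msg_bytes <= rule.max_msg_bytes:
            return rule.algorithm
    raise ValueError("no rule matched")


if __name__ == "__main__":
    # Same call, different answer once the thresholds are tuned for the system.
    print(select_algorithm(64, 4096))
    print(select_algorithm(64, 4096, TUNED_RULES))
```

The point of the sketch is only that the same collective call can map to a different algorithm once the thresholds are adjusted, which is why documenting how to adjust them matters.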

Describe the solution you'd like A small amount of documentation in the user guide that summarizes key information can make tuning far easier for those new to it.

Describe alternatives you've considered The current lack of documentation makes this important feature difficult to use.

Additional context See also #8157, #7672, #12589, #12547, #12453 among others