tud-zih-energy / lo2s

Linux OTF2 Sampling - A Lightweight Node-Level Performance Monitoring Tool
https://tu-dresden.de/zih/forschung/projekte/lo2s?set_language=en
GNU General Public License v3.0

Bump file descriptor rlimit to hard rlimit by default #315

Closed cvonelm closed 5 months ago

cvonelm commented 5 months ago

The default soft limit on open file descriptors per process is 1024 on most Linux systems. This causes crashes on most HPC systems I've used recently, as even simple lo2s invocations exceed this limit with all the per-core perf_event_open calls.

This microscopic soft limit is in place because select() can only handle fds < 1024 (FD_SETSIZE). If you do not plan to use select() in your code, it is safe to raise the soft limit to the hard limit.

However, we need to save the old limit and restore it before we start the program under measurement, as resource limits are inherited by forked processes and we cannot guarantee that the program under measurement does not do stupid stuff with select().
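
The save/raise/restore sequence described above could look roughly like the following. This is only a minimal sketch, not the code in this PR: the helper names `raise_fd_limit_to_hard` and `restore_fd_limit` are made up for illustration, and reporting failures via `std::system_error` is an assumption about how errors might be handled.

```cpp
#include <sys/resource.h>

#include <cerrno>
#include <system_error>

// Hypothetical helper (name is an assumption, not taken from lo2s):
// raises the soft RLIMIT_NOFILE to the hard limit and returns the
// previous limits so the caller can restore them later.
rlimit raise_fd_limit_to_hard()
{
    rlimit old_limit{};
    if (getrlimit(RLIMIT_NOFILE, &old_limit) == -1)
    {
        throw std::system_error(errno, std::generic_category(), "getrlimit(RLIMIT_NOFILE)");
    }

    rlimit new_limit = old_limit;
    new_limit.rlim_cur = old_limit.rlim_max; // soft limit := hard limit
    if (setrlimit(RLIMIT_NOFILE, &new_limit) == -1)
    {
        throw std::system_error(errno, std::generic_category(), "setrlimit(RLIMIT_NOFILE)");
    }
    return old_limit;
}

// Hypothetical helper: puts the saved limits back. The hard limit was
// never changed, so this works without extra privileges.
void restore_fd_limit(const rlimit& old_limit)
{
    if (setrlimit(RLIMIT_NOFILE, &old_limit) == -1)
    {
        throw std::system_error(errno, std::generic_category(), "setrlimit(RLIMIT_NOFILE)");
    }
}
```

In this sketch, the monitoring process would call `raise_fd_limit_to_hard()` once at startup, and the forked child would call `restore_fd_limit()` with the saved value before it execs the program under measurement, so that program sees the original soft limit.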

tilsche commented 5 months ago

All those calls need error handling.