The pqos tool fails with the following errors on systems with 300 or more CPU cores:
$ pqos
NOTE: Mixed use of MSR and kernel interfaces to manage
CAT or CMT & MBM may lead to unexpected behavior.
ERROR: Could not open /sys/fs/resctrl directory
ERROR: Failed to stop resctrl events
ERROR: Failed to start all selected OS monitoring events
Monitoring start error on core(s) 339, status 1
By default, the file descriptor limit for a session is 1024. pqos monitoring opens 3 descriptors per CPU for perf monitoring, so it exhausts the 1024 limit on systems with 300 or more CPUs.
Fix the issue by detecting the number of CPUs in the system, reading the current descriptor limit with the getrlimit system call, and raising it with setrlimit. The limit is increased to 4 times the number of CPUs, which covers the 3 perf descriptors per CPU and leaves headroom for other open files.
Description
By default, the file descriptor limit for a session is 1024. pqos
monitoring opens 3 descriptors per CPU for perf monitoring, so it exhausts
the 1024 limit on systems with 300 or more CPUs.
Fix the issue by detecting the number of CPUs in the system, reading the
current descriptor limit with the getrlimit system call, and raising it
with setrlimit. The limit is increased to 4 times the number of CPUs, which
covers the 3 perf descriptors per CPU and leaves headroom for other open
files.
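For reviewers, the approach described above can be sketched roughly as follows. This is a minimal illustration and not the code in this patch: the helper names raise_fd_limit and raise_fd_limit_for_online_cpus are hypothetical, and using sysconf(_SC_NPROCESSORS_ONLN) to count CPUs is an assumption about how the CPU count might be obtained.

```c
#include <sys/resource.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the actual patch): make sure the soft
 * RLIMIT_NOFILE limit is at least 4 * num_cpus.
 * Returns 0 on success, -1 on error.
 */
static int
raise_fd_limit(long num_cpus)
{
        struct rlimit lim;
        rlim_t wanted;

        if (num_cpus <= 0)
                return -1;

        /* 3 perf descriptors per CPU plus headroom for other files */
        wanted = (rlim_t)num_cpus * 4;

        if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
                return -1;

        /* Nothing to do if the current soft limit is already enough */
        if (lim.rlim_cur >= wanted)
                return 0;

        /*
         * An unprivileged process cannot raise the soft limit above
         * the hard limit, so clamp the request to the hard limit.
         */
        if (lim.rlim_max != RLIM_INFINITY && wanted > lim.rlim_max)
                wanted = lim.rlim_max;

        lim.rlim_cur = wanted;

        return setrlimit(RLIMIT_NOFILE, &lim) == 0 ? 0 : -1;
}

/*
 * Hypothetical wrapper: size the limit from the number of online
 * CPUs. Using sysconf() here is an assumption for illustration.
 */
static int
raise_fd_limit_for_online_cpus(void)
{
        return raise_fd_limit(sysconf(_SC_NPROCESSORS_ONLN));
}
```

In this sketch the soft limit is only ever raised, never lowered, and the request is clamped to the hard limit. On a system where the hard limit is itself below 4 times the CPU count, monitoring could still run out of descriptors, so a real implementation may want to report that case.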
Affected parts
[x] library
[x] pqos utility
[ ] rdtset utility
[ ] App QoS
[ ] other: (please specify)
Motivation and Context
How Has This Been Tested?
Tested on an AMD system.
Types of changes
[x] Bug fix (non-breaking change which fixes an issue)
[ ] New feature (non-breaking change which adds functionality)
[ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
Checklist:
[x] My code follows the code style of this project.
[ ] My change requires a change to the documentation.