intel / intel-cmt-cat

User space software for Intel(R) Resource Director Technology
http://www.intel.com/content/www/us/en/architecture-and-technology/resource-director-technology.html
Other
693 stars 183 forks source link

lib/cpuinfo: Increase the file descriptors limit to handle more CPUs #262

Closed babumoger closed 8 months ago

babumoger commented 8 months ago

The pqos tool fails with the following errors on systems with 300 or more CPU cores. $pqos NOTE: Mixed use of MSR and kernel interfaces to manage CAT or CMT & MBM may lead to unexpected behavior. ERROR: Could not open /sys/fs/resctrl directory ERROR: Failed to stop resctrl events ERROR: Failed to start all selected OS monitoring events Monitoring start error on core(s) 339, status 1

By default, the file descriptor limit is set to 1024 for a session. pqos monitor uses 3 descriptors for each CPU for perf monitoring. So, it runs out of limit(1024) on systems with 300 or more CPUs.

Fix the issue by detecting the number of CPUs in the system and increasing the descriptor limit using system call getrlimit and setrlimit respectively. Increase the limit to 4 times the number of CPUs to take care of open files limit.

Description

By default, the file descriptor limit is set to 1024 for a session. pqos monitor uses 3 descriptors for each CPU for perf monitoring. So, it runs out of limit(1024) on systems with 300 or more CPUs.

Fix the issue by detecting the number of CPUs in the system and increasing the descriptor limit using system call getrlimit and setrlimit respectively. Increase the limit to 4 times the number of CPUs to take care of open files limit.

Affected parts

Motivation and Context

How Has This Been Tested?

Tested on AMD system.

Types of changes

Checklist: