intel / pcm

Intel® Performance Counter Monitor (Intel® PCM)

Cannot build pcm-sensor-server for macOS #612

Closed by MatteoBax 11 months ago

MatteoBax commented 1 year ago

Hi, if I try to compile pcm-sensor-server by running the following commands inside the build folder:

cmake ..  && make -j8 pcm-sensor-server

I receive:

make: *** No rule to make target `pcm-sensor-server'.  Stop.

Is pcm-sensor-server supported for macOS?
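
The "No rule to make target" error means the generated Makefile does not define that target at all. A minimal sketch for checking this before building, assuming CMake's "Unix Makefiles" generator (its built-in `help` target lists targets as `... <name>`); the function name `has_make_target` is hypothetical and not part of pcm:

```shell
# Hypothetical helper: succeed if the configured build tree defines a target.
# Assumes CMake's "Unix Makefiles" generator, whose `help` target prints
# one line per target in the form "... <name>".
has_make_target() {
    build_dir=$1
    target=$2
    cmake --build "$build_dir" --target help 2>/dev/null |
        awk -v t="$target" '$1 == "..." && $2 == t { found = 1 } END { exit !found }'
}

# Usage, from inside the build folder after running `cmake ..`:
#   has_make_target . pcm-sensor-server && make -j8 pcm-sensor-server
```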

opcm commented 1 year ago

No, it is not. Patches are welcome.

gogohaja commented 1 year ago

I found that building pcm-sensor-server on macOS is excluded in src/CMakeLists.txt. I would also like to build pcm-sensor-server on a Mac.

opcm commented 12 months ago

@gogohaja @MatteoBax could you try this branch? https://github.com/intel/pcm/tree/opcm-patch-pcm-sensor-server-osx If it works, we can include it in the mainline.

MatteoBax commented 11 months ago

Infinite loop of:

WARNING: Core 0 IA32_PERFEVTSEL0_ADDR is not zeroed 18446744073709551615
Warning: PMU appears to be busy, do you want to reset it? (y/n)

gogohaja commented 11 months ago

A 'Segmentation Fault' error occurs for unknown reasons.

sudo ./pcm-sensor-server -r
password:

===== Processor Information =====
Hybrid processor: No
IBRS and IBPB support: Yes
STIBP Support: Yes
Specifications Arch Cap Support: Yes
Maximum CPUID level: 22
CPU Model Number: 158
Number of physical cores: 1
Number of logical cores: 12
Number of online logical cores: 12
Threads per physical core (logical cores): 8
Number of sockets: 4
Physical cores per socket: 0
Last level cache fragment per socket: 0
Core PMU (perfmon) version: 4
Number of core PMU typical (programmable) counters: 4
Typical (programmable) counter width: 48 bits
Number of core PMU fixed counters: 3
Fixed counter width: 48 bits
Nominal core frequency: 3700000000Hz
Enable IBRS in Kernel: No
Enable STIBP in Kernel: No
The processor is not susceptible to bad data cache loads.
The processor supports enhanced IBRS.
Package Thermal Specifications Power: 95W; Minimum package power: 0 watts; Package maximum power: 0W;

Info: 0 UBOX devices detected.
Socket 0: 0 PCU devices detected. 0 IIO device detected. 0 IRP unit detected. 0 CHA/CBO units detected. 0 MDF units detected. 0 CXL devices detected.
Socket 1: 0 PCU devices detected. 0 IIO device detected. 0 IRP unit detected. 0 CHA/CBO units detected. 0 MDF units detected. 0 CXL devices detected.
Socket 2: 0 PCU devices detected. 0 IIO device detected. 0 IRP unit detected. 0 CHA/CBO units detected. 0 MDF units detected. 0 CXL devices detected.
Socket 3: 0 PCU devices detected. 0 IIO device detected. 0 IRP unit detected. 0 CHA/CBO units detected. 0 MDF units detected. 0 CXL devices detected.

WARNING: Custom counter 0 is in use. MSR_PERF_GLOBAL_INUSE on core 0: 0x800000070000000f
WARNING: Custom counter 1 is in use. MSR_PERF_GLOBAL_INUSE on core 0: 0x800000070000000f
WARNING: Custom counter 2 is in use. MSR_PERF_GLOBAL_INUSE on core 0: 0x800000070000000f
WARNING: Custom counter 3 is in use. MSR_PERF_GLOBAL_INUSE on core 0: 0x800000070000000f
WARNING: Core 0 IA32_PERFEVTSEL0_ADDR is not set to 0. 4399313
    Zeroed PMU registers
Start a regular HTTP server at http://localhost:9738/
[1] 43000 Segmentation Error sudo ./pcm-sensor-server -r

opcm commented 11 months ago

Regarding the infinite loop of PMU warnings: did you run into this issue (signing/SIP)? https://github.com/intel/pcm/issues/608#issuecomment-1822965185

opcm commented 11 months ago

Thanks for testing. Could you please run it in gdb and provide the call stack of the crash?

MatteoBax commented 11 months ago

I get the same segmentation fault.

MatteoBax commented 11 months ago

Callstack of the crash:

Warning: PMU appears to be busy, do you want to reset it? (y/n)
y
 Zeroed PMU registers
Starting plain HTTP server on http://localhost:9738/
Process 939 stopped
* thread #19, stop reason = EXC_BAD_ACCESS (code=1, address=0x8)
    frame #0: 0x000000010005dc15 pcm-sensor-server`pcm::BasicCounterState::readAndAggregate(std::__1::shared_ptr<pcm::SafeMsrHandle>) + 165
pcm-sensor-server`pcm::BasicCounterState::readAndAggregate:
->  0x10005dc15 <+165>: movq   0x8(%rax), %rax
    0x10005dc19 <+169>: testq  %rax, %rax
    0x10005dc1c <+172>: je     0x10005ead2               ; <+3938>
    0x10005dc22 <+178>: movq   %rsi, %r15
  thread #20, stop reason = EXC_BAD_ACCESS (code=1, address=0x8)
    frame #0: 0x000000010005dc15 pcm-sensor-server`pcm::BasicCounterState::readAndAggregate(std::__1::shared_ptr<pcm::SafeMsrHandle>) + 165
pcm-sensor-server`pcm::BasicCounterState::readAndAggregate:
->  0x10005dc15 <+165>: movq   0x8(%rax), %rax
    0x10005dc19 <+169>: testq  %rax, %rax
    0x10005dc1c <+172>: je     0x10005ead2               ; <+3938>
    0x10005dc22 <+178>: movq   %rsi, %r15
  thread #21, stop reason = EXC_BAD_ACCESS (code=1, address=0x8)
    frame #0: 0x000000010005dc15 pcm-sensor-server`pcm::BasicCounterState::readAndAggregate(std::__1::shared_ptr<pcm::SafeMsrHandle>) + 165
pcm-sensor-server`pcm::BasicCounterState::readAndAggregate:
->  0x10005dc15 <+165>: movq   0x8(%rax), %rax
    0x10005dc19 <+169>: testq  %rax, %rax
    0x10005dc1c <+172>: je     0x10005ead2               ; <+3938>
    0x10005dc22 <+178>: movq   %rsi, %r15
  thread #22, stop reason = EXC_BAD_ACCESS (code=1, address=0x8)
    frame #0: 0x000000010005dc15 pcm-sensor-server`pcm::BasicCounterState::readAndAggregate(std::__1::shared_ptr<pcm::SafeMsrHandle>) + 165
pcm-sensor-server`pcm::BasicCounterState::readAndAggregate:
->  0x10005dc15 <+165>: movq   0x8(%rax), %rax
    0x10005dc19 <+169>: testq  %rax, %rax
    0x10005dc1c <+172>: je     0x10005ead2               ; <+3938>
    0x10005dc22 <+178>: movq   %rsi, %r15
  thread #23, stop reason = EXC_BAD_ACCESS (code=1, address=0x8)
    frame #0: 0x000000010005dc15 pcm-sensor-server`pcm::BasicCounterState::readAndAggregate(std::__1::shared_ptr<pcm::SafeMsrHandle>) + 165
pcm-sensor-server`pcm::BasicCounterState::readAndAggregate:
->  0x10005dc15 <+165>: movq   0x8(%rax), %rax
    0x10005dc19 <+169>: testq  %rax, %rax
    0x10005dc1c <+172>: je     0x10005ead2               ; <+3938>
    0x10005dc22 <+178>: movq   %rsi, %r15
  thread #24, stop reason = EXC_BAD_ACCESS (code=1, address=0x8)
    frame #0: 0x000000010005dc15 pcm-sensor-server`pcm::BasicCounterState::readAndAggregate(std::__1::shared_ptr<pcm::SafeMsrHandle>) + 165
pcm-sensor-server`pcm::BasicCounterState::readAndAggregate:
->  0x10005dc15 <+165>: movq   0x8(%rax), %rax
    0x10005dc19 <+169>: testq  %rax, %rax
    0x10005dc1c <+172>: je     0x10005ead2               ; <+3938>
    0x10005dc22 <+178>: movq   %rsi, %r15
  thread #25, stop reason = EXC_BAD_ACCESS (code=1, address=0x8)
    frame #0: 0x000000010005dc15 pcm-sensor-server`pcm::BasicCounterState::readAndAggregate(std::__1::shared_ptr<pcm::SafeMsrHandle>) + 165
pcm-sensor-server`pcm::BasicCounterState::readAndAggregate:
->  0x10005dc15 <+165>: movq   0x8(%rax), %rax
    0x10005dc19 <+169>: testq  %rax, %rax
    0x10005dc1c <+172>: je     0x10005ead2               ; <+3938>
    0x10005dc22 <+178>: movq   %rsi, %r15
Target 3: (pcm-sensor-server) stopped.

MatteoBax commented 11 months ago

I turned off SIP and the loop no longer occurs.

opcm commented 11 months ago

Thanks. Could you please type "bt" to see the full call stack of the crashing thread (with all frames)?
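
The output posted in this thread comes from lldb, the default debugger on macOS. As a rough sketch, the backtrace could also be collected without an interactive session by running lldb in batch mode. The function name `crash_backtrace` is hypothetical; the lldb options used are `-b` (batch mode), `-o` (command to run after loading the target), and `-k` (command to run only if the target crashes):

```shell
# Hypothetical helper: run a program as root under lldb in batch mode and,
# if it crashes, print the backtrace of every thread before quitting.
crash_backtrace() {
    sudo lldb -b -o run -k 'thread backtrace all' -k quit -- "$@"
}

# Usage, from the directory containing the binary:
#   crash_backtrace ./pcm-sensor-server -r
```

If the program does not crash, it keeps running under the debugger until it exits or is interrupted.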

MatteoBax commented 11 months ago

Full call stack of the crashing thread:

* thread #19, stop reason = EXC_BAD_ACCESS (code=1, address=0x8)
  * frame #0: 0x000000010005dc15 pcm-sensor-server`pcm::BasicCounterState::readAndAggregate(std::__1::shared_ptr<pcm::SafeMsrHandle>) + 165
    frame #1: 0x0000000100096e18 pcm-sensor-server`pcm::Aggregator::dispatch(pcm::HyperThread*)::'lambda'(pcm::HyperThread*)::operator()(pcm::HyperThread*) const + 520
    frame #2: 0x0000000100096c06 pcm-sensor-server`std::__1::__packaged_task_func<std::__1::__bind<pcm::Aggregator::dispatch(pcm::HyperThread*)::'lambda'(pcm::HyperThread*)&, pcm::HyperThread*&>, std::__1::allocator<std::__1::__bind<pcm::Aggregator::dispatch(pcm::HyperThread*)::'lambda'(pcm::HyperThread*)&, pcm::HyperThread*&>>, pcm::CoreCounterState ()>::operator()() + 22
    frame #3: 0x0000000100097144 pcm-sensor-server`std::__1::packaged_task<pcm::CoreCounterState ()>::operator()() + 100
    frame #4: 0x0000000100094479 pcm-sensor-server`pcm::ThreadPool::execute(pcm::ThreadPool*) + 41
    frame #5: 0x00000001000381b0 pcm-sensor-server`void* std::__1::__thread_proxy[abi:v160006]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, std::__1::__bind<void (*)(pcm::ThreadPool*), pcm::ThreadPool*>>>(void*) + 48
    frame #6: 0x00007ff8120e9202 libsystem_pthread.dylib`_pthread_start + 99
    frame #7: 0x00007ff8120e4bab libsystem_pthread.dylib`thread_start + 15

ogbrugge commented 11 months ago

That is very clearly a null pointer access: the faulting address 0x8 is a small offset read through a null base pointer. I'm not sure how that is possible with all the shared_ptrs; I will need to look into that.

ogbrugge commented 11 months ago

Would it be possible to run pcm-sensor-server with all debug output enabled? It is a documented command-line switch (see --help). Please set it to 5, redirect the output to a file, and attach the file here.

MatteoBax commented 11 months ago

debug.log

ogbrugge commented 11 months ago

Is this the full log up to the point where the errors appear, and nothing more? Oh my...

MatteoBax commented 11 months ago

This is the entire log from the sudo ./pcm-sensor-server -D 5 -r command
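
For reference, a minimal sketch of how such a log could be captured. The options `-D 5 -r` are the ones quoted above; the function name `capture_debug_log` and the default file name are only illustrative:

```shell
# Hypothetical helper: run pcm-sensor-server with debug level 5 and save
# everything it prints (stdout and stderr) to a log file.
capture_debug_log() {
    log_file=${1:-debug.log}
    # 2>&1 sends warnings written to stderr into the same file.
    sudo ./pcm-sensor-server -D 5 -r > "$log_file" 2>&1
}

# Usage, from the directory containing the binary:
#   capture_debug_log debug.log
```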

ogbrugge commented 11 months ago

Thanks, this means the problem happens quite soon, if not immediately, after startup. @opcm, I'm not sure what the cause is, but this could be related to things not being properly initialized. What fix did you make for the other macOS problem?

opcm commented 11 months ago

As far as I remember, the other macOS problem did not require any fix in pcm: https://github.com/intel/pcm/issues/608#issuecomment-1823701578

opcm commented 11 months ago

I believe there is an issue with the identification of the CPU topology. @MatteoBax, would it be possible to run pcm as root with the environment variable PCM_PRINT_TOPOLOGY=1 set? (Note: https://unix.stackexchange.com/questions/202383/how-to-pass-environment-variable-to-sudo-su)
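
A minimal sketch of one way to pass the variable through sudo, assuming the local sudoers policy allows setting variables on the sudo command line. The function name `run_with_topology` is hypothetical:

```shell
# Hypothetical helper: run a command as root with PCM_PRINT_TOPOLOGY=1.
# Writing the assignment after `sudo` asks sudo itself to set the variable
# for the command. `PCM_PRINT_TOPOLOGY=1 sudo ...` may not work, because
# sudo resets most of the caller's environment by default.
run_with_topology() {
    sudo PCM_PRINT_TOPOLOGY=1 "$@"
}

# Usage, from the directory containing the binary:
#   run_with_topology ./pcm
```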

MatteoBax commented 11 months ago

@opcm you are right, there was a problem with identifying the CPU topology:

=====  Processor topology  =====
OS_Processor    Thread_Id       Core_Id         Tile_Id         Package_Id      Core_Type       Native_CPU_Model
0               0               0               0               0               unknown         0               
1               0               0               0               0               unknown         0               
2               0               1               0               0               unknown         0               
3               0               1               0               0               unknown         0               
4               0               2               0               0               unknown         0               
5               0               2               0               0               unknown         0               
6               0               3               0               0               unknown         0               
7               0               3               0               0               unknown         0

opcm commented 11 months ago

@MatteoBax the https://github.com/intel/pcm/tree/opcm-patch-pcm-sensor-server-osx branch has been updated with the new topology code for macOS. Could you please do the following:

  1. download the new version from https://github.com/intel/pcm/tree/opcm-patch-pcm-sensor-server-osx
  2. rebuild it (both user space and the MacMsr driver)
  3. load the new MacMsr driver version
  4. run the new version of pcm-sensor-server with the PCM_PRINT_TOPOLOGY=1 variable

Does the new version crash? Please share the complete output from pcm-sensor-server with all warning and information messages.

MatteoBax commented 11 months ago

It crashes anyway:

=====  Processor information  =====
Hybrid processor         : no
IBRS and IBPB supported  : yes
STIBP supported          : yes
Spec arch caps supported : yes
Max CPUID level          : 22
CPU model number         : 142
Number of physical cores: 4
Number of logical cores: 8
Number of online logical cores: 8
Threads (logical cores) per physical core: 2
Num sockets: 1
Physical cores per socket: 4
Last level cache slices per socket: 4
Core PMU (perfmon) version: 4
Number of core PMU generic (programmable) counters: 4
Width of generic (programmable) counters: 48 bits
Number of core PMU fixed counters: 3
Width of fixed counters: 48 bits
Nominal core frequency: 2100000000 Hz
IBRS enabled in the kernel   : no
STIBP enabled in the kernel  : no
The processor is not susceptible to Rogue Data Cache Load: yes
The processor supports enhanced IBRS                     : yes

=====  Processor topology  =====
OS_Processor    Thread_Id       Core_Id         Module_Id       Tile_Id         Die_Id          Die_Group_Id    Package_Id      Core_Type       Native_CPU_Model
0               0               0               0               0               0               0               0               unknown         0               
1               1               0               0               0               0               0               0               unknown         0               
2               0               1               0               1               0               0               0               unknown         0               
3               1               1               0               1               0               0               0               unknown         0               
4               0               2               0               2               0               0               0               unknown         0               
5               1               2               0               2               0               0               0               unknown         0               
6               0               3               0               3               0               0               0               unknown         0               
7               1               3               0               3               0               0               0               unknown         0               
=====  Placement on packages  =====
Package Id.    Core Id.     Processors
0              0,1,2,3

=====  Core/Tile sharing  =====
Level      Processors
Core       (0,1)(2,3)(4,5)(6,7)
Tile / L2$ (0,1)(2,3)(4,5)(6,7)

Package thermal spec power: 15 Watt; Package minimum power: 0 Watt; Package maximum power: 0 Watt;

Info: 0 UBOX units detected.
Socket 0: 0 PCU units detected. 0 IIO units detected. 0 IRP units detected. 0 CHA/CBO units detected. 0 MDF units detected. 0 CXL units detected.

WARNING: Custom counter 0 is in use. MSR_PERF_GLOBAL_INUSE on core 0: 0x70000000f
WARNING: Custom counter 1 is in use. MSR_PERF_GLOBAL_INUSE on core 0: 0x70000000f
WARNING: Custom counter 2 is in use. MSR_PERF_GLOBAL_INUSE on core 0: 0x70000000f
WARNING: Custom counter 3 is in use. MSR_PERF_GLOBAL_INUSE on core 0: 0x70000000f
WARNING: Core 0 IA32_PERFEVTSEL0_ADDR is not zeroed 4399313
 Zeroed PMU registers
Starting plain HTTP server on http://localhost:9738/

opcm commented 11 months ago

Thank you for testing, @MatteoBax. I found an issue that should directly relate to the crash, and I pushed a fix into the https://github.com/intel/pcm/tree/opcm-patch-pcm-sensor-server-osx branch. Could you please download it again and test?

MatteoBax commented 11 months ago

It works! A thousand thanks @opcm.

The only problem I have is that when I run the /usr/local/sbin/pcm utility I get this:

dyld[3356]: Library not loaded: @rpath/libPcmMsr.dylib
  Referenced from: <E938B611-4170-3DD4-81F1-B72C7CEB5EF4> /usr/local/sbin/pcm
  Reason: no LC_RPATH's found

I've had this problem before. Should I open another issue?

ogbrugge commented 11 months ago

Woohoo!! Glad @opcm found the issue!

opcm commented 11 months ago

Thank you for testing.

Regarding the libPcmMsr.dylib loading error: do you remember how you resolved it last time? Does copying libPcmMsr.dylib as described in https://github.com/intel/pcm/blob/f632877f022113bb5380f09ea553350971ef180d/doc/MAC_HOWTO.txt#L31 help?

You might also want to try setting the DYLD_LIBRARY_PATH environment variable to point to the directory containing libPcmMsr.dylib: https://stackoverflow.com/questions/3146274/is-it-ok-to-use-dyld-library-path-on-mac-os-x-and-whats-the-dynamic-library-s

MatteoBax commented 11 months ago

If I run pcm from /usr/local/sbin the error is generated, while if I run it from pcm/build/bin the error is not generated. If I point the DYLD_LIBRARY_PATH environment variable to /usr/local/lib/, this problem does not occur. This is a problem with the pcm executable and not with pcm-sensor-server.
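
The workaround described above can be sketched as a small wrapper. The function name `with_pcm_lib` is hypothetical; /usr/local/lib is the directory named in this thread. As a caveat, macOS with SIP enabled may strip DYLD_* variables when a command is launched through a protected system binary, so this may not survive a pass through sudo on every setup:

```shell
# Hypothetical helper: run a command with /usr/local/lib prepended to the
# dynamic loader's search path, keeping any value that is already set.
with_pcm_lib() {
    env DYLD_LIBRARY_PATH="/usr/local/lib${DYLD_LIBRARY_PATH:+:$DYLD_LIBRARY_PATH}" "$@"
}

# Usage, in a shell that already has the privileges pcm needs:
#   with_pcm_lib /usr/local/sbin/pcm
```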

opcm commented 11 months ago

"point the DYLD_LIBRARY_PATH environment variable to /usr/local/lib/" looks like a solution. Thank you. Perhaps there should be a pull request documenting it in the MAC_HOWTO.

MatteoBax commented 11 months ago

Isn't it possible to specify the library search path at build time, instead of setting the DYLD_LIBRARY_PATH environment variable?

Regarding my previous statement, I stand corrected. All pcm executables fail to find the dynamic library when run from the /usr/local/sbin directory (i.e. the directory they are installed in).

opcm commented 11 months ago

I need to research whether and how the library search path can be specified at build time.

Good to know that all pcm executables installed in /usr/local/sbin are affected.

Please open a new issue.
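
The dyld message "no LC_RPATH's found" indicates that the installed binary carries no rpath entry for resolving @rpath/libPcmMsr.dylib. One possible direction, offered only as a sketch and not as the fix adopted by the project, is to inspect and add an rpath entry with Apple's otool and install_name_tool. The function names below are hypothetical, and the path and directory are the ones mentioned in this thread. At build time, CMake's CMAKE_INSTALL_RPATH variable serves a similar purpose; whether the pcm build honors it is not confirmed here.

```shell
# Hypothetical helpers for inspecting and adding an LC_RPATH entry on macOS.
# otool and install_name_tool ship with Apple's command-line developer tools.

# Succeed if the binary already has the given directory as an LC_RPATH entry.
has_rpath() {
    binary=$1
    dir=$2
    otool -l "$binary" |
        awk -v d="$dir" '$1 == "cmd" { in_rpath = ($2 == "LC_RPATH") }
                         in_rpath && $1 == "path" && $2 == d { found = 1 }
                         END { exit !found }'
}

# Add the directory as an rpath entry unless it is already present.
ensure_rpath() {
    has_rpath "$1" "$2" || install_name_tool -add_rpath "$2" "$1"
}

# Usage, in a shell with write access to the installed binary:
#   ensure_rpath /usr/local/sbin/pcm /usr/local/lib
```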