intel / numatop

NumaTOP is an observation tool for runtime memory locality characterization and analysis of processes and threads running on a NUMA system.
BSD 3-Clause "New" or "Revised" License

numatop can't start with AWS virtual machine (c3.8xlarge, E5-2680 v2) #29

Closed xly98 closed 1 month ago

xly98 commented 7 years ago

Hi,

I installed numatop on my local server and on an AWS virtual machine (Xen). On my local server it runs perfectly well. However, numatop fails to start on the AWS virtual machine. It shows: "Fail to setup perf (probably permission denied)!"

My local server uses 3.10.0-514.6.1.el7.x86_64 #1 SMP Wed Jan 18 13:06:36 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux (E5-2680 v2 @ 2.80GHz).

The AWS virtual machine (type: c3.8xlarge) uses 3.10.0-514.el7.x86_64 #1 SMP Wed Oct 19 11:24:13 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux (E5-2680 v2 @ 2.80GHz). I believe that AWS has correct NUMA/CPU virtualization.

The log info is listed as follows:

$ sudo ./numatop -l 2 -f log.txt
$ cat log.txt
0: calibrate_cpuinfo: nsofclk = 0.3571, clkofsec = 2800000000
0: Detected 32 online CPUs
0: pf_profiling_setup: pf_event_open is failed for CPU0, COUNT0
0: os_profiling_start failed
0: perf thread is exiting.
0: perf_init: perf_profiling_start() failed
0: perf_init() is failed

Could you help me out please?

Thank you! Liyang

xly98 commented 7 years ago

It seems that numatop needs "hardware events" support.

Liyang

yaoj commented 7 years ago

Hi Liyang,

I don’t think numatop can work with kernel 3.10. Even if it starts up successfully, the reported RMA/LMA would not be correct, because numatop needs better kernel support for offcore counters.

I highly recommend using a newer kernel. For example, could you try 4.10?

Thanks Jin Yao
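
The kernel check suggested above can be sketched in Python. This is only a rough illustration: the `SUGGESTED` threshold of 4.10 is the version mentioned in this thread, not a documented minimum, and the helper names are made up for the example. Distribution kernels may backport features, so a version comparison is a first hint rather than a definitive answer.

```python
import re
from typing import Tuple

# Assumption: 4.10 is the version suggested in this thread,
# not an officially documented minimum for numatop.
SUGGESTED = (4, 10)


def kernel_version(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a `uname -r` style release string."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def meets_suggestion(release: str) -> bool:
    """Return True if the release is at least the suggested version."""
    return kernel_version(release) >= SUGGESTED
```

On a live system the release string could come from `platform.release()`; the kernels reported in this thread (3.10.0-514.el7.x86_64 and 4.4.63-1.el7.elrepo.x86_64) both fall below the suggested version.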

yaoj commented 7 years ago

Anyway, please try a newer kernel. The 3.10 kernel is too old and doesn’t contain the necessary performance counter support.

Thanks Jin Yao

xly98 commented 7 years ago

Hi Jin,

Thank you so much for your reply! Is it possible to get RMA/LMA from numatop without "hardware events" support? It seems that numastat can report "local node" and "other node" in a VM.

Thank you, Liyang

yaoj commented 7 years ago

Hi Liyang,

Numastat doesn’t use the hardware performance counters, while numatop uses them to get the NUMA locality information. So numatop can provide more precise data.

But the drawback of numatop is that it depends heavily on kernel perf support. For example, the offcore counters (numatop uses offcore counters to get RMA/LMA) are not supported in 3.x kernels. And for new platforms, support for the counters has to be added to Linux perf continuously. That’s why I suggest using the latest upstream kernel to avoid many troubles.

Thanks Jin Yao
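
To illustrate the numastat side of this comparison, here is a minimal Python sketch that parses the per-node counters the kernel exposes in `/sys/devices/system/node/node<N>/numastat`. The function names and the sample values are made up for illustration. These counters track page allocations rather than individual memory accesses, which is why they remain available in a VM but are coarser than counter-based RMA/LMA.

```python
from typing import Dict, Optional

# Illustrative sample only: the field names match the kernel's per-node
# numastat file, the numbers are invented for this example.
SAMPLE = """numa_hit 1000
numa_miss 10
numa_foreign 10
interleave_hit 5
local_node 900
other_node 100
"""


def parse_node_numastat(text: str) -> Dict[str, int]:
    """Parse 'name value' lines into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def local_fraction(counters: Dict[str, int]) -> Optional[float]:
    """Fraction of allocations served from the local node, or None if empty."""
    local = counters.get("local_node", 0)
    other = counters.get("other_node", 0)
    total = local + other
    return local / total if total else None
```

On a real system the text would be read from the node's `numastat` file instead of `SAMPLE`.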

xly98 commented 7 years ago

Hi Jin,

I am looking at the RMA/LMA of an Amazon cloud virtual machine. The virtual machine runs on Xen. I have to run numatop inside the virtual machine, and numatop still fails to start (the kernel is now 4.4.63-1.el7.elrepo.x86_64).

Thank you, Liyang

yaoj commented 7 years ago

Could you check which platform the Amazon cloud VM runs on?

It’s likely that the required counters are not supported in a VM environment with this kernel.

Here is an example of how to check whether the platform supports the required counters in a VM environment.

Suppose the platform is skylake:

Measure LMA on SKL

perf stat -e cpu/config=0x5301b7,config1=0x1f84000001/ -a -I1000

time counts unit events

 1.000762735             13,493      cpu/config=0x5301b7,config1=0x1f84000001/
 2.001422204              5,654      cpu/config=0x5301b7,config1=0x1f84000001/
 ……

Measure RMA on SKL

perf stat -e cpu/config=0x5301b7,config1=0x638000001/ -a -I1000

time counts unit events

 1.000768839                  0      cpu/config=0x5301b7,config1=0x638000001/
 2.001431967                  0      cpu/config=0x5301b7,config1=0x638000001/
 ……
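
The interval output above can be turned into numbers with a small parser. This is a sketch in Python under the assumption that each data line has the shape shown in this thread (timestamp, count, event); the function name is hypothetical. Lines where perf prints a placeholder such as `<not supported>` instead of a number are reported with a count of `None`.

```python
from typing import Optional, Tuple


def parse_interval_line(line: str) -> Optional[Tuple[float, Optional[int], str]]:
    """Parse one `perf stat -I` data line into (time, count, event).

    Returns None for headers, separators, and blank lines.
    """
    fields = line.split()
    if len(fields) < 3:
        return None
    try:
        timestamp = float(fields[0])
    except ValueError:
        return None  # header line, e.g. "time counts unit events"
    raw_count = " ".join(fields[1:-1])
    try:
        count = int(raw_count.replace(",", ""))
    except ValueError:
        count = None  # placeholder such as "<not supported>"
    return timestamp, count, fields[-1]
```

Applied to the first LMA line above, this yields a count of 13493 for the event `cpu/config=0x5301b7,config1=0x1f84000001/`.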

For other platform event bits, please check the definitions in the numatop source files under numatop/intel: bdw.c, nhm.c, skl.c, snb.c, wsm.c.

Thanks Jin Yao
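
For readers wondering what the raw `config` values in these commands encode, the following Python sketch splits one into its fields. It assumes the standard Intel performance event select register layout (event select in bits 0-7, unit mask in bits 8-15, then flag bits); the function name is hypothetical, and `config1` (the offcore response value) is platform-specific and deliberately not decoded here.

```python
from typing import Dict, Union


def decode_config(config: int) -> Dict[str, Union[int, bool]]:
    """Split a raw event select value into its main fields.

    Assumes the Intel event select layout; not taken from numatop itself.
    """
    return {
        "event": config & 0xFF,          # event select code
        "umask": (config >> 8) & 0xFF,   # unit mask
        "usr": bool(config & (1 << 16)),     # count in user mode
        "os": bool(config & (1 << 17)),      # count in kernel mode
        "edge": bool(config & (1 << 18)),    # edge detect
        "int": bool(config & (1 << 20)),     # interrupt on overflow
        "enable": bool(config & (1 << 22)),  # counter enable
    }
```

For example, `0x5301b7` decodes to event `0xb7` with umask `0x01`, with the user, kernel, interrupt, and enable bits set.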

yaoj commented 7 years ago

E5-2680 v2 is IVB-EP, so you could use the command lines below:

LMA

perf stat -e cpu/config=0x5301bb,config1=0x600400001/ -a -I1000

RMA

perf stat -e cpu/config=0x5301b7,config1=0x67f800001/ -a -I1000

If this doesn’t work, it means the counter is not supported in the VM.
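
The check described here can be scripted. Below is a minimal Python sketch, with hypothetical helper names, that builds the raw event string and looks for the rejection message perf prints when it cannot open the event. It assumes `perf` is on the PATH and that the script runs with enough privilege for system-wide counting. It only recognizes the "invalid or unsupported event" message, so a `True` result means that message was absent, not that the counts are meaningful.

```python
import subprocess
from typing import Optional


def raw_event(config: int, config1: int) -> str:
    """Format a raw perf event string such as cpu/config=...,config1=.../."""
    return f"cpu/config={config:#x},config1={config1:#x}/"


def looks_unsupported(stderr_text: str) -> bool:
    """Detect the rejection message perf prints for an unusable event."""
    return "invalid or unsupported event" in stderr_text


def probe(event: str) -> Optional[bool]:
    """Run `perf stat` briefly for one event.

    Returns None if perf is not installed, False if perf rejected the
    event, True if the rejection message did not appear.
    """
    cmd = ["perf", "stat", "-e", event, "-a", "sleep", "1"]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True)
    except FileNotFoundError:
        return None
    return not looks_unsupported(result.stderr)
```

For the IVB-EP values above, `probe(raw_event(0x5301bb, 0x600400001))` would test the LMA event and `probe(raw_event(0x5301b7, 0x67f800001))` the RMA event.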

xly98 commented 7 years ago

Hi Jin,

Thank you so much for your reply. It seems the AWS VM (c3.8xlarge, E5-2680 v2) doesn't support the counter. I really appreciate your help anyway.

Liyang

$ sudo perf stat -e cpu/config=0x5301b7,config1=0x67f800001/ -a -I1000
invalid or unsupported event: 'cpu/config=0x5301b7,config1=0x67f800001/'

ak-intel commented 1 month ago

You would need to contact AWS to enable perf counters.