Open rocurley opened 3 years ago
Is there output related to the "PMU" in dmesg startup on your system?
An i7-930 is really old. rr theoretically supports Nehalem-architecture CPUs, but it's possible that kernel support for that hardware has regressed.
Here's what I've got for pmu:
$ dmesg | grep -i pmu
[ 0.275222] Performance Events: PEBS fmt1+, Nehalem events, 16-deep LBR, Intel PMU driver.
[ 0.275703] NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
[ 55.256868] PAX: PMU arbitration service v1.0.2 has been started.
[ 56.379721] socperf3_0: SocPerf Driver: detected 8 CPUs in lwpmudrv_Load
[ 56.379775] socperf3_0: PMU check enabled! F6.M1a.S5 index=-1
[ 57.386075] sep5_16: [load] [lwpmu_Load@6327]: Major number is 510
[ 57.386078] sep5_16: [load] [lwpmu_Load@6334]: Detected 8 total CPUs and 8 active CPUs.
[ 57.388719] sep5_16: [load] [lwpmu_Load@6596]: PMU collection driver v5.16.4 has been loaded.
[ 57.388722] sep5_16: [load] [lwpmu_Load@6606]: NMI will be used for handling PMU interrupts.
[ 57.388726] sep5_16: [load] [PMU_LIST_Initialize@603]: PMU check enabled! F6.M1a.S5 index=-1 drv_type=PUBLIC
[ 57.388727] sep5_16: [load] [PMU_LIST_Build_MSR_List@621]: No MSR list information detected!
[ 57.388729] sep5_16: [load] [PMU_LIST_Build_PCI_List@650]: No PCI list information detected!
[ 57.388731] sep5_16: [load] [PMU_LIST_Build_MMIO_List@687]: No MMIO list information detected!
[ 58.416251] vtsspp: PMU: fixed counters: 3, general counters: 4
Hi,
Got the same issue while trying to run rr on qemu.
>rr record ./qemu-system-x86_64 <qemu args>
[FATAL /builddir/build/BUILD/rr-5.4.0/src/PerfCounters.cc:232:check_for_ioc_period_bug() errno: EINVAL] ioctl(PERF_EVENT_IOC_PERIOD) failed
=== Start rr backtrace:
rr(_ZN2rr13dump_rr_stackEv+0x5a)[0x556e0f1bfe2a]
rr(_ZN2rr15notifying_abortEv+0x4f)[0x556e0f1bfebf]
rr(+0x1e9549)[0x556e0f213549]
rr(+0xb9733)[0x556e0f0e3733]
rr(_ZN2rr12PerfCounters23default_ticks_semanticsEv+0x1e)[0x556e0f0e479e]
rr(_ZN2rr7SessionC1Ev+0x17a)[0x556e0f17c3da]
rr(_ZN2rr13RecordSessionC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKSt6vectorIS6_SaIS6_EESD_RKNS_20DisableCPUIDFeaturesENS0_16SyscallBufferingEiNS_7BindCPUES8_PKNS_9TraceUuidEbb+0x65)[0x556e0f0f9f55]
rr(_ZN2rr13RecordSession6createERKSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS7_EESB_RKNS_20DisableCPUIDFeaturesENS0_16SyscallBufferingEhNS_7BindCPUERKS7_PKNS_9TraceUuidEbbb+0x6ac)[0x556e0f0f776c]
rr(_ZN2rr13RecordCommand3runERSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS7_EE+0x938)[0x556e0f0eaf78]
rr(main+0x138)[0x556e0f065b88]
/lib64/libc.so.6(__libc_start_main+0xd5)[0x7fe42ae0fb75]
rr(_start+0x2e)[0x556e0f06877e]
=== End rr backtrace
[1] 61629 IOT instruction (core dumped) rr record ./qemu-system-x86_64 [...]
>dmesg | grep -i pmu
[ 0.350049] Performance Events: PEBS fmt1+, Nehalem events, 16-deep LBR, Intel PMU driver.
[ 0.351421] NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
Realistically, the only way this is going to get fixed is if someone with this hardware figures out what's going on in the kernel here. There are a number of branches in _perf_event_period that could result in returning EINVAL; knowing which one is being taken is the first step. https://elixir.bootlin.com/linux/latest/source/kernel/events/core.c#L5448
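For anyone with the affected hardware, a standalone reproducer may make it easier to see which period values trigger the EINVAL without rr in the loop. The following is only a rough sketch under several assumptions: it targets x86_64 Linux (syscall number 298), it counts the generic branch-instructions hardware event rather than whatever event rr actually programs, and the period values it sweeps are arbitrary choices, not the ones rr uses. The helper names (`iow`, `pack_attr`, `try_period`) are made up for this example.

```python
import ctypes
import errno
import fcntl
import os
import platform
import struct

SYS_PERF_EVENT_OPEN = 298               # x86_64 only (assumption of this sketch)
PERF_TYPE_HARDWARE = 0
PERF_COUNT_HW_BRANCH_INSTRUCTIONS = 4   # generic event, not necessarily rr's
PERF_ATTR_SIZE_VER0 = 64                # size of the original perf_event_attr
EXCLUDE_KERNEL = 1 << 5                 # bit 5 of the attr flag word


def iow(type_char, nr, size):
    """Build a Linux _IOW ioctl request number (write direction)."""
    return (1 << 30) | (size << 16) | (ord(type_char) << 8) | nr


# _IOW('$', 4, __u64)
PERF_EVENT_IOC_PERIOD = iow("$", 4, 8)


def pack_attr(sample_period):
    """Pack a minimal 64-byte perf_event_attr for a sampling event."""
    # Fields: type, size, config, sample_period, sample_type, read_format,
    # flag bits, wakeup_events, bp_type, config1
    return struct.pack(
        "=IIQQQQQIIQ",
        PERF_TYPE_HARDWARE, PERF_ATTR_SIZE_VER0,
        PERF_COUNT_HW_BRANCH_INSTRUCTIONS, sample_period,
        0, 0, EXCLUDE_KERNEL, 0, 0, 0,
    )


def try_period(initial_period, new_period):
    """Open a counter on this process, then try to change its period.

    Returns "ok", an errno name such as "EINVAL", or an "open failed" note.
    """
    if platform.machine() != "x86_64":
        return "unsupported arch"
    libc = ctypes.CDLL(None, use_errno=True)
    attr = (ctypes.c_char * PERF_ATTR_SIZE_VER0).from_buffer_copy(
        pack_attr(initial_period))
    # perf_event_open(attr, pid=0 (self), cpu=-1 (any), group_fd=-1, flags=0)
    fd = libc.syscall(ctypes.c_long(SYS_PERF_EVENT_OPEN), attr,
                      ctypes.c_int(0), ctypes.c_int(-1),
                      ctypes.c_int(-1), ctypes.c_ulong(0))
    if fd < 0:
        return "open failed: " + errno.errorcode.get(ctypes.get_errno(), "?")
    try:
        fcntl.ioctl(fd, PERF_EVENT_IOC_PERIOD, struct.pack("=Q", new_period))
        return "ok"
    except OSError as e:
        return errno.errorcode.get(e.errno, str(e.errno))
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Arbitrary sweep: small and large periods, to see where EINVAL starts.
    for period in (1, 2, 16, 31, 32, 64, 1000, 1 << 20):
        print(period, try_period(1 << 32, period))
```

If some periods succeed and others fail, that should at least point toward a value-dependent check rather than the event not being a sampling event at all. If the open itself fails, check /proc/sys/kernel/perf_event_paranoid first.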
Example backtrace:
I've tried this with the actual program I'm trying to debug, ls, and make.
This happens both on current master (https://github.com/rr-debugger/rr/commit/3f5262f90e63a8ba4d5ed4156b806495830aae2f) and on the version from my package manager (5.3.0-2). I'm using Ubuntu 20.04, and the CPU is an i7-930. Happy to provide more details, but I'm not sure exactly what information would be helpful here.