rr-debugger / rr

Record and Replay Framework
http://rr-project.org/

rr crashes while replaying #1497

Open tbsaunde opened 9 years ago

tbsaunde commented 9 years ago

STR: Check out and build rr at commit 6915ac0, then record Firefox startup; recording just until the first page is shown is more than enough. Run rr replay, set a breakpoint on do_main, and run continue in gdb. Observe that, after the JS pretty printers say they have been loaded, gdb prints "connection closed" and the rr process is now a zombie.

tbsaunde commented 9 years ago

So, I didn't look closely enough at dmesg. It turns out this is a bug in Linux, introduced sometime before 4.0.2 and not yet fixed in master. So feel free to close if you're not interested in trying to work around it, whatever exactly it is. The full dmesg output, for anyone interested, is:

[ 490.073615] ------------[ cut here ]------------
[ 490.073636] kernel BUG at mm/memory.c:3137!
[ 490.073648] invalid opcode: 0000 [#1] SMP
[ 490.073665] Modules linked in: ctr ccm bnep cpufreq_conservative cpufreq_powersave cpufreq_userspace cpufreq_stats binfmt_misc uinput nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc loop parport_pc ppdev lp parport x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi arc4 iwldvm coretemp kvm_intel kvm snd_hda_codec_hdmi mac80211 i915 snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_controller btusb btbcm snd_hda_codec btintel iwlwifi uvcvideo joydev bluetooth snd_hda_core snd_hwdep snd_pcm iTCO_wdt drm_kms_helper iTCO_vendor_support crct10dif_pclmul crc32_pclmul cfg80211 drm psmouse videobuf2_vmalloc thinkpad_acpi videobuf2_memops videobuf2_core ghash_clmulni_intel v4l2_common evdev snd_timer serio_raw nvram videodev aesni_intel aes_x86_64 lrw gf128mul glue_helper
[ 490.073980] snd media ablk_helper pcspkr cryptd tpm_tis battery wmi tpm shpchp ac soundcore rfkill i2c_i801 i2c_algo_bit i2c_core processor video lpc_ich mfd_core button mei_me mei ext4 crc16 mbcache jbd2 dm_mod sg sd_mod crc32c_intel ahci libahci e1000e libata ehci_pci xhci_pci ehci_hcd sdhci_pci xhci_hcd scsi_mod sdhci mmc_core ptp usbcore pps_core thermal usb_common thermal_sys [last unloaded: speakup]
[ 490.074168] CPU: 2 PID: 10737 Comm: rr Tainted: G C 4.1.0-rc5+ #1
[ 490.074185] Hardware name: LENOVO 232039U/232039U, BIOS G2ET95WW (2.55 ) 07/09/2013
[ 490.074202] task: ffff8804098d0290 ti: ffff8804098e4000 task.ti: ffff8804098e4000
[ 490.074219] RIP: 0010:[] [] handle_mm_fault+0x115f/0x1640
[ 490.074244] RSP: 0018:ffff8804098e7bc8 EFLAGS: 00010246
[ 490.074258] RAX: 0000000000000100 RBX: 0000000000000000 RCX: 0000000000000120
[ 490.074274] RDX: ffff8803f31cc7c0 RSI: 00003ffffffff000 RDI: 00000003f31cc067
[ 490.074290] RBP: 00002aaabccf8000 R08: 00000003d1ccc120 R09: 000000000001e6a0
[ 490.074306] R10: ffffffff8172ce12 R11: 0000000000000120 R12: ffff8803efbaa748
[ 490.074321] R13: 0000000000000000 R14: ffff88002e311f30 R15: ffff8800cabbb800
[ 490.074338] FS: 00007f983585a740(0000) GS:ffff88041e280000(0000) knlGS:0000000000000000
[ 490.074356] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 490.074369] CR2: 000000000357f1b0 CR3: 00000003f168a000 CR4: 00000000001407e0
[ 490.074385] Stack:
[ 490.074391] 0000000000000100 0000000000000040 ffff88041e5eeb00 ffffea0000000001
[ 490.074415] 0000000000000100 ffff8800000007c0 00000003d1ccc120 0000000000000000
[ 490.074438] ffff88041e5ee210 ffff88040000001f ffff88041e5eeb08 0000000000000000
[ 490.074462] Call Trace:
[ 490.074472] [] ? follow_page_pte+0x2b8/0x320
[ 490.074486] [] ? get_user_pages+0x174/0x600
[ 490.074504] [] ? radix_tree_lookup_slot+0x10/0x30
[ 490.074519] [] ? __access_remote_vm+0xde/0x2e0
[ 490.074536] [] ? mem_rw.isra.14+0xad/0x180
[ 490.074552] [] ? set_next_entity+0x6a/0x480
[ 490.074568] [] ? vfs_write+0x23/0xf0
[ 490.074582] [] ? __sb_start_write+0x45/0x100
[ 490.074598] [] ? security_file_permission+0x21/0xa0
[ 490.074614] [] ? vfs_write+0xa4/0x1b0
[ 490.074628] [] ? SyS_pwrite64+0x6b/0xa0
[ 490.074643] [] ? system_call_fastpath+0x16/0x75
[ 490.074658] Code: 24 90 00 00 00 48 89 8c 24 98 00 00 00 48 8d 74 24 68 48 89 54 24 70 89 44 24 68 49 8b 84 24 90 00 00 00 ff 50 18 e9 bb fa ff ff <0f> 0b 8d 50 e2 83 fa 01 0f 86 4b 03 00 00 83 f8 1d c7 44 24 30
[ 490.074859] RIP [] handle_mm_fault+0x115f/0x1640
[ 490.074876] RSP
[ 490.074899] ---[ end trace c03716d799219ad5 ]---
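
For anyone skimming a dump like this, the two fields that matter most are the "kernel BUG at" location and the task named after "Comm:". A rough Python sketch for pulling them out of saved dmesg text follows. The helper name summarize_oops and both regexes are hypothetical, written only against the line shapes seen above. They are not part of rr or of any kernel tooling.

```python
import re

# Hypothetical patterns, based only on the two line shapes seen in the dump above.
BUG_RE = re.compile(r"kernel BUG at (?P<file>\S+):(?P<line>\d+)!")
COMM_RE = re.compile(r"PID: (?P<pid>\d+) Comm: (?P<comm>\S+)")


def summarize_oops(dmesg_text):
    """Return the BUG location and the task that hit it, or None if no BUG line exists."""
    bug = BUG_RE.search(dmesg_text)
    if bug is None:
        return None
    # Only look for the task after the BUG line, so unrelated earlier entries are ignored.
    comm = COMM_RE.search(dmesg_text, bug.end())
    return {
        "file": bug.group("file"),
        "line": int(bug.group("line")),
        "pid": int(comm.group("pid")) if comm else None,
        "comm": comm.group("comm") if comm else None,
    }


# Two entries copied from the dump above.
sample = (
    "[ 490.073636] kernel BUG at mm/memory.c:3137! "
    "[ 490.074168] CPU: 2 PID: 10737 Comm: rr Tainted: G C 4.1.0-rc5+ #1"
)
print(summarize_oops(sample))
```

On the sample this reports mm/memory.c line 3137 hit by the rr process, which is the hint that the zombie rr comes from the kernel rather than from rr itself.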

tbsaunde commented 9 years ago

FWIW, this is kernel bug 99101.

rocallahan commented 9 years ago

Zounds!

If you run the rr test suite, does the bug occur? If so, that'd give you a smaller testcase for the kernel folks.