rasa / vmware-tools-patches

Patch and build VMware tools automatically
https://github.com/rasa/vmware-tools-patches/wiki
MIT License

vmhgfs kernel oops in page_cache_async_readahead for large files #43

Open rrva opened 9 years ago

rrva commented 9 years ago

When reading large files over hgfs mounts, I often get a kernel oops in page_cache_async_readahead.

First time:

[   75.481918] BUG: unable to handle kernel paging request at 00000002317cc000
[   75.481949] IP: [<00000002317cc000>] 0x2317cc000
[   75.481966] PGD b6660067 PUD 0 
[   75.481978] Oops: 0010 [#1] PREEMPT SMP 
[   75.481992] Modules linked in: veth xt_addrtype xt_conntrack mousedev cfg80211 zram rfkill bridge stp llc lz4_compress coretemp iptable_filter ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack hwmon ip_tables x_tables kvm_intel kvm evdev ppdev vmwgfx mac_hid crct10dif_pclmul crc32_pclmul e1000 ghash_clmulni_intel ttm pcspkr aesni_intel aes_x86_64 lrw gf128mul psmouse drm_kms_helper glue_helper ablk_helper cryptd drm serio_raw intel_agp i2c_core vmw_vmci intel_gtt shpchp battery acpi_cpufreq processor parport_pc parport ac button sch_fq_codel vmxnet3 vmw_balloon vmhgfs(O) btrfs xor raid6_pq sr_mod cdrom ata_generic sd_mod pata_acpi atkbd libps2 crc32c_intel ata_piix mptspi libata scsi_transport_spi mptscsih scsi_mod mptbase floppy i8042
[   75.482272]  serio
[   75.482277] CPU: 3 PID: 2575 Comm: lzop Tainted: G           O    4.0.1-1-ARCH #1
[   75.482298] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/20/2014
[   75.482326] task: ffff88022c5532a0 ti: ffff8800b6668000 task.ti: ffff8800b6668000
[   75.482346] RIP: 0010:[<00000002317cc000>]  [<00000002317cc000>] 0x2317cc000
[   75.482368] RSP: 0018:ffff8800b666bd00  EFLAGS: 00010206
[   75.482382] RAX: ffff880230fe4828 RBX: ffff8800b6a687a0 RCX: ffffea0008be5700
[   75.482401] RDX: 00000002317cc000 RSI: 0000000000000002 RDI: 0000000000001000
[   75.482420] RBP: ffff8800b666bd38 R08: 0000000000000001 R09: 000000000000002b
[   75.482439] R10: ffff8800b666bca8 R11: 3b5ffc5000203436 R12: ffff88022b30c150
[   75.482457] R13: 0000000000000001 R14: 000000000000002b R15: ffff8800b6a68700
[   75.482476] FS:  00007f0690cfe700(0000) GS:ffff88023fd80000(0000) knlGS:0000000000000000
[   75.482498] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   75.482513] CR2: 00000002317cc000 CR3: 00000000b663b000 CR4: 00000000001407e0
[   75.482564] Stack:
[   75.482571]  ffffffff81168e69 ffff8800b6a68700 ffff88022b30c150 ffff8800b6a68700
[   75.482595]  ffff8800b666be40 0000000000000001 ffffea0008be5700 ffff8800b666bdf8
[   75.482618]  ffffffff8115d08f 0000000000000001 000000000000002c 00000000359ca300
[   75.482642] Call Trace:
[   75.482653]  [<ffffffff81168e69>] ? page_cache_async_readahead+0x59/0xa0
[   75.482672]  [<ffffffff8115d08f>] generic_file_read_iter+0x34f/0x5f0
[   75.483182]  [<ffffffffa012e220>] HgfsFileRead+0x30/0x40 [vmhgfs]
[   75.483727]  [<ffffffff811d1b8b>] new_sync_read+0x8b/0xd0
[   75.484277]  [<ffffffff811d2dd8>] __vfs_read+0x18/0x50
[   75.484807]  [<ffffffff811d2e9a>] vfs_read+0x8a/0x140
[   75.485324]  [<ffffffff811d2fa9>] SyS_read+0x59/0xd0
[   75.485838]  [<ffffffff8156d949>] system_call_fastpath+0x12/0x17
[   75.486355] Code:  Bad RIP value.
[   75.486860] RIP  [<00000002317cc000>] 0x2317cc000
[   75.487355]  RSP <ffff8800b666bd00>
[   75.487841] CR2: 00000002317cc000
[   75.488407] ---[ end trace cfe4f533ad6ff804 ]---

Second time:

[   75.510389] BUG: unable to handle kernel paging request at 00000002317cc000
[   75.510844] IP: [<00000002317cc000>] 0x2317cc000
[   75.511324] PGD 22ac4c067 PUD 0 
[   75.511730] Oops: 0010 [#2] PREEMPT SMP 
[   75.512126] Modules linked in: veth xt_addrtype xt_conntrack mousedev cfg80211 zram rfkill bridge stp llc lz4_compress coretemp iptable_filter ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack hwmon ip_tables x_tables kvm_intel kvm evdev ppdev vmwgfx mac_hid crct10dif_pclmul crc32_pclmul e1000 ghash_clmulni_intel ttm pcspkr aesni_intel aes_x86_64 lrw gf128mul psmouse drm_kms_helper glue_helper ablk_helper cryptd drm serio_raw intel_agp i2c_core vmw_vmci intel_gtt shpchp battery acpi_cpufreq processor parport_pc parport ac button sch_fq_codel vmxnet3 vmw_balloon vmhgfs(O) btrfs xor raid6_pq sr_mod cdrom ata_generic sd_mod pata_acpi atkbd libps2 crc32c_intel ata_piix mptspi libata scsi_transport_spi mptscsih scsi_mod mptbase floppy i8042
[   75.515620]  serio
[   75.516084] CPU: 3 PID: 2580 Comm: cp Tainted: G      D    O    4.0.1-1-ARCH #1
[   75.516569] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/20/2014
[   75.517548] task: ffff8800b698a880 ti: ffff88022ac50000 task.ti: ffff88022ac50000
[   75.518061] RIP: 0010:[<00000002317cc000>]  [<00000002317cc000>] 0x2317cc000
[   75.518547] RSP: 0018:ffff88022ac53d00  EFLAGS: 00010206
[   75.519014] RAX: ffff880230fe4828 RBX: ffff8800b65b35a0 RCX: ffffea0008b06380
[   75.519485] RDX: 00000002317cc000 RSI: 0000000000000002 RDI: 0000000000001000
[   75.520048] RBP: ffff88022ac53d38 R08: 0000000000000008 R09: 0000000000000008
[   75.520553] R10: ffff88022ac53ca8 R11: 0000000000000246 R12: ffff88022b30da50
[   75.521015] R13: 0000000000000008 R14: 0000000000000008 R15: ffff8800b65b3500
[   75.521472] FS:  00007f4b8ec6c7a0(0000) GS:ffff88023fd80000(0000) knlGS:0000000000000000
[   75.521938] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   75.522402] CR2: 00000002317cc000 CR3: 000000022ac40000 CR4: 00000000001407e0
[   75.522971] Stack:
[   75.523441]  ffffffff81168e69 ffff8800b65b3500 ffff88022b30da50 ffff8800b65b3500
[   75.523967]  ffff88022ac53e40 0000000000000008 ffffea0008b06380 ffff88022ac53df8
[   75.524509]  ffffffff8115d08f ffff8800b65b3900 0000000000000010 000000002ac53e68
[   75.525127] Call Trace:
[   75.525790]  [<ffffffff81168e69>] ? page_cache_async_readahead+0x59/0xa0
[   75.526329]  [<ffffffff8115d08f>] generic_file_read_iter+0x34f/0x5f0
[   75.526889]  [<ffffffffa012cbc9>] ? HgfsDentryAgeReset+0x49/0x50 [vmhgfs]
[   75.527482]  [<ffffffffa012e220>] HgfsFileRead+0x30/0x40 [vmhgfs]
[   75.528072]  [<ffffffff811d1b8b>] new_sync_read+0x8b/0xd0
[   75.528647]  [<ffffffff811d2dd8>] __vfs_read+0x18/0x50
[   75.529221]  [<ffffffff811d2e9a>] vfs_read+0x8a/0x140
[   75.529743]  [<ffffffff811d2fa9>] SyS_read+0x59/0xd0
[   75.530254]  [<ffffffff8156d949>] system_call_fastpath+0x12/0x17
[   75.530755] Code:  Bad RIP value.
[   75.531267] RIP  [<00000002317cc000>] 0x2317cc000
[   75.531755]  RSP <ffff88022ac53d00>
[   75.532216] CR2: 00000002317cc000
[   75.534476] ---[ end trace cfe4f533ad6ff805 ]---

rrva commented 9 years ago

Linux version 4.0.1-1-ARCH (vagrant@localhost) (gcc version 4.9.2 20150304 (prerelease) (GCC)

JustArchi commented 9 years ago

Confirmed on latest Debian sid.

[ 6685.222705] general protection fault: 0000 [#2] SMP
[ 6685.222876] Modules linked in: vmw_vsock_vmci_transport vsock vmhgfs(O) iosf_mbi coretemp crct10dif_pclmul crc32_pclmul crc32c_intel snd_ens1371 snd_rawmidi snd_seq_device snd_ac97_codec ghash_clmulni_intel snd_pcm aesni_intel snd_timer snd aes_x86_64 lrw gf128mul hid_generic usbhid soundcore hid psmouse ppdev serio_raw glue_helper ac97_bus ablk_helper cryptd gameport e1000 sr_mod cdrom pcspkr vmw_balloon evdev uhci_hcd ehci_pci ehci_hcd sg usbcore usb_common parport_pc floppy 8250_fintek ata_generic battery parport acpi_cpufreq i2c_piix4 shpchp processor vmwgfx ttm drm_kms_helper drm ata_piix libata thermal_sys i2c_core vmw_vmci ac button fuse autofs4 ext4 crc16 mbcache jbd2 vmw_pvscsi vmxnet3 sd_mod mptspi scsi_transport_spi mptscsih scsi_mod mptbase
[ 6685.223628] CPU: 2 PID: 69675 Comm: unzip Tainted: G      D W  O    4.0.0-1-amd64 #1 Debian 4.0.2-1
[ 6685.223745] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/20/2014
[ 6685.223861] task: ffff8800ba9fe3d0 ti: ffff8800684d0000 task.ti: ffff8800684d0000
[ 6685.223961] RIP: 0010:[<ffffffff811602ee>]  [<ffffffff811602ee>] page_cache_async_readahead+0x4e/0xa0
[ 6685.224075] RSP: 0018:ffff8800684d3d68  EFLAGS: 00010202
[ 6685.224132] RAX: ffff8800ba8bec28 RBX: ffff88003783dfa0 RCX: ffffea0002d2ab40
[ 6685.224206] RDX: 41444f4d00317570 RSI: 0000000000000002 RDI: 7570633d5341494c
[ 6685.224274] RBP: 000000000000f60c R08: 000000000000f60c R09: 0000000000000002
[ 6685.224341] R10: ffff8800684d3d28 R11: 0000000000000246 R12: 0000000000000002
[ 6685.224410] R13: ffff88003783df00 R14: ffff8801717d5e98 R15: ffffea0002d2ab40
[ 6685.224479] FS:  00007f1c98cd9700(0000) GS:ffff880174640000(0000) knlGS:0000000000000000
[ 6685.224583] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 6685.224642] CR2: 0000000000e9d008 CR3: 00000000ba8e5000 CR4: 00000000000407e0
[ 6685.224738] Stack:
[ 6685.224782]  ffff8801717d5e98 ffff88003783df00 000000000000f60c ffff8800684d3e68
[ 6685.224887]  0000000000000002 ffffffff81154a79 ffffffff81adcac0 0000000000000000
[ 6685.224992]  ffff8801717d5d40 ffff8800684d3e90 ffff88003783dfa0 000000000000f60b
[ 6685.225096] Call Trace:
[ 6685.225141]  [<ffffffff81154a79>] ? generic_file_read_iter+0x359/0x5e0
[ 6685.225215]  [<ffffffff8118a991>] ? page_add_new_anon_rmap+0x71/0xa0
[ 6685.225279]  [<ffffffff81180963>] ? handle_mm_fault+0xd33/0x1640
[ 6685.225344]  [<ffffffff811c13d1>] ? new_sync_read+0x71/0xa0
[ 6685.225404]  [<ffffffff811c25c1>] ? vfs_read+0x81/0x130
[ 6685.225469]  [<ffffffff811c26b2>] ? SyS_read+0x42/0xb0
[ 6685.225527]  [<ffffffff8156418d>] ? system_call_fast_compare_end+0xc/0x11
[ 6685.225591] Code: 54 4d 89 cc 55 4c 89 c5 53 48 89 f3 f0 80 61 02 fb 48 8b 3f e8 f4 b2 08 00 48 8b 50 28 48 85 d2 74 1b 48 8b 78 30 be 02 00 00 00 <ff> d2 85 c0 74 1c 5b 5d 41 5c 41 5d 41 5e c3 0f 1f 00 48 8b 40
[ 6685.225834] RIP  [<ffffffff811602ee>] page_cache_async_readahead+0x4e/0xa0
[ 6685.225902]  RSP <ffff8800684d3d68>
[ 6685.226225] ---[ end trace 9f0dfc859688b8a6 ]---

Kernel package version: 4.0.2-1
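
One way to sanity-check the Debian trace above: the faulting instruction in its `Code:` line is the `<ff> d2` indirect call through RDX, and neither RDX nor RDI looks like a kernel address. The minimal Python sketch below reinterprets the two register values as little-endian bytes to see whether they are printable text. The helper name `reg_to_ascii` is made up for illustration.

```python
def reg_to_ascii(hex_value: str) -> str:
    """Render a 64-bit register value as the 8 bytes it occupies in
    little-endian memory, replacing non-printable bytes with '.'."""
    raw = int(hex_value, 16).to_bytes(8, "little")
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


# Register values copied from the general protection fault trace.
rdx = reg_to_ascii("41444f4d00317570")  # target of the indirect call
rdi = reg_to_ascii("7570633d5341494c")  # first argument to that call

print(rdx)  # pu1.MODA
print(rdi)  # LIAS=cpu
```

Both values decode to printable ASCII, which suggests the pointer was loaded from memory holding leftover string data rather than a valid function address. This is an interpretation of the trace, not a confirmed diagnosis.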

rrva commented 9 years ago

@sl4ever: any ideas?

sl4ever commented 9 years ago

It's my fault. I assumed bdi_setup_and_register() would initialize all the fields, but it does not.
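
To illustrate the kind of failure described here, the following is a rough userspace model in Python using `ctypes`. It is not the vmhgfs or kernel code. `FakeBdi` and `would_call_callback` are hypothetical names, and the `congested_fn` / `congested_data` fields are assumed to mirror the callback fields of the 4.0-era `struct backing_dev_info`. The sketch shows why a structure that is not zeroed before use can pass a NULL check on a callback pointer, which would lead to a call through a garbage address.

```python
import ctypes


class FakeBdi(ctypes.Structure):
    """Hypothetical stand-in for a structure holding an optional callback."""
    _fields_ = [
        ("ra_pages", ctypes.c_ulong),
        ("congested_fn", ctypes.c_void_p),    # assumed field name
        ("congested_data", ctypes.c_void_p),  # assumed field name
    ]


def would_call_callback(bdi: FakeBdi) -> bool:
    """Model of a C check like `if (bdi->congested_fn) bdi->congested_fn(...)`."""
    return bool(bdi.congested_fn)


size = ctypes.sizeof(FakeBdi)

# Memory that was never cleared: still holds bytes from a previous user.
uninitialized = FakeBdi.from_buffer_copy(b"\x41" * size)

# The same structure built from zeroed memory (like memset or kzalloc in C).
zeroed = FakeBdi.from_buffer_copy(bytes(size))

print(would_call_callback(uninitialized))  # True: a garbage pointer would be called
print(would_call_callback(zeroed))         # False: the callback is skipped
```

In C the usual remedy for this pattern is to zero the structure (for example with `memset` or a zeroing allocator) before handing it to the setup routine. Whether that is the exact change made in the vmhgfs patch is not stated in this thread.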

rasa commented 9 years ago

Is this what's fixed in #51?