DisplayLink / evdi

Extensible Virtual Display Interface
MIT License

Ubuntu 16.10: kernel BUG at mm/usercopy.c:75! #42

Closed nacc closed 8 years ago

nacc commented 8 years ago

Hello,

Thank you for providing this driver (generally). With this and the DL software, I was able to (generally) run two external monitors off Ubuntu 16.04. I have recently upgraded to 16.10 (about to hit beta) and had disabled evdi/DisplayLink as the older release did not yet support the newer 4.8 kernel in 16.10. I saw the release of 1.2, though, and updated. It leads to a kernel BUG and hard freeze of my system consistently, unfortunately, and I'm fairly sure it's in the evdi driver itself (or interaction with some other subsystem).

I am happy to debug, provide more logs, etc!

uname -a = Linux pitfall 4.8.0-14-generic #15-Ubuntu SMP Tue Sep 20 22:02:02 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Couple of independent questions from this bug:

1) Would you have any interest in perhaps getting this driver integrated into the Ubuntu kernel? It's not impossible to get out-of-tree, but open-source, drivers built with the Ubuntu kernel -- you'd get more test & integration exposure, etc.

2) If 1) were true, you might be able to get the other parts of DisplayLink packaged or maybe snapped (http://snapcraft.io/) up, so that it would be instantly accessible to end-users?

[caveat, I'm a Canonical employee and Ubuntu developer]

Sep 22 09:43:59 pitfall kernel: [  139.192812] usercopy: kernel memory exposure attempt detected from ffff906d3ec790c8 (kmalloc-96) (88 bytes)
Sep 22 09:43:59 pitfall kernel: [  139.192886] kernel BUG at /build/linux-S8e33G/linux-4.8.0/mm/usercopy.c:75!
Sep 22 09:43:59 pitfall kernel: [  139.192908] invalid opcode: 0000 [#1] SMP
Sep 22 09:43:59 pitfall kernel: [  139.192922] Modules linked in: evdi(OE) rfcomm xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack xt_tcpudp bridge stp llc iptable_filter ctr ccm bnep fuse binfmt_misc nls_utf8 nls_cp437 vfat fat arc4 hid_sensor_als hid_sensor_incl_3d hid_sensor_rotation hid_sensor_accel_3d hid_sensor_gyro_3d hid_sensor_magn_3d hid_sensor_trigger hid_sensor_iio_common intel_rapl industrialio_triggered_buffer kfifo_buf industrialio x86_pkg_temp_thermal intel_powerclamp coretemp iwlmvm mac80211 joydev hid_sensor_hub hid_multitouch hid_rmi usblp i2c_designware_platform i2c_designware_core snd_soc_skl snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core snd_soc_sst_match efi_pstore snd_soc_core kvm_intel kvm iwlwifi snd_compress
Sep 22 09:43:59 pitfall kernel: [  139.193206] evdi: [D] add_store:195 Increasing device count to 2
Sep 22 09:43:59 pitfall kernel: [  139.193226]  snd_hda_codec_hdmi irqbypass snd_pcm_dmaengine snd_hda_codec_realtek snd_hda_codec_generic intel_cstate intel_rapl_perf cfg80211 cdc_mbim snd_hda_intel snd_usb_audio snd_hda_codec cdc_wdm serio_raw snd_usbmidi_lib snd_seq_midi efivars snd_seq_midi_event snd_hda_core snd_hwdep snd_rawmidi snd_pcm snd_seq shpchp sg snd_seq_device snd_timer cdc_ncm uvcvideo usbnet videobuf2_vmalloc mii videobuf2_memops snd videobuf2_v4l2 videobuf2_core hci_uart soundcore videodev btusb idma64 virt_dma btrtl btbcm btqca btintel media bluetooth mei_me mei intel_lpss_pci ideapad_laptop processor_thermal_device intel_pch_thermal ucsi soc_button_array intel_lpss_acpi intel_vbtn intel_lpss battery sparse_keymap int340x_thermal_zone int3400_thermal rfkill mfd_core intel_soc_dts_iosf tpm_crb acpi_thermal_rel ac acpi_pad tpm_tis tpm_tis_core tpm evdev squashfs loop parport_pc ppdev lp parport efivarfs ip_tables x_tables autofs4 ext4 crc16 jbd2 fscrypto mbcache btrfs algif_skcipher af_alg uas usb_storage hid_generic usbhid dm_crypt dm_mod raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sd_mod crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel i915 aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd xhci_pci i2c_algo_bit xhci_hcd sdhci_pci ahci sdhci libahci mmc_core drm_kms_helper usbcore libata syscopyarea sysfillrect sysimgblt usb_common scsi_mod fb_sys_fops drm fan thermal wmi i2c_hid hid video fjes button
Sep 22 09:43:59 pitfall kernel: [  139.193821] CPU: 3 PID: 5439 Comm: ThreadedDrmDevi Tainted: G           OE   4.8.0-14-generic #15-Ubuntu
Sep 22 09:43:59 pitfall kernel: [  139.193850] Hardware name: LENOVO 80MK/VIUU4, BIOS C6CN34WW 10/29/2015
Sep 22 09:43:59 pitfall kernel: [  139.193871] task: ffff906cf0a64280 task.stack: ffff906d195cc000
Sep 22 09:43:59 pitfall kernel: [  139.193895] RIP: 0010:[<ffffffffbd80d861>]  [<ffffffffbd80d861>] __check_object_size+0x101/0x3b5
Sep 22 09:43:59 pitfall kernel: [  139.193926] RSP: 0018:ffff906d195cfe08  EFLAGS: 00010286
Sep 22 09:43:59 pitfall kernel: [  139.193943] RAX: 000000000000005f RBX: ffff906d3ec790c8 RCX: 0000000000000006
Sep 22 09:43:59 pitfall kernel: [  139.193965] RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffff906d734cdb60
Sep 22 09:43:59 pitfall kernel: [  139.193991] RBP: 0000000000000058 R08: 000000000002f12c R09: 0000000000000005
Sep 22 09:43:59 pitfall kernel: [  139.194013] R10: ffff906d0c86d938 R11: 0000000000000426 R12: 0000000000000001
Sep 22 09:43:59 pitfall kernel: [  139.194035] R13: ffff906d3ec79120 R14: ffff906d3ec79098 R15: ffff906d3ec79080
Sep 22 09:43:59 pitfall kernel: [  139.194057] FS:  00007f50f0ff9700(0000) GS:ffff906d734c0000(0000) knlGS:0000000000000000
Sep 22 09:43:59 pitfall kernel: [  139.194081] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 22 09:43:59 pitfall kernel: [  139.194102] CR2: 00007f5148747ae8 CR3: 00000003f09ad000 CR4: 00000000003406e0
Sep 22 09:43:59 pitfall kernel: [  139.194125] Stack:
Sep 22 09:43:59 pitfall kernel: [  139.194133]  0000000000000000 ffff906d5942ac00 0000000000000058 0000000000000058
Sep 22 09:43:59 pitfall kernel: [  139.194160]  ffffffffc01ab04d ffff906d04d379d8 00007f50f0ff86a0 ffff906d3ec790c8
Sep 22 09:43:59 pitfall kernel: [  139.194188]  ffff906d5942acf0 0000000000000400 00007f50f0ff86a0 ffff906d5942ad08
Sep 22 09:43:59 pitfall kernel: [  139.194228] Call Trace:
Sep 22 09:43:59 pitfall kernel: [  139.194231] evdi: [D] evdi_crtc_init:312 drm_crtc_init: 0
Sep 22 09:43:59 pitfall kernel: [  139.194265]  [<ffffffffc01ab04d>] ? drm_read+0x13d/0x300 [drm]
Sep 22 09:43:59 pitfall kernel: [  139.194292]  [<ffffffffbd811c11>] ? vfs_read+0x91/0x130
Sep 22 09:43:59 pitfall kernel: [  139.194310]  [<ffffffffbd813072>] ? SyS_read+0x52/0xc0
Sep 22 09:43:59 pitfall kernel: [  139.194328]  [<ffffffffbdc39276>] ? entry_SYSCALL_64_fastpath+0x1e/0xa8
Sep 22 09:43:59 pitfall kernel: [  139.194352] Code: 54 02 00 00 49 c7 c0 90 82 01 be 48 c7 c2 af a7 ff bd 48 c7 c6 c4 e4 00 be 49 89 e9 48 89 d9 48 c7 c7 40 52 01 be e8 74 a0 f7 ff <0f> 0b 48 89 ee 48 89 df e8 02 52 fe ff 48 85 c0 49 89 c0 0f 85 
Sep 22 09:43:59 pitfall kernel: [  139.194488] RIP  [<ffffffffbd80d861>] __check_object_size+0x101/0x3b5
Sep 22 09:43:59 pitfall kernel: [  139.194517]  RSP <ffff906d195cfe08>
Sep 22 09:43:59 pitfall kernel: [  139.194661] evdi: [W] evdi_painter_crtc_state_notify:377 Painter does not exist!
Sep 22 09:43:59 pitfall kernel: [  139.194702] evdi: [D] evdi_detect:72 Painter is disconnected
Sep 22 09:43:59 pitfall kernel: [  139.194760] evdi evdi.1: No connectors reported connected with modes
Sep 22 09:43:59 pitfall kernel: [  139.194797] [drm] Cannot find any crtc or sizes - going 1024x768
Sep 22 09:43:59 pitfall kernel: [  139.196259] evdi evdi.1: fb2: evdidrmfb frame buffer device
Sep 22 09:43:59 pitfall kernel: [  139.196281] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
Sep 22 09:43:59 pitfall kernel: [  139.196302] [drm] No driver support for vblank timestamp query.
Sep 22 09:43:59 pitfall kernel: [  139.196321] [drm] evdi: evdi_stats_init
Sep 22 09:43:59 pitfall kernel: [  139.196338] [drm] Initialized evdi 1.2.55 20160912 on minor 2
Sep 22 09:43:59 pitfall kernel: [  139.214897] evdi: [D] evdi_detect:69 (dev=1) Painter is connected
Sep 22 09:43:59 pitfall kernel: [  139.214923] evdi: [D] evdi_painter_get_edid_copy:192 (dev=1) 00 ff ff
Sep 22 09:43:59 pitfall kernel: [  139.215144] ---[ end trace 65abd9366c92588f ]---

nacc commented 8 years ago

I should clarify, it does seem like "something" is happening with the latest kernel, if I give it long enough. I get put at a screen on my integrated display that looks sort of like the login background, but nothing further gets drawn on it. My mouse does seem to work, but I'm unable to interact with the screen in any meaningful way. I can drop to a terminal shell and look at logs, but nothing beyond the above BUG, repeated a few times, stuck out to me.

displaylink-mlukaszek commented 8 years ago

Hi @nacc: Do you mean you think there is a problem here that's not necessarily caused by evdi? I find it a little hard to picture the behaviour - could you record it and upload the video somewhere? :)

To your questions from the first post. 1) Sure, having the driver available out of the box in Ubuntu has obvious advantages. It would greatly simplify installation of DL software, as it would replace the tedious recompilation process for our Ubuntu users, resolve issues with attempting to load unsigned DKMS-built kernel modules on systems with Secure Boot enabled, speed up upgrades, etc.

The devil is in the detail. We would need to rethink how we can still maintain enough control over the version that's available in Ubuntu kernels, as there could be fixes in the module pushed here on GitHub which our latest usermode app would need for a complete release - how would we make that happen? Also, you can't easily unload evdi at the moment if X is using it, so you would still need to reboot after the kernel is upgraded (which is not a huge problem, as you reboot anyway to switch kernels).

2) It's already possible for the community to do - our license allows repackaging and redistribution (broadly speaking; details are in the LICENSE file of the .run package). I've seen some distros picking it up already (there's an RPM available that people use for Fedora, and I've also seen Arch having a package). TBH, I'd be more comfortable leaving the packaging to people who know the specifics of their favourite distro. That said, if you can help with proper packaging for Ubuntu, that'd be brilliant.

nacc commented 8 years ago

@displaylink-mlukaszek thanks for the quick reply!

I think the trace indicates there is a bug in evdi with 4.8. 4.8 tightened the usercopy hardening that limits exposure of kernel memory to userspace. Does dlm or evdi rely on exposing kernel memory to userspace directly (perhaps via /dev/mem or /dev/kmem)?
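For anyone following along, the rule behind this kind of report can be modelled very roughly: a copy to userspace is rejected when the requested range is not wholly contained in a single allocator object. The snippet below is only a toy model in Python, not kernel code. The function name and the offsets are hypothetical; only the 96-byte object size (kmalloc-96) and the 88-byte length come from the report line in the log.

```python
def copy_stays_inside_object(obj_size: int, offset: int, length: int) -> bool:
    """Toy model of the bounds rule for a copy out of one slab object.

    obj_size: usable size of the allocator object (96 for kmalloc-96)
    offset:   where the copy starts, relative to the start of the object
    length:   number of bytes requested
    """
    if offset < 0 or length < 0:
        return False
    # The copy is acceptable only if it ends at or before the object's end.
    return offset + length <= obj_size


# 96 and 88 are taken from the report line; both offsets are hypothetical.
print(copy_stays_inside_object(96, 0, 88))   # True: 88 bytes fit from the start
print(copy_stays_inside_object(96, 16, 88))  # False: would run 8 bytes past the end
```

This says nothing about which offset evdi actually used here; it only illustrates why an 88-byte read from a 96-byte object can still trip the check.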

There's a newer kernel in 16.10 -proposed again, so I'll re-test with that version and try to capture the full dmesg. I'm not sure I'll be able to get a video, but I will look into my options for that next.

1) Yep, agreed there will be some issues. But I think it's worth considering, at least. Let me get in touch with our kernel folks and see what might be possible. The benefits you listed are my primary concern :)

2) Yep, it's worth doing it well, I think -- I'll put it on my list to investigate getting a .deb out there (even if only in my own PPA) or snap for the userspace side of things.

nacc commented 8 years ago

Ok, updated to 4.8.0.16.26 from 16.10-proposed, still no better. Well, let me clarify:

I ran the v1.2 installer script and then hooked up my external monitor to my Dell D3100 dock. It hung for a while with no apparent response, but after dropping to the shell and back a few times and waiting a while, I was able to get back to a login screen. I have no idea (yet) what was going on during that time. I'm attaching my dmesg and Xorg.0.log in case something obvious shows up to you. I do note that Ubuntu's Displays settings widget detects the external monitor now. However, it seems to think it's displaying something there when it's not...

Attachments: dmesg.txt, Xorg.0.txt
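A quick way to check a saved dmesg for the usercopy report is a small script along these lines. It is a minimal sketch: the regular expression is modelled on the single report line in the log earlier in this thread, so other kernels may word the message differently, and the function name is made up for illustration.

```python
import re

# Pattern modelled on the one report line seen in this thread; it is an
# assumption that other reports on this kernel follow the same wording.
USERCOPY_RE = re.compile(
    r"usercopy: kernel memory exposure attempt detected from "
    r"(?P<addr>[0-9a-f]+) \((?P<cache>[^)]+)\) \((?P<size>\d+) bytes\)"
)


def find_usercopy_reports(dmesg_text):
    """Return (address, cache name, byte count) for each matching line."""
    reports = []
    for line in dmesg_text.splitlines():
        match = USERCOPY_RE.search(line)
        if match:
            reports.append(
                (match.group("addr"), match.group("cache"), int(match.group("size")))
            )
    return reports


# Two lines copied from the log in the first post.
sample = (
    "Sep 22 09:43:59 pitfall kernel: [  139.192812] usercopy: kernel memory "
    "exposure attempt detected from ffff906d3ec790c8 (kmalloc-96) (88 bytes)\n"
    "Sep 22 09:43:59 pitfall kernel: [  139.192908] invalid opcode: 0000 [#1] SMP\n"
)
print(find_usercopy_reports(sample))
```

To run it against a file instead, read the file's contents (for example a saved copy of dmesg.txt) and pass the text to `find_usercopy_reports`.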

displaylink-mlukaszek commented 8 years ago

OK, we can reproduce it as well with 4.8. There will be changes in evdi necessary to adjust to the recent changes.

nacc commented 8 years ago

@displaylink-mlukaszek Great, thanks for following up on this! I'm happy to test whenever you have an update.

displaylink-mlukaszek commented 8 years ago

This has now been fixed in 1ec7873, and we correctly light up screens again. 👍

There is a fundamental problem with multi-screen use case in Ubuntu 16.10 beta though - we see windows leaving traces when moved on a second screen (and it does NOT matter how it is connected - could be directly to built-in GPU, the effect is identical). Perhaps it's just with the machine we tried, but looks really bad. Could you check if you see this as well, and if yes, try to escalate with your colleagues? Thanks!

daugustin commented 8 years ago

Is this commit safe to backport to the existing 1.2.55 and kernels <4.8? If so, I would include it in my Gentoo package.

displaylink-mlukaszek commented 8 years ago

@daugustin still pending wider testing, but I'd say it's safe; I expect we'll soon be releasing 1.2.1 with this fix in.

nacc commented 8 years ago

@displaylink-mlukaszek Excellent! I am not seeing the trace issues, and I'm now connected to a Dell P2415Q over a MOKiN USB-C adapter on my Lenovo Yoga 900. Were you testing with beta1 or beta2 (just released last night)? I'm on the latest release fully, so 4.8.0-17-generic now.

I will try to test with my Dell D3100 soon, but now that I have a working display that doesn't require the evdi module, my motivation is a bit lower. (I promise to still follow up on it, though, and I have mentioned this driver to the Ubuntu kernel team, so hopefully it's now on their list :)

RussianNeuroMancer commented 8 years ago

@displaylink-mlukaszek

> There is a fundamental problem with multi-screen use case in Ubuntu 16.10 beta though - we see windows leaving traces when moved on a second screen (and it does NOT matter how it is connected - could be directly to built-in GPU, the effect is identical). Perhaps it's just with the machine we tried, but looks really bad. Could you check if you see this as well, and if yes, try to escalate with your colleagues? Thanks!

Seeing exactly the same issue on Ubuntu GNOME 16.04.1 on a Helix 1gen with a Think USB 3.0 Dock. It doesn't happen with 1.1.62, but it does with 1.2.58.

kurazsi commented 8 years ago

Hello, I also have the same issue using Xubuntu 16.10 with the i3 wm installed. I updated the DisplayLink driver to the latest version and performance greatly increased, but as soon as I unplug the monitor the X server crashes. I am still able to log in via the console.

Maybe I should have used Ubuntu 16.10 installed from server... :) (and then added Xfce and i3)

displaylink-mlukaszek commented 8 years ago

The original problem with usercopy is fixed. Please raise other issues separately. Thanks.