paulo-raca closed this issue 7 years ago
I get a very similar error when terminating gbridge (which uses gb-netlink), except this one doesn't crash the whole system:
[ 258.642158] greybus: loading out-of-tree module taints kernel.
[ 258.642252] greybus: module verification failed: signature and/or required key missing - tainting kernel
[ 258.643688] Unable to find a compatible ARMv7 timer
[ 258.643690] Time-Sync @ 517291 Hz max ktime conversion +/- 15 seconds
[ 273.456141] greybus 1-svc: set power mode = 0
[ 273.456146] greybus 1-svc: power mode change failed on AP to switch link: -5
[ 278.131409] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 278.131558] IP: [< (null)>] (null)
[ 278.131641] PGD 0
[ 278.131684] Oops: 0010 [#1] SMP
[ 278.131738] Modules linked in: gb_netlink(OE) greybus(OE) rfcomm ccm arc4 hid_logitech_hidpp hid_logitech_dj hid_generic usbhid snd_hda_codec_hdmi hid_multitouch cmac bnep nls_iso8859_1 i2c_designware_platform i2c_designware_core dell_wmi dell_laptop dell_led dell_smbios dcdbas snd_soc_skl snd_hda_codec_realtek snd_hda_codec_generic snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core snd_soc_sst_match snd_soc_core snd_compress ac97_bus snd_pcm_dmaengine snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq intel_rapl x86_pkg_temp_thermal coretemp kvm_intel snd_seq_device snd_timer kvm ath10k_pci irqbypass crct10dif_pclmul ath10k_core snd crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw glue_helper ablk_helper ath cryptd
[ 278.133107] joydev input_leds serio_raw uvcvideo mac80211 soundcore videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core videodev media rtsx_pci_ms cfg80211 memstick btusb btrtl hci_uart btbcm intel_vbtn btqca soc_button_array btintel bluetooth intel_lpss_acpi int3403_thermal int3400_thermal acpi_thermal_rel intel_hid mei_me acpi_pad idma64 sparse_keymap virt_dma processor_thermal_device mei shpchp int340x_thermal_zone intel_pch_thermal intel_soc_dts_iosf mac_hid intel_lpss_pci intel_lpss acpi_als kfifo_buf industrialio parport_pc ppdev lp parport ip_tables x_tables autofs4 rtsx_pci_sdmmc i915 i2c_algo_bit psmouse drm_kms_helper syscopyarea sysfillrect sysimgblt nvme fb_sys_fops nvme_core drm rtsx_pci wmi i2c_hid hid pinctrl_sunrisepoint video pinctrl_intel fjes
[ 278.134433] CPU: 1 PID: 2186 Comm: gbridge Tainted: G OE 4.8.0-34-generic #36-Ubuntu
[ 278.134461] Hardware name: Dell Inc. XPS 13 9360/0JGD96, BIOS 1.0.7 09/13/2016
[ 278.134483] task: ffff9eb971772d00 task.stack: ffff9eb926124000
[ 278.134503] RIP: 0010:[<0000000000000000>] [< (null)>] (null)
[ 278.134527] RSP: 0018:ffff9eb926127a08 EFLAGS: 00010246
[ 278.134544] RAX: ffffffffc0c07000 RBX: ffff9eb926186800 RCX: 0000000000000000
[ 278.134565] RDX: ffff9eb97e590080 RSI: 0000000000000246 RDI: ffff9eb926186800
[ 278.134587] RBP: ffff9eb926127a40 R08: ffff9eb97e590080 R09: ffff9eb970bbce20
[ 278.134609] R10: 0000000000000014 R11: ffff9eb96d229a00 R12: ffff9eb96d349410
[ 278.134631] R13: ffff9eb96d349400 R14: ffffffffc0c9b050 R15: ffff9eb96d349410
[ 278.134653] FS: 00007f7f7aba22c0(0000) GS:ffff9eb97e480000(0000) knlGS:0000000000000000
[ 278.134678] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 278.134696] CR2: 0000000000000000 CR3: 000000026931c000 CR4: 00000000003406e0
[ 278.134718] Stack:
[ 278.134726] ffffffffc0c961ac ffff9eb926185000 ffff9eb96d349400 ffff9eb926185000
[ 278.134755] ffff9eb96d229a14 ffff9eb96e7f3a00 ffff9eb926185000 ffff9eb926127a80
[ 278.134783] ffffffffc0c9649b ffff9eb96d349438 ffff9eb926186800 ffff9eb926186800
[ 278.134810] Call Trace:
[ 278.134828] [<ffffffffc0c961ac>] ? gb_timesync_teardown+0x7c/0xd0 [greybus]
[ 278.134855] [<ffffffffc0c9649b>] gb_timesync_svc_remove+0xbb/0x1a0 [greybus]
[ 278.134881] [<ffffffffc0c93b04>] gb_svc_del+0x34/0x130 [greybus]
[ 278.134903] [<ffffffffc0c8be61>] gb_hd_del+0x21/0x70 [greybus]
[ 278.134923] [<ffffffffc0c0502e>] _gb_netlink_exit+0x1e/0x40 [gb_netlink]
[ 278.134945] [<ffffffffc0c05219>] gb_netlink_hd_reset+0x19/0x30 [gb_netlink]
[ 278.134972] [<ffffffff9d5b87bb>] genl_family_rcv_msg+0x1db/0x3c0
[ 278.134992] [<ffffffff9ceb8209>] ? update_cfs_shares+0xb9/0xf0
[ 278.135012] [<ffffffff9d5b89a0>] ? genl_family_rcv_msg+0x3c0/0x3c0
[ 278.135033] [<ffffffff9d5b8a27>] genl_rcv_msg+0x87/0xc0
[ 278.135051] [<ffffffff9d5b7f24>] netlink_rcv_skb+0xa4/0xc0
[ 278.135072] [<ffffffff9d5b85c8>] genl_rcv+0x28/0x40
[ 278.135091] [<ffffffff9d5b790c>] netlink_unicast+0x18c/0x220
[ 278.135110] [<ffffffff9d5b7c97>] netlink_sendmsg+0x2f7/0x3b0
[ 278.135129] [<ffffffff9d1cdc11>] ? aa_sock_msg_perm+0x61/0x150
[ 278.135150] [<ffffffff9d561a88>] sock_sendmsg+0x38/0x50
[ 278.135168] [<ffffffff9d562592>] ___sys_sendmsg+0x2c2/0x2d0
[ 278.135191] [<ffffffff9d0255ee>] ? mem_cgroup_commit_charge+0x7e/0x4f0
[ 278.135217] [<ffffffff9cfb43a6>] ? lru_cache_add_active_or_unevictable+0x36/0xb0
[ 278.135241] [<ffffffff9cfdc563>] ? handle_mm_fault+0xf73/0x13c0
[ 278.135260] [<ffffffff9d562ee4>] __sys_sendmsg+0x54/0x90
[ 278.135278] [<ffffffff9d562f32>] SyS_sendmsg+0x12/0x20
[ 278.135296] [<ffffffff9d69c2b6>] entry_SYSCALL_64_fastpath+0x1e/0xa8
[ 278.135316] Code: Bad RIP value.
[ 278.135332] RIP [< (null)>] (null)
[ 278.135349] RSP <ffff9eb926127a08>
[ 278.135362] CR2: 0000000000000000
[ 278.142491] ---[ end trace d74c80eb512d77d2 ]---
[ 301.141708] mce: [Hardware Error]: Machine check events logged
Hi Paulo, how are you?
First, I think your questions have a better chance of being noticed if you send them to the greybus-dev mailing list.
Second, please note that the main greybus code is now in the upstream kernel under drivers/staging/greybus. I think the code here is already out of sync with it, especially regarding the timesync code, which has already been removed upstream. I do not know whether @gregkh will continue to sync this repository with upstream.
Having said that, if you want to unblock yourself, you can work around the issue with something like this in the timesync_platform.c file:
    void gb_timesync_platform_unlock_bus(void)
    {
            if (arche_platform_change_state_cb)
                    arche_platform_change_state_cb(ARCHE_PLATFORM_STATE_ACTIVE, NULL);
    }
Note the if statement: that is the part you need to add to avoid the crash you are seeing.
This should solve both situations.
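For illustration only, here is a minimal user-space sketch of the same guard. It is not the greybus code: `platform_state`, `state_change_cb`, `register_cb`, `sample_handler`, and `unlock_bus` are hypothetical stand-ins invented for this example. The sketch assumes the crash comes from calling through a callback pointer that was never registered, which would be consistent with the oops above reporting a NULL instruction pointer.

```c
#include <stddef.h>

/* Hypothetical stand-in for the platform state enum (not the real driver type). */
enum platform_state { PLATFORM_STATE_OFF, PLATFORM_STATE_ACTIVE };

/* Callback type: invoked when the platform should change state. */
typedef int (*state_change_cb)(enum platform_state state, void *ctx);

/* Stays NULL unless some platform code registers a handler. */
state_change_cb change_state_cb = NULL;

/* Counts how many times the sample handler has run. */
int cb_calls = 0;

/* Example handler: succeeds only for the ACTIVE state. */
int sample_handler(enum platform_state state, void *ctx)
{
    (void)ctx;
    cb_calls++;
    return state == PLATFORM_STATE_ACTIVE ? 0 : -1;
}

/* Registers (or clears, when cb is NULL) the callback. */
void register_cb(state_change_cb cb)
{
    change_state_cb = cb;
}

/*
 * Guarded call: returns 1 if the callback was invoked, 0 if it was
 * skipped because nothing is registered. Without the NULL check,
 * calling through an unset pointer would jump to address 0.
 */
int unlock_bus(void)
{
    if (!change_state_cb)
        return 0;
    change_state_cb(PLATFORM_STATE_ACTIVE, NULL);
    return 1;
}
```

Calling `unlock_bus()` before any `register_cb()` call is then a harmless no-op, while the same call after `register_cb(sample_handler)` runs the handler once.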
Hope this helps.
Cheers
Thanks, @rmsilva.
I was under the impression that this repository was up to date. I'll try to upgrade my kernel instead. Thank you!
I'm now using the version from Linux 4.9.4, and the system also freezes when closing gbsim.
Except this time no error is logged :sob:
Yeah, the timesync removal is not in 4.9.4, and will not be in any 4.9.x release, but it is in the staging (-next) tree from @gregkh.
Please see this commit, it is the one that you need to have to make sure timesync was removed.
Cheers, Rui
I'll try that, thanks!
Hm, yeah, I'm not going to keep the kernel project up to date here, that's not going to be possible, or really even wanted, over time.
But, gbsim crashing isn't good, I'll take patches to fix that anytime :)
So this should be a gbsim bug, not a greybus bug, if you wish to file it under that repo.
If the operating system crashes, it's a kernel bug ;)
Anyway, I'll try to build from the staging tree and report back. Thanks!
I launched gbsim with the wrong hotplug base directory, and my system crashed :disappointed: This happens consistently (it took me a few reboots to realize what was wrong).
On the other hand, if I launch gbsim specifying the correct base directory, the system crashes when I use Ctrl-C to close it.
I'm on Ubuntu 16.10 with kernel 4.8.0-34.
The error happens in gb_timesync_platform_unlock_bus.
This is a horrible way of pasting the error, but since the system freezes, it's all I can do... http://i.imgur.com/ex0Rv2V.jpg (dmesg -w on the left, commands to reproduce the problem on the right)
Any help fixing this would be greatly appreciated.