Aquantia / AQtion

Aquantia AQC multigigabit NIC linux driver (atlantic) - development preview
https://www.aquantia.com

Version 2.2.7 crashes on link up #9

Closed: hogliux closed this issue 4 years ago

hogliux commented 4 years ago

On Ubuntu 18.04, after loading the atlantic kernel module, it's impossible to bring the NIC into a link up state. Issuing the following command:

sudo ip link set up dev enp60s0

segfaults the ip command, and the following kernel BUG trace is printed in dmesg:

[ 2959.858742] kernel BUG at ./include/linux/netdevice.h:502!
[ 2959.858752] invalid opcode: 0000 [#1] SMP NOPTI
[ 2959.858753] Modules linked in: atlantic(OE) ptp pps_core crc_itu_t hid_generic hidp thunderbolt rfcomm vmnet(OE) vmw_vsock_vmci_transport vsock vmw_vmci vmmon(OE) cmac bnep arc4 binfmt_misc nls_iso8859_1 snd_hda_codec_hdmi snd_soc_skl snd_soc_skl_ipc snd_hda_codec_realtek snd_hda_ext_core snd_hda_codec_generic snd_soc_sst_dsp snd_soc_sst_ipc snd_soc_acpi snd_soc_core joydev snd_compress ac97_bus snd_pcm_dmaengine snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm intel_rapl hid_multitouch x86_pkg_temp_thermal intel_powerclamp coretemp snd_seq_midi kvm_intel snd_seq_midi_event wmi_bmof dell_wmi dell_laptop kvm intel_wmi_thunderbolt snd_rawmidi irqbypass dell_smbios crct10dif_pclmul crc32_pclmul ath10k_pci dell_wmi_descriptor ath10k_core ghash_clmulni_intel dcdbas pcbc snd_seq ath mac80211
[ 2959.858797]  snd_seq_device snd_timer aesni_intel aes_x86_64 snd crypto_simd glue_helper i915 cfg80211 cryptd rtsx_pci_ms memstick input_leds intel_cstate intel_rapl_perf uvcvideo serio_raw soundcore drm_kms_helper drm btusb videobuf2_vmalloc videobuf2_memops btrtl btbcm videobuf2_v4l2 i2c_algo_bit btintel videobuf2_core fb_sys_fops bluetooth shpchp syscopyarea videodev mei_me idma64 media virt_dma ecdh_generic sysfillrect mei intel_lpss_pci sysimgblt intel_lpss processor_thermal_device intel_soc_dts_iosf intel_pch_thermal wmi video int3400_thermal int3403_thermal mac_hid acpi_thermal_rel intel_hid int340x_thermal_zone acpi_pad sparse_keymap sch_fq_codel parport_pc ppdev lp parport sunrpc ip_tables x_tables autofs4 rtsx_pci_sdmmc nvme psmouse nvme_core rtsx_pci i2c_hid hid pinctrl_sunrisepoint [last unloaded: atlantic]
[ 2959.858844] CPU: 4 PID: 9490 Comm: ip Tainted: G           OE    4.15.0-66-generic #75-Ubuntu
[ 2959.858846] Hardware name: Dell Inc. XPS 13 9370/0F6P3V, BIOS 1.11.1 07/11/2019
[ 2959.858856] RIP: 0010:aq_vec_start+0xa4/0xb0 [atlantic]
[ 2959.858858] RSP: 0018:ffffb4d7c3f77620 EFLAGS: 00010246
[ 2959.858860] RAX: 0000000000000000 RBX: ffff915a4674c000 RCX: 0000000000000005
[ 2959.858861] RDX: 0000000000000000 RSI: ffffb4d7c4565b08 RDI: ffff915a4674f000
[ 2959.858863] RBP: ffffb4d7c3f77638 R08: 00000000a0000800 R09: 0000000000000000
[ 2959.858864] R10: 0000000000000000 R11: 00000000000000e9 R12: ffff915a4674c628
[ 2959.858866] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000001
[ 2959.858868] FS:  00007f0bf38c80c0(0000) GS:ffff915aee500000(0000) knlGS:0000000000000000
[ 2959.858869] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2959.858871] CR2: 00007f0bf2dc8240 CR3: 00000003ce052005 CR4: 00000000003606e0
[ 2959.858872] Call Trace:
[ 2959.858880]  aq_nic_start+0x93/0x310 [atlantic]
[ 2959.858885]  aq_ndev_open+0x43/0x50 [atlantic]
[ 2959.858890]  __dev_open+0xd3/0x160
[ 2959.858893]  __dev_change_flags+0x17e/0x1c0
[ 2959.858896]  dev_change_flags+0x29/0x60
[ 2959.858899]  do_setlink+0x337/0xed0
[ 2959.858904]  ? nla_parse+0x35/0x110
[ 2959.858907]  rtnl_newlink+0x5f3/0x930
[ 2959.858913]  ? security_capset+0x10/0x90
[ 2959.858916]  ? ns_capable_common+0x68/0x80
[ 2959.858918]  ? ns_capable+0x13/0x20
[ 2959.858921]  rtnetlink_rcv_msg+0x221/0x2b0
[ 2959.858924]  ? _cond_resched+0x19/0x40
[ 2959.858927]  ? rtnl_calcit.isra.25+0x110/0x110
[ 2959.858929]  netlink_rcv_skb+0x54/0x130
[ 2959.858932]  rtnetlink_rcv+0x15/0x20
[ 2959.858935]  netlink_unicast+0x19e/0x240
[ 2959.858938]  netlink_sendmsg+0x2d1/0x3d0
[ 2959.858942]  sock_sendmsg+0x3e/0x50
[ 2959.858944]  ___sys_sendmsg+0x2a0/0x2f0
[ 2959.858947]  ? sock_destroy_inode+0x2f/0x40
[ 2959.858950]  ? destroy_inode+0x3e/0x60
[ 2959.858953]  ? evict+0x139/0x1a0
[ 2959.858956]  ? iput+0x19c/0x230
[ 2959.858959]  ? dentry_free+0x4d/0x90
[ 2959.858961]  ? __dentry_kill+0x129/0x170
[ 2959.858963]  ? dput.part.26+0x1bd/0x200
[ 2959.858966]  __sys_sendmsg+0x54/0x90
[ 2959.858968]  ? __sys_sendmsg+0x54/0x90
[ 2959.858971]  SyS_sendmsg+0x12/0x20
[ 2959.858975]  do_syscall_64+0x73/0x130
[ 2959.858978]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 2959.858980] RIP: 0033:0x7f0bf2ddad04
[ 2959.858981] RSP: 002b:00007ffe7ced1848 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 2959.858983] RAX: ffffffffffffffda RBX: 000000005db8680c RCX: 00007f0bf2ddad04
[ 2959.858985] RDX: 0000000000000000 RSI: 00007ffe7ced18a0 RDI: 0000000000000003
[ 2959.858986] RBP: 0000000000000000 R08: 0000000000000010 R09: 00007ffe7ced1940
[ 2959.858987] R10: 0000000000000015 R11: 0000000000000246 R12: 0000000000000000
[ 2959.858989] R13: 000055f7d081c020 R14: 00007ffe7ced2078 R15: 00007ffe7ced1928
[ 2959.858991] Code: 41 5c 41 5d 5d c3 31 c0 48 8b 93 38 04 00 00 83 e2 01 74 17 f0 80 a3 38 04 00 00 fe f0 80 a3 38 04 00 00 f7 5b 41 5c 41 5d 5d c3 <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 
[ 2959.859028] RIP: aq_vec_start+0xa4/0xb0 [atlantic] RSP: ffffb4d7c3f77620

hogliux commented 4 years ago

Version 2.2.6 works fine, by the way, so this is a regression in 2.2.7.
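
For anyone triaging a similar report, the two lines that identify the failure site in the trace above are the `kernel BUG at` line and the first `RIP:` line. Below is a minimal sketch that pulls both out of pasted dmesg text. `summarize_oops` and its regular expressions are hypothetical helpers written for this illustration; they are not part of the atlantic driver or of any kernel tooling, and they only cover the line shapes seen in this report.

```python
import re

# "kernel BUG at <file>:<line>!" as printed by BUG()/BUG_ON().
BUG_RE = re.compile(r"kernel BUG at (?P<file>\S+):(?P<line>\d+)!")

# "RIP: 0010:symbol+0xOFF/0xSIZE [module]"; the segment prefix and the
# module suffix are both optional.
RIP_RE = re.compile(
    r"RIP: (?:\w+:)?(?P<symbol>[\w.]+)\+(?P<offset>0x[0-9a-f]+)"
    r"/(?P<size>0x[0-9a-f]+)(?: \[(?P<module>\w+)\])?"
)


def summarize_oops(log_text):
    """Return the BUG location and first symbolic RIP found in log_text."""
    summary = {}
    bug = BUG_RE.search(log_text)
    if bug:
        summary["file"] = bug.group("file")
        summary["line"] = int(bug.group("line"))
    rip = RIP_RE.search(log_text)
    if rip:
        summary["symbol"] = rip.group("symbol")
        summary["offset"] = int(rip.group("offset"), 16)
        summary["module"] = rip.group("module")  # None for core kernel code
    return summary
```

Feeding it the first and seventh lines of the log above yields the `netdevice.h` location together with `aq_vec_start` in the `atlantic` module, which is usually enough to search the driver history for a matching fix.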

cail commented 4 years ago

Thanks, that seems to be fixed in later versions and in upstream, but we have not pushed that to GitHub yet. Please try out version 2.3.1.

hogliux commented 4 years ago

Thank you. 2.3.1 works fine.

hogliux commented 4 years ago

Slightly off-topic: I'm trying to get the AQC107 to work with gptp. I saw in one of your kernel patches that you tested the atlantic kernel module with gptp, but for me the AQC107 never exposes a /dev/ptp hardware clock, only hardware timestamping, i.e. ethtool -T enp60s0 prints this:

Capabilities:
    hardware-transmit     (SOF_TIMESTAMPING_TX_HARDWARE)
    software-transmit     (SOF_TIMESTAMPING_TX_SOFTWARE)
    hardware-receive      (SOF_TIMESTAMPING_RX_HARDWARE)
    software-receive      (SOF_TIMESTAMPING_RX_SOFTWARE)
    software-system-clock (SOF_TIMESTAMPING_SOFTWARE)
    hardware-raw-clock    (SOF_TIMESTAMPING_RAW_HARDWARE)
PTP Hardware Clock: none
Hardware Transmit Timestamp Modes:
    off                   (HWTSTAMP_TX_OFF)
    on                    (HWTSTAMP_TX_ON)
Hardware Receive Filter Modes:
    none                  (HWTSTAMP_FILTER_NONE)

gptp fails with:

ERROR    : GPTP [12:13:24:739] Group ptp not found, will try root (0) instead
ERROR    : GPTP [12:13:24:740] Failed to configure timestamping: Operation not supported
ERROR    : GPTP [12:13:24:740] post_init failed

hogliux commented 4 years ago

Ignore me, just needed a newer firmware version
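
To check after a firmware update whether the NIC now exposes a PTP hardware clock, the `ethtool -T` output quoted above can be parsed mechanically. Below is a minimal sketch; `parse_ethtool_timestamping` and `timestamping_gaps` are hypothetical helpers for illustration. The "gaps" heuristic (no PHC, or no receive filter besides `HWTSTAMP_FILTER_NONE`) is an assumption based only on the output in this thread, not gptp's actual capability check.

```python
# Section headers in `ethtool -T` output, mapped to result keys.
SECTIONS = {
    "Capabilities:": "capabilities",
    "Hardware Transmit Timestamp Modes:": "tx_modes",
    "Hardware Receive Filter Modes:": "rx_filters",
}


def parse_ethtool_timestamping(output):
    """Parse `ethtool -T` text into a PHC index and lists of symbolic names."""
    info = {"phc_index": None, "capabilities": [], "tx_modes": [], "rx_filters": []}
    section = None
    for raw in output.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("PTP Hardware Clock:"):
            value = line.split(":", 1)[1].strip()
            # "none" means no /dev/ptpN device is associated with the NIC.
            info["phc_index"] = int(value) if value.isdigit() else None
            section = None
            continue
        header = next((k for p, k in SECTIONS.items() if line.startswith(p)), None)
        if header:
            section = header
        elif section and "(" in line and line.endswith(")"):
            # Keep the symbolic name inside the parentheses.
            info[section].append(line[line.rindex("(") + 1:-1])
    return info


def timestamping_gaps(info):
    """Heuristic (assumption): list what looks missing for hardware PTP."""
    gaps = []
    if info["phc_index"] is None:
        gaps.append("no PTP hardware clock")
    if not [f for f in info["rx_filters"] if f != "HWTSTAMP_FILTER_NONE"]:
        gaps.append("no hardware receive filter")
    return gaps
```

The text to parse could come from `subprocess.run(["ethtool", "-T", "enp60s0"], capture_output=True, text=True).stdout`. For the output posted above, both gaps are reported, which matches the "Operation not supported" failure from gptp.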