Warnings on dirty socket destruction during regression.test_stress_pipeline.Pipeline test

vankoven commented 6 years ago

Tempesta is at 7ee5ded650e97d986f39b278294b824932a9b88e linux kernel is at 4.9.35-tfw6 tag (latest release). Simply run regression.test_stress_pipeline.Pipeline functional tests and multiple oopses will happen on wrk shutdown (client disconnects). At least 100 concurrent connections was required to reproduce the issue.

Test log:

% ./run_tests.py -vvv regression.test_stress_pipeline.Pipeline
Not resuming: File tests_resume.json not found

----------------------------------------------------------------------
Running functional tests...
----------------------------------------------------------------------

test_pipelined_requests (regression.test_stress_pipeline.Pipeline) ... 
    Init test case...
    Starting Nginx on 192.168.122.12:8000   Starting Nginx on 192.168.122.12:8002

    Starting Nginx on 192.168.122.12:8004   Starting Nginx on 192.168.122.12:8006

    Starting Nginx on 192.168.122.12:8003
    Starting Nginx on 192.168.122.12:8001
    Starting Nginx on 192.168.122.12:8007
    Starting Nginx on 192.168.122.12:8005
    Starting Nginx on 192.168.122.12:8008
    Starting Nginx on 192.168.122.12:8010
    Starting Nginx on 192.168.122.12:8012
    Starting Nginx on 192.168.122.12:8014
    Starting Nginx on 192.168.122.12:8009
    Starting Nginx on 192.168.122.12:8011
    Starting Nginx on 192.168.122.12:8013   Starting Nginx on 192.168.122.12:8015

    Starting Nginx on 192.168.122.12:8016
    Starting Nginx on 192.168.122.12:8018
    Starting Nginx on 192.168.122.12:8017
    Starting Nginx on 192.168.122.12:8020
    Starting Nginx on 192.168.122.12:8022
    Starting Nginx on 192.168.122.12:8019
    Starting Nginx on 192.168.122.12:8024
    Starting Nginx on 192.168.122.12:8023
    Starting Nginx on 192.168.122.12:8021
    Starting Nginx on 192.168.122.12:8025
    Starting Nginx on 192.168.122.12:8026
    Starting Nginx on 192.168.122.12:8028
    Starting Nginx on 192.168.122.12:8030
    Starting Nginx on 192.168.122.12:8027
    Starting Nginx on 192.168.122.12:8029
    Starting Nginx on 192.168.122.12:8031
    Starting TempestaFW on 192.168.122.12
    Running 1 HTTP clients on localhost
Running 10s test @ http://192.168.122.12/
  4 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    25.43ms   19.63ms 178.98ms   80.18%
    Req/Sec     6.03k   562.57     7.72k    69.25%
  241173 requests in 10.04s, 4.15GB read
Requests/sec:  24022.62
Transfer/sec:    423.26MB
---- RESULTS --------
Status 200 : 241173 times
---- END ------------

    Client: errors: 0, requests: 241173, rate: 24022
    Clients in total: errors: 0, requests: 241173, rate: 24022
    Stoping TempestaFW on 192.168.122.12
    Stoping Nginx on 192.168.122.12:8000
    Stoping Nginx on 192.168.122.12:8002
    Stoping Nginx on 192.168.122.12:8004
    Stoping Nginx on 192.168.122.12:8006
    Removing Nginx config for 192.168.122.12:8000
    Removing Nginx config for 192.168.122.12:8002
    Removing Nginx config for 192.168.122.12:8006
    Removing Nginx config for 192.168.122.12:8004
    Stoping Nginx on 192.168.122.12:8005
    Stoping Nginx on 192.168.122.12:8001
    Stoping Nginx on 192.168.122.12:8007
    Stoping Nginx on 192.168.122.12:8003
    Removing Nginx config for 192.168.122.12:8001
    Removing Nginx config for 192.168.122.12:8005
    Removing Nginx config for 192.168.122.12:8003
    Removing Nginx config for 192.168.122.12:8007
    Stoping Nginx on 192.168.122.12:8008
    Stoping Nginx on 192.168.122.12:8010
    Stoping Nginx on 192.168.122.12:8012
    Stoping Nginx on 192.168.122.12:8014
    Removing Nginx config for 192.168.122.12:8014
    Removing Nginx config for 192.168.122.12:8012
    Removing Nginx config for 192.168.122.12:8010
    Removing Nginx config for 192.168.122.12:8008
    Stoping Nginx on 192.168.122.12:8011
    Stoping Nginx on 192.168.122.12:8015
    Stoping Nginx on 192.168.122.12:8013
    Stoping Nginx on 192.168.122.12:8009
    Removing Nginx config for 192.168.122.12:8015
    Removing Nginx config for 192.168.122.12:8011
    Removing Nginx config for 192.168.122.12:8013
    Stoping Nginx on 192.168.122.12:8016
    Removing Nginx config for 192.168.122.12:8009
    Removing Nginx config for 192.168.122.12:8016
    Stoping Nginx on 192.168.122.12:8018
    Stoping Nginx on 192.168.122.12:8020
    Stoping Nginx on 192.168.122.12:8017
    Stoping Nginx on 192.168.122.12:8022
    Removing Nginx config for 192.168.122.12:8018
    Removing Nginx config for 192.168.122.12:8020
    Removing Nginx config for 192.168.122.12:8017
    Removing Nginx config for 192.168.122.12:8022
    Stoping Nginx on 192.168.122.12:8019
    Stoping Nginx on 192.168.122.12:8021
    Stoping Nginx on 192.168.122.12:8024
    Stoping Nginx on 192.168.122.12:8023
    Removing Nginx config for 192.168.122.12:8019
    Removing Nginx config for 192.168.122.12:8021
    Removing Nginx config for 192.168.122.12:8024
    Removing Nginx config for 192.168.122.12:8023
    Stoping Nginx on 192.168.122.12:8026
    Stoping Nginx on 192.168.122.12:8028
    Stoping Nginx on 192.168.122.12:8025
    Stoping Nginx on 192.168.122.12:8030
    Removing Nginx config for 192.168.122.12:8026
    Removing Nginx config for 192.168.122.12:8028
    Removing Nginx config for 192.168.122.12:8025
    Removing Nginx config for 192.168.122.12:8030
    Stoping Nginx on 192.168.122.12:8027
    Stoping Nginx on 192.168.122.12:8029
    Stoping Nginx on 192.168.122.12:8031
    Removing Nginx config for 192.168.122.12:8027
    Removing Nginx config for 192.168.122.12:8029
    Removing Nginx config for 192.168.122.12:8031
ok

----------------------------------------------------------------------
Ran 1 test in 22.256s

OK

Kernel log (a part of it since it was flooded and overflowed):

[154655.676795] ------------[ cut here ]------------
[154655.676797] WARNING: CPU: 0 PID: 3 at /home/user/kernel/linux-packages/net/ipv4/af_inet.c:155 inet_sock_destruct+0x1f1/0x210
[154655.676814] Modules linked in: tfw_sched_ratio(O) tfw_sched_http(O) tfw_sched_hash(O) tempesta_fw(O) tempesta_db(O) tempesta_tls(O) tcp_diag inet_diag xt_tcpudp iptable_filter crct10dif_pclmul crc32_pclmul iTCO_wdt iTCO_vendor_support snd_hda_codec_generic ppdev snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm ghash_clmulni_intel snd_timer sg snd soundcore virtio_balloon virtio_console binfmt_misc qxl ttm evdev drm_kms_helper serio_raw pcspkr lpc_ich mfd_core drm shpchp parport_pc parport button ip_tables x_tables autofs4 ext4 crc16 jbd2 crc32c_generic fscrypto ecb mbcache sr_mod cdrom crc32c_intel virtio_blk aesni_intel virtio_net aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd ahci libahci psmouse sym53c8xx libata scsi_transport_spi i2c_i801 i2c_smbus ehci_pci uhci_hcd ehci_hcd
[154655.676816]  usbcore usb_common scsi_mod virtio_pci virtio_ring virtio [last unloaded: tempesta_tls]
[154655.676817] CPU: 0 PID: 3 Comm: ksoftirqd/0 Tainted: G        W  O    4.9.0-tempesta-amd64 #1 Debian 4.9.35-tfw6-1
[154655.676817] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
[154655.676819]  0000000000000000 ffffffffb392f194 0000000000000000 0000000000000000
[154655.676820]  ffffffffb367adee ffff9da78b9c6900 ffff9da78b9c6a88 00000000000001cb
[154655.676821]  ffff9da7bfc19200 000000000001a780 ffffffffc0c785f8 ffffffffb3b84ab1
[154655.676822] Call Trace:
[154655.676824]  [<ffffffffb392f194>] ? dump_stack+0x5c/0x78
[154655.676825]  [<ffffffffb367adee>] ? __warn+0xbe/0xe0
[154655.676827]  [<ffffffffb3b84ab1>] ? inet_sock_destruct+0x1f1/0x210
[154655.676829]  [<ffffffffb3aee561>] ? __sk_destruct+0x21/0x190
[154655.676837]  [<ffffffffc083110c>] ? ss_tx_action+0x1dc/0x4b0 [tempesta_fw]
[154655.676839]  [<ffffffffb3b08d08>] ? process_backlog+0x88/0x130
[154655.676841]  [<ffffffffb3b0493b>] ? net_tx_action+0x9b/0x1d0
[154655.676843]  [<ffffffffb3c10dda>] ? __do_softirq+0x10a/0x29a
[154655.676845]  [<ffffffffb369dcd0>] ? sort_range+0x20/0x20
[154655.676846]  [<ffffffffb3680e15>] ? run_ksoftirqd+0x25/0x40
[154655.676848]  [<ffffffffb369ddce>] ? smpboot_thread_fn+0xfe/0x150
[154655.676850]  [<ffffffffb369a577>] ? kthread+0xd7/0xf0
[154655.676851]  [<ffffffffb369a4a0>] ? kthread_park+0x60/0x60
[154655.676853]  [<ffffffffb3c0e535>] ? ret_from_fork+0x25/0x30
[154655.676854] ---[ end trace 58b01fd3af7038e2 ]---
[154655.676856] ------------[ cut here ]------------
[154655.676858] WARNING: CPU: 0 PID: 3 at /home/user/kernel/linux-packages/net/ipv4/af_inet.c:154 inet_sock_destruct+0x1d2/0x210
[154655.676883] Modules linked in: tfw_sched_ratio(O) tfw_sched_http(O) tfw_sched_hash(O) tempesta_fw(O) tempesta_db(O) tempesta_tls(O) tcp_diag inet_diag xt_tcpudp iptable_filter crct10dif_pclmul crc32_pclmul iTCO_wdt iTCO_vendor_support snd_hda_codec_generic ppdev snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm ghash_clmulni_intel snd_timer sg snd soundcore virtio_balloon virtio_console binfmt_misc qxl ttm evdev drm_kms_helper serio_raw pcspkr lpc_ich mfd_core drm shpchp parport_pc parport button ip_tables x_tables autofs4 ext4 crc16 jbd2 crc32c_generic fscrypto ecb mbcache sr_mod cdrom crc32c_intel virtio_blk aesni_intel virtio_net aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd ahci libahci psmouse sym53c8xx libata scsi_transport_spi i2c_i801 i2c_smbus ehci_pci uhci_hcd ehci_hcd
[154655.676886]  usbcore usb_common scsi_mod virtio_pci virtio_ring virtio [last unloaded: tempesta_tls]
[154655.676888] CPU: 0 PID: 3 Comm: ksoftirqd/0 Tainted: G        W  O    4.9.0-tempesta-amd64 #1 Debian 4.9.35-tfw6-1
[154655.676888] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
[154655.676890]  0000000000000000 ffffffffb392f194 0000000000000000 0000000000000000
[154655.676892]  ffffffffb367adee ffff9da78b9c3480 ffff9da78b9c3608 00000000000001cb
[154655.676894]  ffff9da7bfc19200 000000000001a780 ffffffffc0c785f8 ffffffffb3b84a92
[154655.676895] Call Trace:
[154655.676897]  [<ffffffffb392f194>] ? dump_stack+0x5c/0x78
[154655.676898]  [<ffffffffb367adee>] ? __warn+0xbe/0xe0
[154655.676901]  [<ffffffffb3b84a92>] ? inet_sock_destruct+0x1d2/0x210
[154655.676902]  [<ffffffffb3aee561>] ? __sk_destruct+0x21/0x190
[154655.676911]  [<ffffffffc083110c>] ? ss_tx_action+0x1dc/0x4b0 [tempesta_fw]
[154655.676913]  [<ffffffffb3b08d08>] ? process_backlog+0x88/0x130
[154655.676914]  [<ffffffffb3b0493b>] ? net_tx_action+0x9b/0x1d0
[154655.676916]  [<ffffffffb3c10dda>] ? __do_softirq+0x10a/0x29a
[154655.676917]  [<ffffffffb369dcd0>] ? sort_range+0x20/0x20
[154655.676918]  [<ffffffffb3680e15>] ? run_ksoftirqd+0x25/0x40
[154655.676919]  [<ffffffffb369ddce>] ? smpboot_thread_fn+0xfe/0x150
[154655.676921]  [<ffffffffb369a577>] ? kthread+0xd7/0xf0
[154655.676923]  [<ffffffffb369a4a0>] ? kthread_park+0x60/0x60
[154655.676924]  [<ffffffffb3c0e535>] ? ret_from_fork+0x25/0x30
[154655.676924] ---[ end trace 58b01fd3af7038e3 ]---
[154655.068546] ------------[ cut here ]------------
[154655.070104] WARNING: CPU: 0 PID: 9268 at /home/user/kernel/linux-packages/net/core/stream.c:203 sk_stream_kill_queues+0x11f/0x160
[154655.073075] Modules linked in: tfw_sched_ratio(O) tfw_sched_http(O) tfw_sched_hash(O) tempesta_fw(O) tempesta_db(O) tempesta_tls(O) tcp_diag inet_diag xt_tcpudp iptable_filter crct10dif_pclmul crc32_pclmul iTCO_wdt iTCO_vendor_support snd_hda_codec_generic ppdev snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm ghash_clmulni_intel snd_timer sg snd soundcore virtio_balloon virtio_console binfmt_misc qxl ttm evdev drm_kms_helper serio_raw pcspkr lpc_ich mfd_core drm shpchp parport_pc parport button ip_tables x_tables autofs4 ext4 crc16 jbd2 crc32c_generic fscrypto ecb mbcache sr_mod cdrom crc32c_intel virtio_blk aesni_intel virtio_net aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd ahci libahci psmouse sym53c8xx libata scsi_transport_spi i2c_i801 i2c_smbus ehci_pci uhci_hcd ehci_hcd
[154655.092574]  usbcore usb_common scsi_mod virtio_pci virtio_ring virtio [last unloaded: tempesta_tls]
[154655.095211] CPU: 0 PID: 9268 Comm: systemd-journal Tainted: G        W  O    4.9.0-tempesta-amd64 #1 Debian 4.9.35-tfw6-1
[154655.098253] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
[154655.100952]  0000000000000000 ffffffffb392f194 0000000000000000 0000000000000000
[154655.103616]  ffffffffb367adee ffff9da79bcd6180 ffff9da79bcd6308 ffff9da798e47100
[154655.106208]  ffff9da79a717a62 0000000000000001 ffff9da79bcd6180 ffffffffb3afa72f
[154655.108750] Call Trace:
[154655.109289]  <IRQ> 
[154655.109913]  [<ffffffffb392f194>] ? dump_stack+0x5c/0x78
[154655.111118]  [<ffffffffb367adee>] ? __warn+0xbe/0xe0
[154655.112557]  [<ffffffffb3afa72f>] ? sk_stream_kill_queues+0x11f/0x160
[154655.113878]  [<ffffffffb3b5404f>] ? inet_csk_destroy_sock+0x4f/0x160
[154655.115816]  [<ffffffffc08309ee>] ? ss_linkerror+0xe/0x60 [tempesta_fw]
[154655.117370]  [<ffffffffb3b61ade>] ? tcp_validate_incoming+0x27e/0x370
[154655.118895]  [<ffffffffb3b62bab>] ? tcp_rcv_established+0x19b/0x6c0
[154655.120802]  [<ffffffffb3b20d4d>] ? sk_filter_trim_cap+0x2d/0x2c0
[154655.122821]  [<ffffffffb3b6d793>] ? tcp_v4_do_rcv+0x133/0x1f0
[154655.124723]  [<ffffffffb3b6ef37>] ? tcp_v4_rcv+0x8a7/0x9b0
[154655.126145]  [<ffffffffb3b484e9>] ? ip_local_deliver_finish+0x99/0x1c0
[154655.127948]  [<ffffffffb3b487ab>] ? ip_local_deliver+0x6b/0xf0
[154655.129392]  [<ffffffffb3b6e602>] ? tcp_v4_early_demux+0xb2/0x140
[154655.130242]  [<ffffffffb3b48450>] ? ip_rcv_finish+0x3f0/0x3f0
[154655.131337]  [<ffffffffb3b48ab6>] ? ip_rcv+0x286/0x3d0
[154655.132992]  [<ffffffffb3b48060>] ? inet_del_offload+0x40/0x40
[154655.134862]  [<ffffffffb3b07669>] ? __netif_receive_skb_core+0x4f9/0xa00
[154655.137026]  [<ffffffffb36ac400>] ? update_cfs_rq_load_avg+0x1a0/0x490
[154655.139044]  [<ffffffffc083c3ed>] ? tfw_wq_pop_ticket+0x7d/0xc90 [tempesta_fw]
[154655.140900]  [<ffffffffb3b08d08>] ? process_backlog+0x88/0x130
[154655.142180]  [<ffffffffb3b08420>] ? net_rx_action+0x240/0x370
[154655.143367]  [<ffffffffb3c10dda>] ? __do_softirq+0x10a/0x29a
[154655.144276]  [<ffffffffb3680f4e>] ? irq_exit+0xae/0xb0
[154655.145727]  [<ffffffffb3c104a2>] ? call_function_single_interrupt+0x82/0x90
[154655.148726]  <EOI> 
[154655.149475]  [<ffffffffb382a370>] ? mntput_no_expire+0x170/0x170
[154655.150562]  [<ffffffffb3812989>] ? terminate_walk+0x89/0xf0
[154655.151603]  [<ffffffffb3816162>] ? path_lookupat+0x112/0x120
[154655.152583]  [<ffffffffb3818a21>] ? filename_lookup+0xb1/0x180
[154655.153744]  [<ffffffffb380490a>] ? __check_object_size+0xfa/0x1e0
[154655.155617]  [<ffffffffb395d558>] ? strncpy_from_user+0x48/0x160
[154655.157126]  [<ffffffffb381865a>] ? getname_flags+0x6a/0x1e0
[154655.158719]  [<ffffffffb380643d>] ? SyS_access+0xad/0x220
[154655.160294]  [<ffffffffb3603b1c>] ? do_syscall_64+0x7c/0xf0
[154655.162050]  [<ffffffffb3c0e36f>] ? entry_SYSCALL64_slow_path+0x25/0x25
[154655.163296] ---[ end trace 58b01fd3af7038d7 ]---
[154655.164605] ------------[ cut here ]------------
[154655.165930] WARNING: CPU: 0 PID: 9268 at /home/user/kernel/linux-packages/net/core/stream.c:204 inet_csk_destroy_sock+0x4f/0x160
[154655.169386] Modules linked in: tfw_sched_ratio(O) tfw_sched_http(O) tfw_sched_hash(O) tempesta_fw(O) tempesta_db(O) tempesta_tls(O) tcp_diag inet_diag xt_tcpudp iptable_filter crct10dif_pclmul crc32_pclmul iTCO_wdt iTCO_vendor_support snd_hda_codec_generic ppdev snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm ghash_clmulni_intel snd_timer sg snd soundcore virtio_balloon virtio_console binfmt_misc qxl ttm evdev drm_kms_helper serio_raw pcspkr lpc_ich mfd_core drm shpchp parport_pc parport button ip_tables x_tables autofs4 ext4 crc16 jbd2 crc32c_generic fscrypto ecb mbcache sr_mod cdrom crc32c_intel virtio_blk aesni_intel virtio_net aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd ahci libahci psmouse sym53c8xx libata scsi_transport_spi i2c_i801 i2c_smbus ehci_pci uhci_hcd ehci_hcd
[154655.191281]  usbcore usb_common scsi_mod virtio_pci virtio_ring virtio [last unloaded: tempesta_tls]
[154655.193902] CPU: 0 PID: 9268 Comm: systemd-journal Tainted: G        W  O    4.9.0-tempesta-amd64 #1 Debian 4.9.35-tfw6-1
[154655.196550] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
[154655.199111]  0000000000000000 ffffffffb392f194 0000000000000000 0000000000000000
[154655.201252]  ffffffffb367adee ffff9da79bcd6180 0000000000000001 ffff9da798e47100
[154655.203454]  ffff9da79a717a62 0000000000000001 ffff9da79bcd6180 ffffffffb3b5404f
[154655.205529] Call Trace:
[154655.205989]  <IRQ> 
[154655.206474]  [<ffffffffb392f194>] ? dump_stack+0x5c/0x78
[154655.207830]  [<ffffffffb367adee>] ? __warn+0xbe/0xe0
[154655.208927]  [<ffffffffb3b5404f>] ? inet_csk_destroy_sock+0x4f/0x160
[154655.210258]  [<ffffffffc08309ee>] ? ss_linkerror+0xe/0x60 [tempesta_fw]
[154655.212398]  [<ffffffffb3b61ade>] ? tcp_validate_incoming+0x27e/0x370
[154655.214254]  [<ffffffffb3b62bab>] ? tcp_rcv_established+0x19b/0x6c0
[154655.216208]  [<ffffffffb3b20d4d>] ? sk_filter_trim_cap+0x2d/0x2c0
[154655.217747]  [<ffffffffb3b6d793>] ? tcp_v4_do_rcv+0x133/0x1f0
[154655.219076]  [<ffffffffb3b6ef37>] ? tcp_v4_rcv+0x8a7/0x9b0
[154655.220569]  [<ffffffffb3b484e9>] ? ip_local_deliver_finish+0x99/0x1c0
[154655.222480]  [<ffffffffb3b487ab>] ? ip_local_deliver+0x6b/0xf0
[154655.224354]  [<ffffffffb3b6e602>] ? tcp_v4_early_demux+0xb2/0x140
[154655.226479]  [<ffffffffb3b48450>] ? ip_rcv_finish+0x3f0/0x3f0
[154655.228320]  [<ffffffffb3b48ab6>] ? ip_rcv+0x286/0x3d0
[154655.229684]  [<ffffffffb3b48060>] ? inet_del_offload+0x40/0x40
[154655.231281]  [<ffffffffb3b07669>] ? __netif_receive_skb_core+0x4f9/0xa00
[154655.232497]  [<ffffffffb36ac400>] ? update_cfs_rq_load_avg+0x1a0/0x490
[154655.233843]  [<ffffffffc083c3ed>] ? tfw_wq_pop_ticket+0x7d/0xc90 [tempesta_fw]
[154655.235407]  [<ffffffffb3b08d08>] ? process_backlog+0x88/0x130
[154655.237355]  [<ffffffffb3b08420>] ? net_rx_action+0x240/0x370
[154655.239297]  [<ffffffffb3c10dda>] ? __do_softirq+0x10a/0x29a
[154655.241292]  [<ffffffffb3680f4e>] ? irq_exit+0xae/0xb0
[154655.242962]  [<ffffffffb3c104a2>] ? call_function_single_interrupt+0x82/0x90
[154655.245299]  <EOI> 
[154655.245844]  [<ffffffffb382a370>] ? mntput_no_expire+0x170/0x170
[154655.247903]  [<ffffffffb3812989>] ? terminate_walk+0x89/0xf0
[154655.249826]  [<ffffffffb3816162>] ? path_lookupat+0x112/0x120
[154655.251450]  [<ffffffffb3818a21>] ? filename_lookup+0xb1/0x180
[154655.253438]  [<ffffffffb380490a>] ? __check_object_size+0xfa/0x1e0
[154655.255357]  [<ffffffffb395d558>] ? strncpy_from_user+0x48/0x160
[154655.257411]  [<ffffffffb381865a>] ? getname_flags+0x6a/0x1e0
[154655.259077]  [<ffffffffb380643d>] ? SyS_access+0xad/0x220
[154655.260919]  [<ffffffffb3603b1c>] ? do_syscall_64+0x7c/0xf0
[154655.262764]  [<ffffffffb3c0e36f>] ? entry_SYSCALL64_slow_path+0x25/0x25
[154655.265101] ---[ end trace 58b01fd3af7038d8 ]---

Seems that same Oopses was seen in #692

krizhanovsky commented 6 years ago

Probably the same problem with non-updating socket memory after adjusting skb size as in https://github.com/tempesta-tech/linux-4.9.35-tfw/commit/60883ec7aeec1000af98b69e62ec173ff5c1b988

krizhanovsky commented 6 years ago

Just hit the bug on Tempesta FW proxying 5000-byte index.html sericed by Nginx running on the same VM with Tempesta FW:

# cat ~/tempesta/etc/tempesta_fw.conf
listen 192.168.100.4:80;
server 127.0.0.1:9090;
cache 0;

# ulimit -n 65536; ./wrk -c 4096 -t 8 -d 30 http://192.168.100.4:80/

[11329969575.738787] [tdb] Start Tempesta DB
[11329969575.759262] [tempesta] Initializing Tempesta FW kernel module...
[11329969575.768698] [tempesta] Registering new scheduler: hash 
[11329969575.776608] [tempesta] Registering new scheduler: http
[11329969575.807501] [tdb] Opened table /opt/tempesta/db/filter.tdb: size=16777216 rec_size=20 base=ffff8d0cba800000
[11329969589.422002] TCP: request_sock_TCP: Possible SYN flooding on port 80. Sending cookies.  Check SNMP counters.
[11329969616.403364] ------------[ cut here ]------------
[11329969616.407433] WARNING: CPU: 3 PID: 28 at net/core/stream.c:205 sk_stream_kill_queues+0x106/0x120
[11329969616.411813] Modules linked in: tfw_sched_ratio(O) tfw_sched_http(O) tfw_sched_hash(O) tempesta_fw(O) tempesta_db(O) tempesta_tls(O) tcp_diag inet_diag kmemcpy(O) binfmt_misc crct10dif_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 bochs_drm ttm drm_kms_helper drm fb_sys_fops syscopyarea crypto_simd glue_helper cryptd ppdev parport_pc sysfillrect sysimgblt parport serio_raw sg button pcspkr ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 fscrypto sr_mod cdrom sd_mod ata_generic ata_piix libata i2c_piix4 psmouse scsi_mod e1000 [last unloaded: tempesta_tls]
[11329969616.430415] CPU: 3 PID: 28 Comm: ksoftirqd/3 Tainted: G           O    4.14.32-kdump+ #1
[11329969616.433921] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
[11329969616.437326] task: ffff8d0c9acfb1c0 task.stack: ffffafce00418000 
[11329969616.441104] RIP: 0010:sk_stream_kill_queues+0x106/0x120
[11329969616.444787] RSP: 0018:ffffafce0041bb08 EFLAGS: 00010286
[11329969616.447491] RAX: 00000000ffffff00 RBX: ffff8d0c9a6f7000 RCX: 0000000000000020
[11329969616.449393] RDX: ffffffffffffffe0 RSI: 0000000000000100 RDI: ffff8d0c9a6f70b0
[11329969616.451050] RBP: ffff8d0c9a6f70b0 R08: 0000000000000000 R09: 0000000000000000
[11329969616.452854] R10: 0000000000000001 R11: 0000000000015bc6 R12: ffff8d0c9a6f7158
[11329969616.454647] R13: ffff8d0c9749fc62 R14: ffff8d0c9749fc62 R15: ffff8d0c9a6f7000
[11329969616.456506] FS:  0000000000000000(0000) GS:ffff8d0cbfd80000(0000) knlGS:0000000000000000
[11329969616.458783] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[11329969616.462618] CR2: 00005575e4992628 CR3: 000000007c80a003 CR4: 00000000003606e0
[11329969616.464943] Call Trace:
[11329969616.466191]  inet_csk_destroy_sock+0x50/0x100
[11329969616.467667]  ss_linkerror+0x9/0x60 [tempesta_fw] 
[11329969616.469242]  tcp_reset+0x129/0x170
[11329969616.470692]  tcp_validate_incoming+0x124/0x3f0
[11329969616.473524]  tcp_rcv_established+0x1ca/0x570
[11329969616.475004]  ? tcp_v4_inbound_md5_hash+0x5d/0x1b0
[11329969616.476537]  tcp_v4_do_rcv+0x124/0x1c0 
[11329969616.477840]  tcp_v4_rcv+0x93f/0xa40
[11329969616.479207]  ip_local_deliver_finish+0x95/0x1c0
[11329969616.480775]  ip_local_deliver+0x66/0xe0
[11329969616.482553]  ? tcp_v4_early_demux+0x10d/0x140
[11329969616.484080]  ? ip_rcv_finish+0x175/0x400 
[11329969616.485643]  ip_rcv+0x284/0x3b0
[11329969616.486950]  ? inet_del_offload+0x30/0x30
[11329969616.488636]  __netif_receive_skb_core+0x84a/0xb30
[11329969616.492160]  ? process_backlog+0x92/0x120
[11329969616.494935]  process_backlog+0x92/0x120
[11329969616.496660]  net_rx_action+0x261/0x3a0
[11329969616.498065]  __do_softirq+0x104/0x297

Looks very close to #926, it seems socket buffers weren't updated in all the places correctly.

krizhanovsky commented 6 years ago

Just got plenty of net/ipv4/af_inet.c:155 inet_sock_destruct warnings on tests from https://github.com/tempesta-tech/tempesta/issues/635#issuecomment-383745388 on hardware testbed, so the issues is kind of crucial.

krizhanovsky commented 5 years ago

The assertions fail on client sockets on Tempesta FW's side and seem caused by wrong data writings to the client sockets.

Able to reproduce the issue just with

wrk -d 10 -c 2 -t 1 http://192.168.100.4/index.html

, where index.html is ~19KB file. Some concurrency is required - I couldn't reproduce the issue for only one connection. Also I couldn't hit the problem for very small (~20B) files. HTTP headers adjustment seems doesn't affect the issue anyhow.

Only one backend connection is enough:

# cat etc/tempesta_fw.conf 
listen 192.168.100.4:80;

srv_group default {
    server 127.0.0.1:8080 conns_n=1;
}
vhost default {
    proxy_pass default;
}

cache 0;

http_chain {
    -> default;
}

However, bigger number of backend connections doesn't affect the issue. There are no connection resets from the Apache backend.

One more concurrency scenario to reproduce the issue is putting single connection to a backend directly and one more through Tempesta:

$ wrk -d 600 -c 1 -t 1 http://192.168.100.4:8080/index.html
$ wrk -d 5 -c 1 -t 1 http://192.168.100.4/index.html

Again, both the connections must request the big file, if only one of them requests the big file while the other one requests a small file, then the warnings don't appear.

tempesta-tech / tempesta

Warnings on dirty socket destruction during regression.test_stress_pipeline.Pipeline test #984