aparcar / openwrt

Staging tree of Paul Spooren
Other
9 stars 1 forks source link

FS#479 - Softethervpn results in ReAlloc error after vpnbridge is executed #501

Open aparcar opened 7 years ago

aparcar commented 7 years ago

ronniebrash:

On a mostly vanilla Atmel at91sam9g20-ek with 32MB of RAM, we are seeing a couple of errors using the current package software version + patches.

Overall, it may be the cross compilation process, overlay or something else entirely.

1. If the vpnbridge is already started, a NAND-WAIT stacktrace results if the binary is executed to instantiate a second instance.

root@lede:/usr/libexec/softethervpn# /usr/bin/env LANG=en_US.UTF-8 vpncmd [ 241.880000] [ 241.880000] ================================= [ 241.880000] [ INFO: inconsistent lock state ] [ 241.880000] 4.4.36 #0 Not tainted [ 241.880000] --------------------------------- [ 241.880000] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage. [ 241.880000] kswapd0/155 [HC0[0]:SC0[0]:HE1:SE1] takes: [ 241.880000] (&f->sem){+.+.?.}, at: [] jffs2_do_clear_inode+0x28/0x114 [ 241.880000] {RECLAIM_FS-ON-W} state was registered at: [ 241.880000] [] lockdep_trace_alloc+0xf8/0x118 [ 241.880000] [] kmem_cache_alloc+0x28/0x144 [ 241.880000] [] jffs2_alloc_inode_cache+0x1c/0x28 [ 241.880000] [] jffs2_do_read_inode+0x13c/0x20c [ 241.880000] [] jffs2_iget+0x74/0x324 [ 241.880000] [] jffs2_do_fill_super+0x190/0x274 [ 241.880000] [] jffs2_fill_super+0xf4/0x144 [ 241.880000] [] mount_mtd_aux+0x80/0x10c [ 241.880000] [] mount_mtd_nr+0x7c/0x94 [ 241.880000] [] mount_mtd+0x1b0/0x210 [ 241.880000] [] jffs2_mount+0x1c/0x28 [ 241.880000] [] mount_fs+0x1c/0xac [ 241.880000] [] vfs_kern_mount+0x5c/0x12c [ 241.880000] [] do_mount+0xa04/0xbc8 [ 241.880000] [] SyS_mount+0x7c/0xa8 [ 241.880000] [] ret_fast_syscall+0x0/0x1c [ 241.880000] irq event stamp: 45867 [ 241.880000] hardirqs last enabled at (45867): [] _raw_spin_unlock_irq+0x2c/0x6c [ 241.880000] hardirqs last disabled at (45866): [] _raw_spin_lock_irq+0x28/0x70 [ 241.880000] softirqs last enabled at (45450): [] do_softirq+0x208/0x2a4 [ 241.880000] softirqs last disabled at (45431): [] irq_exit+0xa4/0xec [ 241.880000] [ 241.880000] other info that might help us debug this: [ 241.880000] Possible unsafe locking scenario: [ 241.880000] [ 241.880000] CPU0 [ 241.880000] ---- [ 241.880000] lock(&f->sem); [ 241.880000] [ 241.880000] lock(&f->sem); [ 241.880000] [ 241.880000] DEADLOCK [ 241.880000] [ 241.880000] 2 locks held by kswapd0/155: [ 241.880000] #0: (shrinker_rwsem){++++..}, at: [] shrink_slab.part.1.constprop.7+0x30/0x230 [ 241.880000] #1: (&type->s_umount_key#32){.+.+..}, at: [] trylock_super+0x1c/0x60 [ 241.880000] [ 241.880000] stack backtrace: [ 241.880000] CPU: 0 PID: 155 Comm: kswapd0 Not tainted 4.4.36 #0 [ 241.880000] Hardware name: Atmel AT91SAM9 [ 241.880000] Backtrace: [ 241.880000] [] (dump_backtrace) from [] (show_stack+0x18/0x1c) [ 241.880000] r7:0000000a r6:c38af4c0 r5:c38af0c0 r4:c068f314 [ 241.880000] [] (show_stack) from [] (dump_stack+0x20/0x28) [ 241.880000] [] (dump_stack) from [] (print_usage_bug.part.11+0x228/0x290) [ 241.880000] [] (print_usage_bug.part.11) from [] (mark_lock+0x404/0x63c) [ 241.880000] r8:c004c558 r7:c38af0c0 r6:c38af4c0 r5:00000008 r4:0000000a [ 241.880000] [] (mark_lock) from [] (lock_acquire+0x97c/0x198c) [ 241.880000] r10:c38af0c0 r9:c0667044 r8:c38af4c0 r7:00000001 r6:00000002 r5:c3675704 [ 241.880000] r4:0000025d r3:00000004 [ 241.880000] [] (__lock_acquire) from [] (lock_acquire+0x7c/0x9c) [ 241.880000] r10:00000042 r9:c334c4c4 r8:c38af0c0 r7:c0ccdd8c r6:00000001 r5:60000013 [ 241.880000] r4:00000000 [ 241.880000] [] (lock_acquire) from [] (mutex_lock_nested+0x50/0x39c) [ 241.880000] r6:c334d000 r5:c334d000 r4:c36756d0 [ 241.880000] [] (mutex_lock_nested) from [] (jffs2_do_clear_inode+0x28/0x114) [ 241.880000] r10:00000042 r9:c334c4c4 r8:c334c000 r7:00000058 r6:c334d000 r5:c334d000 [ 241.880000] r4:c36756d0 [ 241.880000] [] (jffs2_do_clear_inode) from [] (jffs2_evict_inode+0x34/0x38) [ 241.880000] r7:00000058 r6:c044ba88 r5:c334d000 r4:c3675730 [ 241.880000] [] (jffs2_evict_inode) from [] (evict+0xac/0x164) [ 241.880000] r5:c367580c r4:c3675730 [ 241.880000] [] (evict) from [] (dispose_list+0x38/0x44) [ 241.880000] r7:00000058 r6:00000063 r5:00000006 r4:c392dd58 [ 241.880000] [] (dispose_list) from [] (prune_icache_sb+0x4c/0x5c) [ 241.880000] r5:00000006 r4:c392dd58 [ 241.880000] [] (prune_icache_sb) from [] (super_cache_scan+0x140/0x184) [ 241.880000] r5:00000000 r4:c392ddd8 [ 241.880000] [] (super_cache_scan) from [] (shrink_slab.part.1.constprop.7+0x1ac/0x230) [ 241.880000] r10:00000601 r9:00000000 r8:00000000 r7:00000400 r6:00000062 r5:c334c4c4 [ 241.880000] r4:000000a5 [ 241.880000] [] (shrink_slab.part.1.constprop.7) from [] (shrink_zone+0x5f0/0x640) [ 241.880000] r10:00000004 r9:00000117 r8:c04da918 r7:00000000 r6:00000000 r5:c392ded4 [ 241.880000] r4:c04da6fc [ 241.880000] [] (shrink_zone) from [] (kswapd+0x4ac/0x668) [ 241.880000] r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c04da6fc [ 241.880000] r4:c04da6fc [ 241.880000] [] (kswapd) from [] (kthread+0xe4/0xfc) [ 241.880000] r10:00000000 r9:00000000 r8:00000000 r7:c00927e0 r6:c04da6fc r5:00000000 [ 241.880000] r4:c3950780 [ 241.880000] [] (kthread) from [] (ret_from_fork+0x14/0x28) [ 241.880000] r7:00000000 r6:00000000 r5:c003d0d0 r4:c3950780

  1. If the vpnbridge is instantiated only once, and the vpn client is to be executed, a realloc error occurs:

/usr/bin/env LANG=en_US.UTF-8 ./vpnclient Fatal Error: InternalReAlloc: error: realloc() failed.

Aborted

This is on 4.4.36 LEDE trunk with a FS that looks like:

Filesystem Size Used Available Use% Mounted on /dev/root 7.4M 7.4M 0 100% /rom tmpfs 25.3M 2.7M 22.6M 11% /tmp tmpfs 25.3M 44.0K 25.2M 0% /tmp/root tmpfs 512.0K 0 512.0K 0% /dev /dev/mtdblock2 50.0M 2.2M 47.8M 4% /overlay overlayfs:/overlay 50.0M 2.2M 47.8M 4% /

Any assistance would be helpful.

aparcar commented 7 years ago

jow-:

It looks to me as if you're running out of ram.

aparcar commented 7 years ago

ronniebrash:

Agreed, however, top says there is 21MB free before executing the vpnbridge, and 18MB before the vpnclient.

aparcar commented 7 years ago

jow-:

And during the execution? Maybe there's also endian issues in softethervpn, causing it to allocate huge amounts of memory?

aparcar commented 7 years ago

ronniebrash:

I will investigate the execution, but you are correct - the memory consumption is what is causing this issue. I have removed portions of the applications such that I only need the vpncmd and vpnbridge applications; the device doesn't crash.

I also had to play with the sysctrl var for min_free_kbyres (lowered it to 2048, and have been experimenting with 1024) and this also is increasing system stability.

I may have to work with the gentlemen who provided the package softether to LEDE/Openwrt to help make it more modular to reduce the run-time size. He was only testing on MIPs.