Open aparcar opened 7 years ago
jow-:
It looks to me as if you're running out of ram.
ronniebrash:
Agreed, however, top says there is 21MB free before executing the vpnbridge, and 18MB before the vpnclient.
jow-:
And during the execution? Maybe there's also endian issues in softethervpn, causing it to allocate huge amounts of memory?
ronniebrash:
I will investigate the execution, but you are correct - the memory consumption is what is causing this issue. I have removed portions of the applications such that I only need the vpncmd and vpnbridge applications; the device doesn't crash.
I also had to play with the sysctrl var for min_free_kbyres (lowered it to 2048, and have been experimenting with 1024) and this also is increasing system stability.
I may have to work with the gentlemen who provided the package softether to LEDE/Openwrt to help make it more modular to reduce the run-time size. He was only testing on MIPs.
ronniebrash:
On a mostly vanilla Atmel at91sam9g20-ek with 32MB of RAM, we are seeing a couple of errors using the current package software version + patches.
Overall, it may be the cross compilation process, overlay or something else entirely.
1. If the vpnbridge is already started, a NAND-WAIT stacktrace results if the binary is executed to instantiate a second instance.
root@lede:/usr/libexec/softethervpn# /usr/bin/env LANG=en_US.UTF-8 vpncmd [ 241.880000] [ 241.880000] ================================= [ 241.880000] [ INFO: inconsistent lock state ] [ 241.880000] 4.4.36 #0 Not tainted [ 241.880000] --------------------------------- [ 241.880000] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage. [ 241.880000] kswapd0/155 [HC0[0]:SC0[0]:HE1:SE1] takes: [ 241.880000] (&f->sem){+.+.?.}, at: [] jffs2_do_clear_inode+0x28/0x114
[ 241.880000] {RECLAIM_FS-ON-W} state was registered at:
[ 241.880000] [] lockdep_trace_alloc+0xf8/0x118
[ 241.880000] [] kmem_cache_alloc+0x28/0x144
[ 241.880000] [] jffs2_alloc_inode_cache+0x1c/0x28
[ 241.880000] [] jffs2_do_read_inode+0x13c/0x20c
[ 241.880000] [] jffs2_iget+0x74/0x324
[ 241.880000] [] jffs2_do_fill_super+0x190/0x274
[ 241.880000] [] jffs2_fill_super+0xf4/0x144
[ 241.880000] [] mount_mtd_aux+0x80/0x10c
[ 241.880000] [] mount_mtd_nr+0x7c/0x94
[ 241.880000] [] mount_mtd+0x1b0/0x210
[ 241.880000] [] jffs2_mount+0x1c/0x28
[ 241.880000] [] mount_fs+0x1c/0xac
[ 241.880000] [] vfs_kern_mount+0x5c/0x12c
[ 241.880000] [] do_mount+0xa04/0xbc8
[ 241.880000] [] SyS_mount+0x7c/0xa8
[ 241.880000] [] ret_fast_syscall+0x0/0x1c
[ 241.880000] irq event stamp: 45867
[ 241.880000] hardirqs last enabled at (45867): [] _raw_spin_unlock_irq+0x2c/0x6c
[ 241.880000] hardirqs last disabled at (45866): [] _raw_spin_lock_irq+0x28/0x70
[ 241.880000] softirqs last enabled at (45450): [] do_softirq+0x208/0x2a4
[ 241.880000] softirqs last disabled at (45431): [] irq_exit+0xa4/0xec
[ 241.880000]
[ 241.880000] other info that might help us debug this:
[ 241.880000] Possible unsafe locking scenario:
[ 241.880000]
[ 241.880000] CPU0
[ 241.880000] ----
[ 241.880000] lock(&f->sem);
[ 241.880000]
[ 241.880000] lock(&f->sem);
[ 241.880000]
[ 241.880000] DEADLOCK
[ 241.880000]
[ 241.880000] 2 locks held by kswapd0/155:
[ 241.880000] #0: (shrinker_rwsem){++++..}, at: [] shrink_slab.part.1.constprop.7+0x30/0x230
[ 241.880000] #1: (&type->s_umount_key#32){.+.+..}, at: [] trylock_super+0x1c/0x60
[ 241.880000]
[ 241.880000] stack backtrace:
[ 241.880000] CPU: 0 PID: 155 Comm: kswapd0 Not tainted 4.4.36 #0
[ 241.880000] Hardware name: Atmel AT91SAM9
[ 241.880000] Backtrace:
[ 241.880000] [] (dump_backtrace) from [] (show_stack+0x18/0x1c)
[ 241.880000] r7:0000000a r6:c38af4c0 r5:c38af0c0 r4:c068f314
[ 241.880000] [] (show_stack) from [] (dump_stack+0x20/0x28)
[ 241.880000] [] (dump_stack) from [] (print_usage_bug.part.11+0x228/0x290)
[ 241.880000] [] (print_usage_bug.part.11) from [] (mark_lock+0x404/0x63c)
[ 241.880000] r8:c004c558 r7:c38af0c0 r6:c38af4c0 r5:00000008 r4:0000000a
[ 241.880000] [] (mark_lock) from [] ( lock_acquire+0x97c/0x198c)
[ 241.880000] r10:c38af0c0 r9:c0667044 r8:c38af4c0 r7:00000001 r6:00000002 r5:c3675704
[ 241.880000] r4:0000025d r3:00000004
[ 241.880000] [] (__lock_acquire) from [] (lock_acquire+0x7c/0x9c)
[ 241.880000] r10:00000042 r9:c334c4c4 r8:c38af0c0 r7:c0ccdd8c r6:00000001 r5:60000013
[ 241.880000] r4:00000000
[ 241.880000] [] (lock_acquire) from [] (mutex_lock_nested+0x50/0x39c)
[ 241.880000] r6:c334d000 r5:c334d000 r4:c36756d0
[ 241.880000] [] (mutex_lock_nested) from [] (jffs2_do_clear_inode+0x28/0x114)
[ 241.880000] r10:00000042 r9:c334c4c4 r8:c334c000 r7:00000058 r6:c334d000 r5:c334d000
[ 241.880000] r4:c36756d0
[ 241.880000] [] (jffs2_do_clear_inode) from [] (jffs2_evict_inode+0x34/0x38)
[ 241.880000] r7:00000058 r6:c044ba88 r5:c334d000 r4:c3675730
[ 241.880000] [] (jffs2_evict_inode) from [] (evict+0xac/0x164)
[ 241.880000] r5:c367580c r4:c3675730
[ 241.880000] [] (evict) from [] (dispose_list+0x38/0x44)
[ 241.880000] r7:00000058 r6:00000063 r5:00000006 r4:c392dd58
[ 241.880000] [] (dispose_list) from [] (prune_icache_sb+0x4c/0x5c)
[ 241.880000] r5:00000006 r4:c392dd58
[ 241.880000] [] (prune_icache_sb) from [] (super_cache_scan+0x140/0x184)
[ 241.880000] r5:00000000 r4:c392ddd8
[ 241.880000] [] (super_cache_scan) from [] (shrink_slab.part.1.constprop.7+0x1ac/0x230)
[ 241.880000] r10:00000601 r9:00000000 r8:00000000 r7:00000400 r6:00000062 r5:c334c4c4
[ 241.880000] r4:000000a5
[ 241.880000] [] (shrink_slab.part.1.constprop.7) from [] (shrink_zone+0x5f0/0x640)
[ 241.880000] r10:00000004 r9:00000117 r8:c04da918 r7:00000000 r6:00000000 r5:c392ded4
[ 241.880000] r4:c04da6fc
[ 241.880000] [] (shrink_zone) from [] (kswapd+0x4ac/0x668)
[ 241.880000] r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c04da6fc
[ 241.880000] r4:c04da6fc
[ 241.880000] [] (kswapd) from [] (kthread+0xe4/0xfc)
[ 241.880000] r10:00000000 r9:00000000 r8:00000000 r7:c00927e0 r6:c04da6fc r5:00000000
[ 241.880000] r4:c3950780
[ 241.880000] [] (kthread) from [] (ret_from_fork+0x14/0x28)
[ 241.880000] r7:00000000 r6:00000000 r5:c003d0d0 r4:c3950780
/usr/bin/env LANG=en_US.UTF-8 ./vpnclient Fatal Error: InternalReAlloc: error: realloc() failed.
Aborted
This is on 4.4.36 LEDE trunk with a FS that looks like:
Filesystem Size Used Available Use% Mounted on /dev/root 7.4M 7.4M 0 100% /rom tmpfs 25.3M 2.7M 22.6M 11% /tmp tmpfs 25.3M 44.0K 25.2M 0% /tmp/root tmpfs 512.0K 0 512.0K 0% /dev /dev/mtdblock2 50.0M 2.2M 47.8M 4% /overlay overlayfs:/overlay 50.0M 2.2M 47.8M 4% /
Any assistance would be helpful.