platinasystems / go

"invalid memory address or nil pointer dereference" after 'goes start' #33

Closed donnlee closed 5 years ago

donnlee commented 7 years ago

While attempting to repro #30 , after 'goes start' vnet never responded to 'sho ip fib' and I saw this in syslog:

Mar 23 10:54:17 invader16 goes.vnetd[17471]: runtime error: invalid memory address or nil pointer dereference: goroutine 36 [running]:
Mar 23 10:54:17 invader16 goes.vnetd[17471]: runtime/debug.Stack(0xc420ca5a70, 0xa7c740, 0xc42000c060)
Mar 23 10:54:17 invader16 goes.vnetd[17471]:         /usr/local/go/src/runtime/debug/stack.go:24 +0x79
Mar 23 10:54:17 invader16 goes.vnetd[17471]: github.com/platinasystems/go/elib/loop.(*Loop).doEvent.func1(0xc420186000)
Mar 23 10:54:17 invader16 goes.vnetd[17471]:         /home/donn/workspace/go/src/github.com/platinasystems/go/elib/loop/event.go:128 +0x91
Mar 23 10:54:17 invader16 goes.vnetd[17471]: panic(0xa7c740, 0xc42000c060)
Mar 23 10:54:17 invader16 goes.vnetd[17471]:         /usr/local/go/src/runtime/panic.go:458 +0x243
Mar 23 10:54:17 invader16 goes.vnetd[17471]: github.com/platinasystems/go/vnet/ip.(*Main).AddDelNextHop(0xc42007c168, 0x100000017, 0x100000005, 0xc420af1c00)
Mar 23 10:54:17 invader16 goes.vnetd[17471]:         /home/donn/workspace/go/src/github.com/platinasystems/go/vnet/ip/adjacency.go:459 +0x4c8
Mar 23 10:54:17 invader16 goes.vnetd[17471]: github.com/platinasystems/go/vnet/ip4.(*Main).AddDelRouteNextHop(0xc42007c100, 0xc4212c6528, 0xc4212c6ef0, 0xc4208b9401, 0x50e02000a, 0x1)
Mar 23 10:54:17 invader16 goes.vnetd[17471]:         /home/donn/workspace/go/src/github.com/platinasystems/go/vnet/ip4/fib.go:429 +0x1d8
Mar 23 10:54:17 invader16 goes.vnetd[17471]: github.com/platinasystems/go/vnet/unix.(*Main).ip4RouteMsg(0xc4200d7200, 0xc42127e820, 0xc420ca5e01, 0x42f7ce, 0xbece50)
Mar 23 10:54:17 invader16 goes.vnetd[17471]:         /home/donn/workspace/go/src/github.com/platinasystems/go/vnet/unix/netlink.go:489 +0x1f2
Mar 23 10:54:17 invader16 goes.vnetd[17471]: github.com/platinasystems/go/vnet/unix.(*netlinkEvent).EventAction(0xc421153530)
Mar 23 10:54:17 invader16 goes.vnetd[17471]:         /home/donn/workspace/go/src/github.com/platinasystems/go/vnet/unix/netlink.go:301 +0x466
Mar 23 10:54:17 invader16 goes.vnetd[17471]: github.com/platinasystems/go/elib/loop.(*loopEvent).do(0xc421153560)
Mar 23 10:54:17 invader16 goes.vnetd[17471]:         /home/donn/workspace/go/src/github.com/platinasystems/go/elib/loop/event.go:118 +0x34
Mar 23 10:54:17 invader16 goes.vnetd[17471]: github.com/platinasystems/go/elib/loop.(*Loop).doEvent(0xc420186000, 0xc421153560)
Mar 23 10:54:17 invader16 goes.vnetd[17471]:         /home/donn/workspace/go/src/github.com/platinasystems/go/elib/loop/event.go:133 +0x51
Mar 23 10:54:17 invader16 goes.vnetd[17471]: github.com/platinasystems/go/elib/loop.(*Loop).eventHandler(0xc420186000, 0x128dee0, 0xc420186548)
Mar 23 10:54:17 invader16 goes.vnetd[17471]:         /home/donn/workspace/go/src/github.com/platinasystems/go/elib/loop/event.go:140 +0x86
Mar 23 10:54:17 invader16 goes.vnetd[17471]: created by github.com/platinasystems/go/elib/loop.(*Loop).startHandler
Mar 23 10:54:17 invader16 goes.vnetd[17471]:         /home/donn/workspace/go/src/github.com/platinasystems/go/elib/loop/event.go:153 +0x124
root@invader16:~# /usr/bin/goes version
7d6b2baf82c482a043ce4d875b91fefbbf344c79
commit 7d6b2baf82c482a043ce4d875b91fefbbf344c79
Author: Jason Pang <jason@platinasystems.com>
Date:   Wed Mar 22 21:14:58 2017 -0700

    Adding non-volatile power logging functions to BMC
robert-coulson commented 7 years ago

Good afternoon Donn,

Yes, it looks like 'ma' is nil at this point:

    ma, mai := m.mpAdjForAdj(oldAdj, false)
    if ma.normalizedNextHops.size > 0

where m.mpAdjForAdj() contains:

    maIndex = uint(m.adjacencyHeap.Id(uint(a)))
    mm := &m.multipathMain
    if validate {
        mm.mpAdjVec.Validate(maIndex)
    }
    if maIndex < m.multipathMain.mpAdjVec.Len() {
        ma = &mm.mpAdjVec[maIndex]
    }

mpAdjForAdj() only assigns ma when maIndex is within mpAdjVec, so ma can be nil on return. In that case the ma.normalizedNextHops.size check dereferences a nil pointer, and the error above can occur.
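A minimal sketch of the kind of guard that would avoid this panic, assuming simplified stand-in types. Only the names mpAdjVec and normalizedNextHops.size come from the snippet quoted above; everything else (nextHopVec, multipathAdjacency, mpAdjForIndex, hasNextHops) is hypothetical and is not the actual vnet/ip code.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the vnet/ip types.
type nextHopVec struct{ size uint }

type multipathAdjacency struct{ normalizedNextHops nextHopVec }

type multipathMain struct{ mpAdjVec []multipathAdjacency }

// mpAdjForIndex mirrors the bounds check quoted above: it returns nil
// when maIndex is out of range for mpAdjVec.
func (mm *multipathMain) mpAdjForIndex(maIndex uint) *multipathAdjacency {
	if maIndex < uint(len(mm.mpAdjVec)) {
		return &mm.mpAdjVec[maIndex]
	}
	return nil
}

// hasNextHops checks for nil before dereferencing, so an out-of-range
// index yields false instead of a nil pointer panic.
func hasNextHops(mm *multipathMain, maIndex uint) bool {
	ma := mm.mpAdjForIndex(maIndex)
	return ma != nil && ma.normalizedNextHops.size > 0
}

func main() {
	mm := &multipathMain{mpAdjVec: []multipathAdjacency{
		{normalizedNextHops: nextHopVec{size: 2}},
	}}
	fmt.Println(hasNextHops(mm, 0)) // index in range
	fmt.Println(hasNextHops(mm, 5)) // index out of range, no panic
}
```

Whether returning false is the right behavior for a missing multipath adjacency depends on why the index is out of range in the first place; the guard only turns the crash into a handled case.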

Please file a bug.

thanks,

*** Rob.

donnlee commented 7 years ago

Just encountered this again at goes start:


Mar 28 12:47:52 invader16 goes.vnetd[28216]: runtime error: invalid memory address or nil pointer dereference: goroutine 36 [running]:
Mar 28 12:47:52 invader16 goes.vnetd[28216]: runtime/debug.Stack(0xc420ca9a70, 0xa4a100, 0xc42000c060)
Mar 28 12:47:52 invader16 goes.vnetd[28216]:         /usr/local/go/src/runtime/debug/stack.go:24 +0x79
Mar 28 12:47:52 invader16 goes.vnetd[28216]: github.com/platinasystems/go/elib/loop.(*Loop).doEvent.func1(0xc420194000)
Mar 28 12:47:52 invader16 goes.vnetd[28216]:         /home/fyang/gopath/src/github.com/platinasystems/go/elib/loop/event.go:128 +0x91
Mar 28 12:47:52 invader16 goes.vnetd[28216]: panic(0xa4a100, 0xc42000c060)
Mar 28 12:47:52 invader16 goes.vnetd[28216]:         /usr/local/go/src/runtime/panic.go:458 +0x243
Mar 28 12:47:52 invader16 goes.vnetd[28216]: github.com/platinasystems/go/vnet/ip.(*Main).AddDelNextHop(0xc420190768, 0x100000015, 0x100000003, 0xc420765750)
Mar 28 12:47:52 invader16 goes.vnetd[28216]:         /home/fyang/gopath/src/github.com/platinasystems/go/vnet/ip/adjacency.go:456 +0x4c8
Mar 28 12:47:52 invader16 goes.vnetd[28216]: github.com/platinasystems/go/vnet/ip4.(*Main).AddDelRouteNextHop(0xc420190700, 0xc4212c0000, 0xc4212c0020, 0xc42045a801, 0x50e02000a, 0xc400000001)
Mar 28 12:47:52 invader16 goes.vnetd[28216]:         /home/fyang/gopath/src/github.com/platinasystems/go/vnet/ip4/fib.go:429 +0x1d8
Mar 28 12:47:52 invader16 goes.vnetd[28216]: github.com/platinasystems/go/vnet/unix.(*Main).ip4RouteMsg(0xc420172d80, 0xc4213c0000, 0xc420ca9e00, 0x42efde, 0xbb03e0)
Mar 28 12:47:52 invader16 goes.vnetd[28216]:         /home/fyang/gopath/src/github.com/platinasystems/go/vnet/unix/netlink.go:404 +0x1f2
Mar 28 12:47:52 invader16 goes.vnetd[28216]: github.com/platinasystems/go/vnet/unix.(*netlinkEvent).EventAction(0xc42123afc0)
Mar 28 12:47:52 invader16 goes.vnetd[28216]:         /home/fyang/gopath/src/github.com/platinasystems/go/vnet/unix/netlink.go:232 +0x466
Mar 28 12:47:52 invader16 goes.vnetd[28216]: github.com/platinasystems/go/elib/loop.(*loopEvent).do(0xc42127bd70)
Mar 28 12:47:52 invader16 goes.vnetd[28216]:         /home/fyang/gopath/src/github.com/platinasystems/go/elib/loop/event.go:118 +0x34
Mar 28 12:47:52 invader16 goes.vnetd[28216]: github.com/platinasystems/go/elib/loop.(*Loop).doEvent(0xc420194000, 0xc42127bd70)
Mar 28 12:47:52 invader16 goes.vnetd[28216]:         /home/fyang/gopath/src/github.com/platinasystems/go/elib/loop/event.go:133 +0x51
Mar 28 12:47:52 invader16 goes.vnetd[28216]: github.com/platinasystems/go/elib/loop.(*Loop).eventHandler(0xc420194000, 0x1238060, 0xc420194548)
Mar 28 12:47:52 invader16 goes.vnetd[28216]:         /home/fyang/gopath/src/github.com/platinasystems/go/elib/loop/event.go:140 +0x86
Mar 28 12:47:52 invader16 goes.vnetd[28216]: created by github.com/platinasystems/go/elib/loop.(*Loop).startHandler
Mar 28 12:47:52 invader16 goes.vnetd[28216]:         /home/fyang/gopath/src/github.com/platinasystems/go/elib/loop/event.go:153 +0x124
donnlee commented 7 years ago

We think this is part of "restart bugs"

donnlee commented 7 years ago

Marking low-pri. Just keeping a lookout for it in the future.

donnlee commented 7 years ago

I got this again after 'goes-platina-mk1 install':

Steps:

Grab today's build (5:30am) from Jenkins build machine.

donn@invader16:~$ sudo ./goes-platina-mk1.0524 install
[sudo] password for donn:
SIOCADDRT: File exists
donn@invader16:~$ ps -e f
...
 3881 ?        Ss     0:00 /lib/systemd/systemd --user
 3884 ?        S      0:00  \_ (sd-pam)
 4518 ?        Ssl    0:00 goes-daemons
 4524 ?        Sl     0:00  \_ redisd
 4537 ?        Sl     0:00  \_ qsfp
 4543 ?        Sl     0:00  \_ uptimed
 4546 ?        Sl     0:00  \_ i2cd
donn@invader16:~$ sudo less /var/log/syslog

May 24 16:19:46 invader16 goes.vnetd[4539]: runtime error: invalid memory address or nil pointer dereference: goroutine 68 [running]:
May 24 16:19:46 invader16 goes.vnetd[4539]: runtime/debug.Stack(0xc420ad1a78, 0xadb100, 0x1202430)
May 24 16:19:46 invader16 goes.vnetd[4539]:         /usr/local/go/src/runtime/debug/stack.go:24 +0x79
May 24 16:19:46 invader16 goes.vnetd[4539]: github.com/platinasystems/go/elib/loop.(*Loop).doEvent.func1(0xc42019a000)
May 24 16:19:46 invader16 goes.vnetd[4539]:         /home/jenkins/workspace/go/src/github.com/platinasystems/go/elib/loop/event.go:128 +0x72
May 24 16:19:46 invader16 goes.vnetd[4539]: panic(0xadb100, 0x1202430)
May 24 16:19:46 invader16 goes.vnetd[4539]:         /usr/local/go/src/runtime/panic.go:489 +0x2cf
May 24 16:19:46 invader16 goes.vnetd[4539]: github.com/platinasystems/go/vnet/ip.(*Main).AddDelNextHop(0xc4200df668, 0x100000083, 0x100000072, 0xc4200d7390)
May 24 16:19:46 invader16 goes.vnetd[4539]:         /home/jenkins/workspace/go/src/github.com/platinasystems/go/vnet/ip/adjacency.go:474 +0x4c2
May 24 16:19:46 invader16 goes.vnetd[4539]: github.com/platinasystems/go/vnet/ip4.(*Main).AddDelRouteNextHop(0xc4200df600, 0xc420f722d8, 0xc420f722e0, 0xe020001, 0xc400000001, 0xed)
May 24 16:19:46 invader16 goes.vnetd[4539]:         /home/jenkins/workspace/go/src/github.com/platinasystems/go/vnet/ip4/fib.go:429 +0x1bb
May 24 16:19:46 invader16 goes.vnetd[4539]: github.com/platinasystems/go/vnet/unix.(*Main).ip4RouteMsg(0xc4201d4500, 0xc4208a9520, 0xc420ad1e01, 0x1, 0x1)
May 24 16:19:46 invader16 goes.vnetd[4539]:         /home/jenkins/workspace/go/src/github.com/platinasystems/go/vnet/unix/netlink.go:550 +0x233
May 24 16:19:46 invader16 goes.vnetd[4539]: github.com/platinasystems/go/vnet/unix.(*netlinkEvent).EventAction(0xc420848420)
May 24 16:19:46 invader16 goes.vnetd[4539]:         /home/jenkins/workspace/go/src/github.com/platinasystems/go/vnet/unix/netlink.go:353 +0x5a4
May 24 16:19:46 invader16 goes.vnetd[4539]: github.com/platinasystems/go/elib/loop.(*loopEvent).do(0xc420a77cb0)
May 24 16:19:46 invader16 goes.vnetd[4539]:         /home/jenkins/workspace/go/src/github.com/platinasystems/go/elib/loop/event.go:118 +0x34
May 24 16:19:46 invader16 goes.vnetd[4539]: github.com/platinasystems/go/elib/loop.(*Loop).doEvent(0xc42019a000, 0xc420a77cb0)
May 24 16:19:46 invader16 goes.vnetd[4539]:         /home/jenkins/workspace/go/src/github.com/platinasystems/go/elib/loop/event.go:133 +0x51
May 24 16:19:46 invader16 goes.vnetd[4539]: github.com/platinasystems/go/elib/loop.(*Loop).eventHandler(0xc42019a000, 0x131f960, 0xc42019a548)
May 24 16:19:46 invader16 goes.vnetd[4539]:         /home/jenkins/workspace/go/src/github.com/platinasystems/go/elib/loop/event.go:140 +0x86
May 24 16:19:46 invader16 goes.vnetd[4539]: created by github.com/platinasystems/go/elib/loop.(*Loop).startHandler
May 24 16:19:46 invader16 goes.vnetd[4539]:         /home/jenkins/workspace/go/src/github.com/platinasystems/go/elib/loop/event.go:153 +0x124
May 24 16:19:46 invader16 goes.vnetd[4539]: done
donnlee commented 7 years ago

I was able to work around the panic by NOP'ing my /etc/goes/start:

donn@invader16:~$ sudo mv /etc/goes/start /etc/goes/start.orig

donn@invader16:~$ sudo goes restart

my /etc/goes/start:


donn@invader16:~$ cat /etc/goes/start.orig
#!/usr/bin/goes

hset -q platina vnet.eth-1-1.media copper
hset -q platina vnet.eth-1-1.speed auto

hset -q platina vnet.eth-2-1.media copper
hset -q platina vnet.eth-2-1.speed auto

hset -q platina vnet.eth-3-1.media copper
hset -q platina vnet.eth-3-1.speed auto

hset -q platina vnet.eth-4-1.media copper
hset -q platina vnet.eth-4-1.speed auto

hset -q platina vnet.eth-5-1.media copper
hset -q platina vnet.eth-5-1.speed auto

hset -q platina vnet.eth-6-1.media copper
hset -q platina vnet.eth-6-1.speed auto

hset -q platina vnet.eth-7-1.media copper
hset -q platina vnet.eth-7-1.speed auto

hset -q platina vnet.eth-8-1.media copper
hset -q platina vnet.eth-8-1.speed auto

hset -q platina vnet.eth-9-1.media copper
hset -q platina vnet.eth-9-1.speed auto

hset -q platina vnet.eth-10-1.media copper
hset -q platina vnet.eth-10-1.speed auto

hset -q platina vnet.eth-11-1.media copper
hset -q platina vnet.eth-11-1.speed auto

hset -q platina vnet.eth-12-1.media copper
hset -q platina vnet.eth-12-1.speed auto

hset -q platina vnet.eth-13-1.media copper
hset -q platina vnet.eth-13-1.speed auto

hset -q platina vnet.eth-14-1.media copper
hset -q platina vnet.eth-14-1.speed auto

hset -q platina vnet.eth-15-1.media copper
hset -q platina vnet.eth-15-1.speed auto

hset -q platina vnet.eth-16-1.media copper
hset -q platina vnet.eth-16-1.speed auto

hset -q platina vnet.eth-17-1.media copper
hset -q platina vnet.eth-17-1.speed 100g

hset -q platina vnet.eth-18-1.media copper
hset -q platina vnet.eth-18-1.speed 100g

hset -q platina vnet.eth-19-1.media copper
hset -q platina vnet.eth-19-1.speed 100g

hset -q platina vnet.eth-20-1.media copper
hset -q platina vnet.eth-20-1.speed 100g

hset -q platina vnet.eth-21-1.media copper
hset -q platina vnet.eth-21-1.speed 100g

hset -q platina vnet.eth-22-1.media copper
hset -q platina vnet.eth-22-1.speed 100g

hset -q platina vnet.eth-23-1.media copper
hset -q platina vnet.eth-23-1.speed 100g

hset -q platina vnet.eth-24-1.media copper
hset -q platina vnet.eth-24-1.speed 100g

hset -q platina vnet.eth-25-1.media copper
hset -q platina vnet.eth-25-1.speed 100g

hset -q platina vnet.eth-26-1.media copper
hset -q platina vnet.eth-26-1.speed 100g

hset -q platina vnet.eth-27-1.media copper
hset -q platina vnet.eth-27-1.speed 100g

hset -q platina vnet.eth-28-1.media copper
hset -q platina vnet.eth-28-1.speed 100g

hset -q platina vnet.eth-29-1.media copper
hset -q platina vnet.eth-29-1.speed 100g

hset -q platina vnet.eth-30-1.media copper
hset -q platina vnet.eth-30-1.speed 100g

hset -q platina vnet.eth-31-1.media copper
hset -q platina vnet.eth-31-1.speed 100g

hset -q platina vnet.eth-32-1.media copper
hset -q platina vnet.eth-32-1.speed 100g

! ifup --allow vnet -a
donn@invader16:~$
stigt commented 7 years ago

When I restart goes with Donn's start file it doesn't crash vnet, but I do get the following errors from the start file:

root@invader1:/home/stig/bin# goes restart
hset: ERROR invalid speed
hset: ERROR invalid speed
hset: ERROR invalid speed
hset: ERROR invalid speed
hset: ERROR invalid speed
hset: ERROR invalid speed
hset: ERROR invalid speed
hset: ERROR invalid speed
hset: ERROR invalid speed
hset: ERROR invalid speed
hset: ERROR invalid speed
hset: ERROR invalid speed
hset: ERROR invalid speed
hset: ERROR invalid speed
hset: ERROR invalid speed
hset: ERROR can't set eth-32-1.media to copper
hset: ERROR can't set eth-32-1.speed to 100g
root@invader1:/home/stig/bin# ~stig/goes-ok
deamons: OK
redis : OK
vnetd : OK

stigt commented 7 years ago

Never mind, those errors look like they're related to my zero based interfaces.

donnlee commented 7 years ago

I did a power cycle (Stig's suggestion) and this yielded exactly 1 clean goes-start using the original /etc/goes/start file. Subsequent restarts failed with the same panic. I also loaded Stig's invader1 goes onto my invader (i16).

Moreover, the workaround doesn't really work. Yes, vnet stays up, but after manually enabling two (or three) eth-x-y interfaces, links fail to go link-up and vnet crashes after sho fe1 po phy. Here's one showing addDelReplace lines before the panic:

May 24 17:10:30 invader16 goes.vnetd[19378]: addDelReplace 10.0.4.16/24 42 false
May 24 17:10:30 invader16 goes.vnetd[19378]: addDelReplace 10.0.4.16/32 22 false
May 24 17:10:30 invader16 goes.vnetd[19378]: runtime error: invalid memory address or nil pointer dereference: goroutine 52 [running]:
May 24 17:10:30 invader16 goes.vnetd[19378]: runtime/debug.Stack(0xc420adda98, 0xadb2c0, 0x1000fb0)
May 24 17:10:30 invader16 goes.vnetd[19378]:         /usr/local/go/src/runtime/debug/stack.go:24 +0x79
May 24 17:10:30 invader16 goes.vnetd[19378]: github.com/platinasystems/go/elib/loop.(*Loop).doEvent.func1(0xc42019e000)
May 24 17:10:30 invader16 goes.vnetd[19378]:         /home/stig/go/src/github.com/platinasystems/go/elib/loop/event.go:128 +0x72
May 24 17:10:30 invader16 goes.vnetd[19378]: panic(0xadb2c0, 0x1000fb0)
May 24 17:10:30 invader16 goes.vnetd[19378]:         /usr/local/go/src/runtime/panic.go:489 +0x2cf
May 24 17:10:30 invader16 goes.vnetd[19378]: github.com/platinasystems/go/vnet/ip.(*Main).AddDelNextHop(0xc4200e7668, 0x10000002a, 0x100000015, 0xc4212dc898)
May 24 17:10:30 invader16 goes.vnetd[19378]:         /home/stig/go/src/github.com/platinasystems/go/vnet/ip/adjacency.go:474 +0x4c2
May 24 17:10:30 invader16 goes.vnetd[19378]: github.com/platinasystems/go/vnet/ip4.(*Main).AddDelRouteNextHop(0xc4200e7600, 0xc420ebb290, 0xc420ebb2a0, 0xe020001, 0x1, 0x20)
May 24 17:10:30 invader16 goes.vnetd[19378]:         /home/stig/go/src/github.com/platinasystems/go/vnet/ip4/fib.go:432 +0x1bb
May 24 17:10:30 invader16 goes.vnetd[19378]: github.com/platinasystems/go/vnet/unix.(*Main).ip4RouteMsg(0xc4201f6500, 0xc4209e4ea0, 0xc420498b01, 0xc420addea8, 0x42c3de)
May 24 17:10:30 invader16 goes.vnetd[19378]:         /home/stig/go/src/github.com/platinasystems/go/vnet/unix/netlink.go:550 +0x233

The console displays DMAR messages every time:

donn@invader16:~$ sudo goes restart
DMAR: DRHD: handling fault status reg 202
DMAR: [DMA Write] Request device [04:00.0] fault addr ffd4c000 [fault reason 05] PTE Write access is not set
SIOCADDRT: File exists
donn@invader16:~$ DMAR: DRHD: handling fault status reg 302
DMAR: [DMA Write] Request device [04:00.0] fault addr ffd4c000 [fault reason 05] PTE Write access is not set

donn@invader16:~$
donn@invader16:~$
donn@invader16:~$ sudo goes restart
SIOCADDRT: File exists
donn@invader16:~$ DMAR: DRHD: handling fault status reg 402
DMAR: [DMA Write] Request device [04:00.0] fault addr ffd54000 [fault reason 05] PTE Write access is not set

donn@invader16:~$
donn@invader16:~$ DMAR: DRHD: handling fault status reg 502
DMAR: [DMA Write] Request device [04:00.0] fault addr ffd43000 [fault reason 05] PTE Write access is not set
DMAR: DRHD: handling fault status reg 602
DMAR: [DMA Write] Request device [04:00.0] fault addr ffd43000 [fault reason 05] PTE Write access is not set
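
To compare these faults across restarts, the device, address, and reason can be pulled out of each line. The sketch below assumes the exact message layout shown in the console output above (other kernel versions may format DMAR faults differently); the dmarFault type and parseDMAR helper are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
)

// dmarFault holds the fields of one DMAR fault line as printed above.
type dmarFault struct {
	Op, Device, Addr, Reason string
}

// The pattern follows the message layout seen in this thread; it is an
// assumption that other kernels print the same format.
var dmarRE = regexp.MustCompile(
	`DMAR: \[(DMA \w+)\] Request device \[([0-9a-f:.]+)\] fault addr ([0-9a-f]+) \[fault reason (\d+)\]`)

// parseDMAR returns the parsed fault and true if line matches.
func parseDMAR(line string) (dmarFault, bool) {
	m := dmarRE.FindStringSubmatch(line)
	if m == nil {
		return dmarFault{}, false
	}
	return dmarFault{Op: m[1], Device: m[2], Addr: m[3], Reason: m[4]}, true
}

func main() {
	line := "DMAR: [DMA Write] Request device [04:00.0] fault addr ffd4c000 [fault reason 05] PTE Write access is not set"
	if f, ok := parseDMAR(line); ok {
		fmt.Printf("%s by %s at 0x%s (reason %s)\n", f.Op, f.Device, f.Addr, f.Reason)
	}
}
```

In the output above every fault is a DMA write from device 04:00.0 with reason 05, and only the fault address varies (ffd4c000, ffd54000, ffd43000).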

When bringing up interfaces manually, the 2nd or 3rd interface fails to go link-up and then vnet crashes after I check the link status with vnet show fe1 po phy

donn@invader16:~$ goes vnet show fe1 po phy
vnet: /run/goes/socks/vnet: timeout
donn@invader16:~$
donnlee commented 7 years ago

I just tried the same kernel 4.11 upgrade and goes upgrade on i17: i17 is ok.

Moreover, I looked at the output from when I opened this issue (March 23), and see that the issue was reported on i16. Today's problems are also on i16.

So maybe i16 has bad hardware.

@jasonlpang @fszyang : Possible to do diagnosis on i16?
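
One quick way to check whether the panic really is confined to i16 is to tally it per hostname across collected syslogs. A rough sketch, assuming the "Mon DD HH:MM:SS host proc[pid]: msg" layout seen in this thread; the function name panicsByHost is made up for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

const panicMsg = "runtime error: invalid memory address or nil pointer dereference"

// panicsByHost counts syslog lines containing the nil-dereference panic,
// keyed by the hostname (the 4th whitespace-separated field in the
// syslog layout used in this thread).
func panicsByHost(log string) map[string]int {
	counts := make(map[string]int)
	sc := bufio.NewScanner(strings.NewReader(log))
	for sc.Scan() {
		line := sc.Text()
		if !strings.Contains(line, panicMsg) {
			continue
		}
		fields := strings.Fields(line)
		if len(fields) >= 4 {
			counts[fields[3]]++
		}
	}
	return counts
}

func main() {
	// Two panic lines copied from this thread, plus one unrelated line.
	sample := `Mar 23 10:54:17 invader16 goes.vnetd[17471]: runtime error: invalid memory address or nil pointer dereference: goroutine 36 [running]:
May 24 16:19:46 invader16 goes.vnetd[4539]: runtime error: invalid memory address or nil pointer dereference: goroutine 68 [running]:
May 24 16:19:46 invader16 goes.vnetd[4539]: done`
	for host, n := range panicsByHost(sample) {
		fmt.Printf("%s: %d panic(s)\n", host, n)
	}
}
```

A count that stays at zero for other invaders running the same goes build would support the bad-hardware theory, though it would not rule out a configuration difference on i16.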

jasonlpang commented 7 years ago

Hey Donn, Is it ok if I take a look at i16? I may reboot/power cycle it.

thanks Jason

jasonlpang commented 7 years ago

So far there doesn’t seem to be any HW issues with i16. Some things I’ve checked:

I’ll try running a DRAM memory test tomorrow.

Does anyone know if the panic log gives any indication of what the issue is? Looks like SW is trying to access an invalid address? Is that a memory address or pcie address in the TH?

May 24 23:35:40 invader16 goes.vnetd[5831]: runtime error: invalid memory address or nil pointer dereference: goroutine 40 [running]:
May 24 23:35:40 invader16 goes.vnetd[5831]: runtime/debug.Stack(0xc420043a78, 0xadb0e0, 0x1201430)
May 24 23:35:40 invader16 goes.vnetd[5831]: /usr/local/go/src/runtime/debug/stack.go:24 +0x79
May 24 23:35:40 invader16 goes.vnetd[5831]: github.com/platinasystems/go/elib/loop.(Loop).doEvent.func1(0xc42017a000)
May 24 23:35:40 invader16 goes.vnetd[5831]: /home/jpang/gopath/src/github.com/platinasystems/go/elib/loop/event.go:128 +0x72
May 24 23:35:40 invader16 goes.vnetd[5831]: panic(0xadb0e0, 0x1201430)
May 24 23:35:40 invader16 goes.vnetd[5831]: /usr/local/go/src/runtime/panic.go:489 +0x2cf
May 24 23:35:40 invader16 goes.vnetd[5831]: github.com/platinasystems/go/vnet/ip.(Main).AddDelNextHop(0xc4201d0068, 0x100000083, 0x10000007d, 0xc4200dfb18)
May 24 23:35:40 invader16 goes.vnetd[5831]: /home/jpang/gopath/src/github.com/platinasystems/go/vnet/ip/adjacency.go:474 +0x4c2
May 24 23:35:40 invader16 goes.vnetd[5831]: github.com/platinasystems/go/vnet/ip4.(Main).AddDelRouteNextHop(0xc4201d0000, 0xc420c4afa0, 0xc420c4afb0, 0xe020001, 0xc400000001, 0xef)
May 24 23:35:40 invader16 goes.vnetd[5831]: /home/jpang/gopath/src/github.com/platinasystems/go/vnet/ip4/fib.go:429 +0x1bb
May 24 23:35:40 invader16 goes.vnetd[5831]: github.com/platinasystems/go/vnet/unix.(Main).ip4RouteMsg(0xc4201d8500, 0xc42062d520, 0xc420043e01, 0x1, 0x1)
May 24 23:35:40 invader16 goes.vnetd[5831]: /home/jpang/gopath/src/github.com/platinasystems/go/vnet/unix/netlink.go:550 +0x233
May 24 23:35:40 invader16 goes.vnetd[5831]: github.com/platinasystems/go/vnet/unix.(netlinkEvent).EventAction(0xc4203ddad0)
May 24 23:35:40 invader16 goes.vnetd[5831]: /home/jpang/gopath/src/github.com/platinasystems/go/vnet/unix/netlink.go:353 +0x5a4
May 24 23:35:40 invader16 goes.vnetd[5831]: github.com/platinasystems/go/elib/loop.(loopEvent).do(0xc420522810)
May 24 23:35:40 invader16 goes.vnetd[5831]: /home/jpang/gopath/src/github.com/platinasystems/go/elib/loop/event.go:118 +0x34
May 24 23:35:40 invader16 goes.vnetd[5831]: github.com/platinasystems/go/elib/loop.(Loop).doEvent(0xc42017a000, 0xc420522810)
May 24 23:35:40 invader16 goes.vnetd[5831]: /home/jpang/gopath/src/github.com/platinasystems/go/elib/loop/event.go:133 +0x51
May 24 23:35:40 invader16 goes.vnetd[5831]: github.com/platinasystems/go/elib/loop.(Loop).eventHandler(0xc42017a000, 0x131e960, 0xc42017a548)
May 24 23:35:40 invader16 goes.vnetd[5831]: /home/jpang/gopath/src/github.com/platinasystems/go/elib/loop/event.go:140 +0x86
May 24 23:35:40 invader16 goes.vnetd[5831]: created by github.com/platinasystems/go/elib/loop.(*Loop).startHandler
May 24 23:35:40 invader16 goes.vnetd[5831]: /home/jpang/gopath/src/github.com/platinasystems/go/elib/loop/event.go:153 +0x124
May 24 23:35:41 invader16 kernel: DMAR: DRHD: handling fault status reg 102
May 24 23:35:41 invader16 kernel: DMAR: [DMA Write] Request device [04:00.0] fault addr ffd4c000 [fault reason 05] PTE Write access is not set
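
On the "memory address or pcie address" question: the message "invalid memory address or nil pointer dereference" is raised by the Go runtime when code in the process dereferences a nil (or otherwise unmapped) pointer, so it concerns the process's own memory. The kernel DMAR lines at the end are a separate report from the IOMMU about a device DMA write; the log alone does not show whether the two are related. The sketch below reproduces the shape of the first log line. It is only an illustration: `adjacency`, `nextHopOf`, and `doEvent` are hypothetical stand-ins, not the code in adjacency.go or event.go.

```go
package main

import (
	"fmt"
	"runtime/debug"
	"strings"
)

// adjacency is a hypothetical stand-in for a routing adjacency record.
type adjacency struct{ nextHop uint32 }

// nextHopOf is a hypothetical lookup that forgets to check for a missing entry.
func nextHopOf(table map[int]*adjacency, index int) uint32 {
	adj := table[index] // nil when index is not in the table
	return adj.nextHop  // panics with a nil pointer dereference if adj is nil
}

// doEvent runs action and turns a panic into an error that carries the
// goroutine stack, similar in spirit to a deferred call to debug.Stack().
func doEvent(action func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("%v: %s", r, debug.Stack())
		}
	}()
	action()
	return nil
}

func main() {
	table := map[int]*adjacency{1: {nextHop: 0x0a000001}}
	err := doEvent(func() { nextHopOf(table, 2) })
	if err != nil {
		// Print only the first line of the error, which has the same
		// shape as the first syslog line of the panic.
		fmt.Println(strings.SplitN(err.Error(), "\n", 2)[0])
	}
}
```

Running this should print a single line of the form "runtime error: invalid memory address or nil pointer dereference: goroutine N [running]:". In a trace like the one above, the frame directly below `panic(...)` is where the dereference happened, here `AddDelNextHop` at adjacency.go:474.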

jasonlpang commented 7 years ago

Also tried without success:

fszyang commented 7 years ago

Try removing the /etc/network/if-up.d/static_arp file on invader 16 and see if that changes the behavior. That file is creating static arp every time an interface comes up.

dlobete commented 7 years ago

This looks like the same panic as in #63.

sandeep-dutta commented 5 years ago

The issue is reproducible with the following goes and kernel versions.

Goes version

root@invader29:/home/sandeep# goes vnetd -version
fe1: v1.1.3
fe1a: v1.1.0
vnet-platina-mk1: v1.0.0

Kernel version

root@invader29:/home/sandeep# dpkg --list |grep kernel
ii linux-image-4.13-platina-mk1 4.13-165-gbf3b5fef4591 amd64 Linux kernel, version 4.13-platina-mk1

When goes is restarted, the following logs are observed in /var/log/syslog:

Jan 11 01:11:08 debian goes.vnetd[29200]: panic: runtime error: invalid memory address or nil pointer dereference
Jan 11 01:11:08 debian goes.vnetd[29200]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x5ab277]
Jan 11 01:11:08 debian goes.vnetd[29200]:
Jan 11 01:11:08 debian goes.vnetd[29200]: goroutine 7 [running]:
Jan 11 01:11:08 debian goes.vnetd[29200]: net.(*UnixConn).ReadMsgUnix(0x0, 0xc0002b8000, 0x1000, 0x2600, 0xc0002ba600, 0x1000, 0x2600, 0x0, 0x0, 0x0, ...)
Jan 11 01:11:08 debian goes.vnetd[29200]: /usr/local/go/src/net/unixsock.go:137 +0x37
Jan 11 01:11:08 debian goes.vnetd[29200]: github.com/platinasystems/xeth.gorx()
Jan 11 01:11:08 debian goes.vnetd[29200]: /home/fyang/gopath/pkg/mod/github.com/platinasystems/xeth@v1.1.1/xeth.go:277 +0x280
Jan 11 01:11:08 debian goes.vnetd[29200]: created by github.com/platinasystems/xeth.Start
Jan 11 01:11:08 debian goes.vnetd[29200]: /home/fyang/gopath/pkg/mod/github.com/platinasystems/xeth@v1.1.1/xeth.go:80 +0x256
Jan 11 01:11:08 debian goes.vnetd[29200]: exit status 2
Jan 11 01:11:12 debian goes.goes-daemons[29189]: done
Jan 11 01:11:15 debian kernel: ixgbe 0000:03:00.0 eth1: NIC Link is Down
Jan 11 01:11:15 debian kernel: ixgbe 0000:03:00.1 eth2: NIC Link is Down
Jan 11 01:11:16 debian kernel: vfio-pci 0000:04:00.0: enabling device (0000 -> 0002)
Jan 11 01:11:16 debian kernel: vfio-pci 0000:04:00.1: enabling device (0000 -> 0002)
Jan 11 01:11:16 debian kernel: ixgbe 0000:03:00.0 eth1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
Jan 11 01:11:16 debian kernel: ixgbe 0000:03:00.1 eth2: NIC Link is Up 10 Gbps, Flow Control: RX/TX
Jan 11 01:11:18 debian vnet-platina-mk1.vnet-platina-mk1[3203]: hget timeout: ERROR vnet.xeth2-1.speed: not found in platina-mk1
Jan 11 01:11:19 debian vnet-platina-mk1.vnet-platina-mk1[3203]: port 2 installed: Id: QSFP+, Compliance: 40G CR, Vendor: Fiberstore, Part Number QSFP-40G-DAC, Revision 0x41200305, Serial F07A2700037-1, Date 171120, Connector Type: No separable connector
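
In this trace the first argument printed for `net.(*UnixConn).ReadMsgUnix` is 0x0, and the signal line reports addr=0x0. For a method on a pointer type, the first argument in the frame is the receiver, so this suggests the connection pointer was nil when `gorx` tried to read from it. Why it was nil is not established in this thread. As a minimal sketch of the kind of guard that avoids this class of crash, assuming a reader goroutine that is handed a `*net.UnixConn` (`startReader` and `errNoConn` are hypothetical names, not part of xeth):

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// errNoConn is a hypothetical sentinel error for a missing connection.
var errNoConn = errors.New("no connection: refusing to start reader")

// startReader is a hypothetical helper. It refuses to start the read loop
// when conn is nil instead of letting the goroutine crash the process.
func startReader(conn *net.UnixConn) error {
	if conn == nil {
		return errNoConn
	}
	go func() {
		buf := make([]byte, 4096)
		oob := make([]byte, 4096)
		for {
			// ReadMsgUnix returns (n, oobn, flags, addr, err).
			if _, _, _, _, err := conn.ReadMsgUnix(buf, oob); err != nil {
				return // stop reading on any error
			}
		}
	}()
	return nil
}

func main() {
	// Simulate a dial that failed earlier and left the pointer nil.
	var conn *net.UnixConn
	if err := startReader(conn); err != nil {
		fmt.Println("reader not started:", err)
	}
}
```

With a check like this, a failed or not-yet-established socket shows up as an ordinary error that the caller can log or retry, rather than as a SIGSEGV in a background goroutine.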

rondv commented 5 years ago

I don't see how the adj panic from last May through August is related to the current issue, other than that both apparently resulted in a vnet crash. The prior issue was in the adj handling.

What are the repro steps with the current build?

If the issue is not related to #33, please open a new issue.

sandeep-dutta commented 5 years ago

This issue happens when we reload the platina-mk1 module and then run the goes restart command:

rmmod platina-mk1
modprobe platina-mk1 provision=1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
ifdown -a --allow vnet
ifup -a --allow vnet
goes restart

fszyang commented 5 years ago

Is this reproducible on all the invaders in the regression testbed or only some?

Can someone please send the /etc/network/interfaces file from the invader on which this test failed?

Thx

-Frank

sandeep-dutta commented 5 years ago

It is reproducible with all the invaders on the test bed. Attaching /etc/network/interfaces file of i-29

i29 interface file.txt

rondv commented 5 years ago

old title: "invalid memory address or nil pointer dereference" after 'goes start'

Sandeep: Please restore old title and close #33, then open new bug with this title and assign to me and Tom. Include all info from last week in the new bug. We should never track two bugs in the same issue#

Tom: Ignore everything in the case history prior to last week. It looks like rxbuf fails to get a free buffer from Pool.Get() after 'goes restart'.

I'll try 'goes restart' on my system with i29 interface file and report findings here if I see anything interesting

rondv commented 5 years ago

Added the i29 interface file, then did ifdown/ifup on all interfaces to sync xeth and vnet to /etc/network.

I did see a crash, but not the one described in the bug. So far I have seen it only once; I will open a separate case once I am able to repro.

Jan 17 14:41:14 invader4 goes.vnetd[960]: panic: vnet: runtime error: invalid memory address or nil pointer dereference
Jan 17 14:41:14 invader4 goes.vnetd[960]: goroutine 160 [running]:
Jan 17 14:41:14 invader4 goes.vnetd[960]: runtime/debug.Stack(0xaee9c0, 0xc001ca8ce0, 0xc001be9b50)
Jan 17 14:41:14 invader4 goes.vnetd[960]: /usr/local/go/src/runtime/debug/stack.go:24 +0xa7
Jan 17 14:41:14 invader4 goes.vnetd[960]: github.com/platinasystems/elib/loop.(Loop).eventHandler.func1(0xc000216000, 0xc000216550)
Jan 17 14:41:14 invader4 goes.vnetd[960]: /home/rtaylor/gopath/pkg/mod/github.com/platinasystems/elib@v1.1.0/loop/event.go:189 +0x190
Jan 17 14:41:14 invader4 goes.vnetd[960]: panic(0xb040a0, 0x16141c0)
Jan 17 14:41:14 invader4 goes.vnetd[960]: /usr/local/go/src/runtime/panic.go:513 +0x1b9
Jan 17 14:41:14 invader4 goes.vnetd[960]: github.com/platinasystems/vnet/ip4.(Main).ForeachUnresolved(0xc000446000, 0xc001be9e78)
Jan 17 14:41:14 invader4 goes.vnetd[960]: /home/rtaylor/gopath/src/github.com/platinasystems/vnet/ip4/fib.go:403 +0x170
Jan 17 14:41:14 invader4 goes.vnetd[960]: main.(unresolvedArper).EventAction(0xc000216c10)
Jan 17 14:41:14 invader4 goes.vnetd[960]: /home/rtaylor/gopath/src/github.com/platinasystems/vnet-platina-mk1/unresolvedArper.go:35 +0xab
Jan 17 14:41:14 invader4 goes.vnetd[960]: github.com/platinasystems/elib/loop.(nodeEvent).do(0xc0001e4d00)
Jan 17 14:41:14 invader4 goes.vnetd[960]: /home/rtaylor/gopath/pkg/mod/github.com/platinasystems/elib@v1.1.0/loop/event.go:158 +0x99
Jan 17 14:41:14 invader4 goes.vnetd[960]: github.com/platinasystems/elib/loop.(Loop).eventHandler(0xc000216000, 0xdd60a0, 0xc000216548)
Jan 17 14:41:14 invader4 goes.vnetd[960]: /home/rtaylor/gopath/pkg/mod/github.com/platinasystems/elib@v1.1.0/loop/event.go:228 +0xa3
Jan 17 14:41:14 invader4 goes.vnetd[960]: created by github.com/platinasystems/elib/loop.(Node).maybeStartEventHandler.func1
Jan 17 14:41:14 invader4 goes.vnetd[960]: /home/rtaylor/gopath/pkg/mod/github.com/platinasystems/elib@v1.1.0/loop/event.go:389 +0x228