datadope-io / skydive

An open source real-time network topology and protocols analyzer
https://skydive.network
Apache License 2.0
1 stars 0 forks source link

proccon: panic trying to delete software nodes with TCPConn/TCPListen defined but empty #11

Closed adrianlzt closed 3 years ago

adrianlzt commented 3 years ago

If Metadata.TCPConn/TCPListen is empty ({}), the garbage collector panics.

Steps to reproduce:

curl 127.0.0.1:8082/api/node -H "Content-Type: application/json" -d "{\"ID\": \"nodeA\", \"CreatedAt\": $(date +%s%3N), \"UpdatedAt\": $(date +%s%3N), \"Metadata\": {\"TCPConn\": {}, \"Name\": \"foo1\", \"Type\": \"Software\"}}"

Wait till next GC:

2021-03-11T07:47:09.804Z        DEBUG   proccon/proccon.go:551 (*Probe).garbageCollector        arco: Executing garbageCollector
2021-03-11T07:47:09.804Z        DEBUG   proccon/proccon.go:528 (*Probe).cleanSoftwareNodes      arco: GarbageCollector, delete network info older than 2021-03-11 04:47:09.804902154 +0000 UTC m=-10769.347012373. Processing 1 nodes
panic: interface conversion: interface {} is map[string]interface {}, not *proccon.NetworkInfo

goroutine 389 [running]:
github.com/skydive-project/skydive/topology/probes/proccon.(*Probe).removeOldNetworkInformation.func1(0x4b652a0, 0xc000d3e860, 0x0)
        /go/src/github.com/skydive-project/skydive/topology/probes/proccon/proccon.go:500 +0x1d5
github.com/skydive-project/skydive/graffiti/graph.(*Graph).UpdateMetadata(0xc000456ea0, 0x51c3c00, 0xc0009441b0, 0x544be82, 0x7, 0xc000f78dc8, 0x8243ca3, 0x4)
        /go/src/github.com/skydive-project/skydive/graffiti/graph/graph.go:877 +0xf9
github.com/skydive-project/skydive/topology/probes/proccon.(*Probe).removeOldNetworkInformation(0xc00049e480, 0xc0009441b0, 0xc00a86136ff9d50a, 0xfffff63490e598eb, 0x8455dc0, 0x0, 0x0)
        /go/src/github.com/skydive-project/skydive/topology/probes/proccon/proccon.go:510 +0x107
github.com/skydive-project/skydive/topology/probes/proccon.(*Probe).cleanSoftwareNodes(0xc00049e480, 0xc00a86136ff9d50a, 0xfffff63490e598eb, 0x8455dc0)
        /go/src/github.com/skydive-project/skydive/topology/probes/proccon/proccon.go:531 +0x2b9
github.com/skydive-project/skydive/topology/probes/proccon.(*Probe).garbageCollector(0xc00049e480, 0x6fc23ac00, 0x9d29229e000)
        /go/src/github.com/skydive-project/skydive/topology/probes/proccon/proccon.go:552 +0x131
created by github.com/skydive-project/skydive/topology/probes/proccon.(*Probe).Start
        /go/src/github.com/skydive-project/skydive/topology/probes/proccon/proccon.go:616 +0x408

proccon should not panic, but return a warning message.

This could be reached if some node sends some connection data, the GC delete all network info in the first pass and panics in the second pass.

adrianlzt commented 3 years ago

If skydive is restarted, and its using ES as backend, that {} value in Metadata.TCPConn/TCPListen is parsed correctly and does not produce a panic.

adrianlzt commented 3 years ago

Looks like it was only happening if TCPListen/TCPConn was defined at node creation time with the API.

Has been fixed (no more panics, just warning message) in 2d4a0e33d1f0c1b2acd4f2b11f2789124ce6122f (proccon: check network info has the expected format)