datadope-io / skydive

An open source real-time network topology and protocols analyzer
https://skydive.network
Apache License 2.0

Lost edges caused by panic #10

Closed: adrianlzt closed this issue 3 years ago

adrianlzt commented 3 years ago

Skydive returns only a few kinds of edges, although many other kinds are present in Elasticsearch (ES):

# docker exec -it -e SKYDIVE_ANALYZER=127.0.0.1:8081 skydive-analyzer skydive client query "G.E()" | grep RelationType | tr -d ',' | sort | uniq -c
    265                         "RelationType": "has_software"
    366                         "RelationType": "tcp_conn"


But the edges stored in ES are all marked as deleted:

      {
        "_index" : "skydive_topology_live_v13",
        "_type" : "_doc",
        "_id" : "966664cc-e808-4f9d-b9c0-244b53cbaf68",
        "_score" : 0.43688482,
        "_source" : {
          "Origin" : "analyzer.LPROCMDBUAPP004",
          "ArchivedAt" : 1615040108338,
          "Revision" : 0,
          "Parent" : "73a25e23-a098-427e-bdd3-5805abbe74b2",
          "CreatedAt" : 1614859952305,
          "Metadata" : {
            "CreatedByCMDB" : "true",
            "RelationType" : "ownership"
          },
          "Host" : "LPROCMDBUAPP004",
          "_Type" : "edge",
          "ID" : "966664cc-e808-4f9d-b9c0-244b53cbaf68",
          "DeletedAt" : 1615040108338,
          "UpdatedAt" : 1614859952305,
          "Child" : "7a02424c-328d-4730-6a37-fbf702aceb80"
        }
      }

Is it normal that those deleted edges remain in the "live" index?
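To make the epoch-millisecond fields in the document above easier to read, here is a minimal Go sketch that decodes such a `_source` object and prints its timestamps in UTC. The `edgeDoc` struct and the helpers `msToUTC` and `isDeleted` are hypothetical names for illustration, not Skydive types, and treating a non-zero `DeletedAt` as "deleted" is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// edgeDoc mirrors only the fields of the "_source" object that this sketch
// needs. It is a hypothetical struct, not a type from Skydive.
type edgeDoc struct {
	ID         string
	CreatedAt  int64
	DeletedAt  int64
	ArchivedAt int64
}

// msToUTC converts an epoch-millisecond timestamp to an RFC 3339 string in UTC.
func msToUTC(ms int64) string {
	t := time.Unix(0, ms*int64(time.Millisecond)).UTC()
	return t.Format("2006-01-02T15:04:05.000Z07:00")
}

// isDeleted assumes that a non-zero DeletedAt marks the edge as deleted.
func isDeleted(d edgeDoc) bool {
	return d.DeletedAt != 0
}

func main() {
	// Values copied from the ES document shown in this issue.
	raw := `{
		"ID": "966664cc-e808-4f9d-b9c0-244b53cbaf68",
		"CreatedAt": 1614859952305,
		"DeletedAt": 1615040108338,
		"ArchivedAt": 1615040108338
	}`

	var d edgeDoc
	if err := json.Unmarshal([]byte(raw), &d); err != nil {
		fmt.Println("decode error:", err)
		return
	}

	fmt.Println("edge:     ", d.ID)
	fmt.Println("created:  ", msToUTC(d.CreatedAt))
	fmt.Println("deleted:  ", msToUTC(d.DeletedAt))
	fmt.Println("isDeleted:", isDeleted(d))
}
```

For this document, `CreatedAt` decodes to 2021-03-04T12:12:32.305Z and `DeletedAt` to 2021-03-06T14:15:08.338Z.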

At 19:05 the node "73f5c8ee-5390-4a2b-80d3-92b463463caf" was correctly stored with its edges set.

Last node update: Thu Mar 4 13:16:10 CET 2021

Edge between nodes 73f5c8ee-5390-4a2b-80d3-92b463463caf and 57c39221-b16c-4e47-abe0-faaed1916e1d:

CreatedAt: Wed Mar 10 18:46:04 CET 2021
DeletedAt: Thu Mar 11 00:16:28 CET 2021

The deletion time matches the last restart, which was forced by a panic:

2021-03-10T17:15:58.645Z        INFO       analyzer/analyzer.go:55 glob..func1     LPROCMDBUAPP004: Skydive Analyzer started !
panic: interface conversion: interface {} is map[string]interface {}, not *proccon.NetworkInfo

goroutine 329 [running]:
github.com/skydive-project/skydive/topology/probes/proccon.(*Probe).removeOldNetworkInformation.func1(0x4d07620, 0xc000a07320, 0x0)
        /go/src/github.com/skydive-project/skydive/topology/probes/proccon/proccon.go:500 +0x1d5
github.com/skydive-project/skydive/graffiti/graph.(*Graph).UpdateMetadata(0xc0004cab40, 0x51c3c00, 0xc002254e10, 0x544be82, 0x7, 0xc003c1ddc8, 0x0, 0x0)
        /go/src/github.com/skydive-project/skydive/graffiti/graph/graph.go:877 +0xf9
github.com/skydive-project/skydive/topology/probes/proccon.(*Probe).removeOldNetworkInformation(0xc0004d2da0, 0xc002254e10, 0xc00a1e4ba5f84cc9, 0xffffc511d4abfe11, 0x8455dc0, 0x0, 0x0)
        /go/src/github.com/skydive-project/skydive/topology/probes/proccon/proccon.go:510 +0x107
github.com/skydive-project/skydive/topology/probes/proccon.(*Probe).cleanSoftwareNodes(0xc0004d2da0, 0xc00a1e4ba5f84cc9, 0xffffc511d4abfe11, 0x8455dc0)
        /go/src/github.com/skydive-project/skydive/topology/probes/proccon/proccon.go:531 +0x2b9
github.com/skydive-project/skydive/topology/probes/proccon.(*Probe).garbageCollector(0xc0004d2da0, 0x13a52453c000, 0x4e94914f0000)
        /go/src/github.com/skydive-project/skydive/topology/probes/proccon/proccon.go:552 +0x131
created by github.com/skydive-project/skydive/topology/probes/proccon.(*Probe).Start
        /go/src/github.com/skydive-project/skydive/topology/probes/proccon/proccon.go:616 +0x408
2021-03-10T23:16:25.459Z        INFO       analyzer/analyzer.go:42 glob..func1     LPROCMDBUAPP004: Skydive Analyzer 0.27.0_-4 starting...

It looks like restarts triggered by a panic cause all edges to be deleted (there was another panic around 18:00).

The panic probably caused all edges to be deleted (#11), and the edges still returned were the ones created by proccon.
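The panic message in the trace says that a value held as `interface {}` was a `map[string]interface {}` rather than a `*proccon.NetworkInfo`. In Go, a single-value type assertion panics on such a mismatch, while the comma-ok form reports it instead. The sketch below illustrates that difference only; it is not the actual change made in Skydive. `NetworkInfo` here is a local stand-in with a hypothetical field, and `asNetworkInfo` is a hypothetical helper.

```go
package main

import "fmt"

// NetworkInfo is a local stand-in for *proccon.NetworkInfo.
// Its single field is hypothetical and exists only for this example.
type NetworkInfo struct {
	Revision int64
}

// asNetworkInfo uses the comma-ok form of the type assertion, so a value of
// another dynamic type (for example map[string]interface{}) yields ok == false
// instead of a runtime panic.
func asNetworkInfo(v interface{}) (*NetworkInfo, bool) {
	ni, ok := v.(*NetworkInfo)
	return ni, ok
}

func main() {
	values := []interface{}{
		&NetworkInfo{Revision: 1},
		// Generic map form: the dynamic type named in the panic message.
		map[string]interface{}{"Revision": 1},
	}

	for _, v := range values {
		if ni, ok := asNetworkInfo(v); ok {
			fmt.Printf("typed value, revision %d\n", ni.Revision)
		} else {
			fmt.Printf("skipping unexpected type %T\n", v)
		}
	}
}
```

With the plain form `v.(*NetworkInfo)`, the second value would abort the program with an "interface conversion" panic like the one in the trace.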

adrianlzt commented 3 years ago

Panic fixed in 2d4a0e33d. The deleted-edges problem has been moved to #11.