vasya4k / gopcep

Implementation of PCEP and a TE Controller written in Go
MIT License

panic in http server #4

Open paddy01 opened 6 months ago

paddy01 commented 6 months ago

When trying to remove a controlled LSP, I get the following panic:

http2: panic serving 192.168.1.2:53698: runtime error: invalid memory address or nil pointer dereference
goroutine 341 [running]:
net/http.(*http2serverConn).runHandler.func1()
        /usr/local/go/src/net/http/h2_bundle.go:6146 +0x145
panic({0xcb0a20?, 0x2709170?})
        /usr/local/go/src/runtime/panic.go:770 +0x132
gopcep/controller.(*Controller).DelSRLSP(0xc000170000, {0xc0002a5fa8, 0x5})
        /opt/gopcep/controller/controller.go:82 +0x154
gopcep/restapi.(*handler).delLSP(0xc000484e40, 0xc00012ed00)
        /opt/gopcep/restapi/lsp.go:40 +0xa5
github.com/gin-gonic/gin.(*Context).Next(0xc00012ed00)
        /root/go/pkg/mod/github.com/gin-gonic/gin@v1.7.2/context.go:165 +0x2b
gopcep/restapi.Start.jsonLogMiddleware.func4(0xc00012ed00)
        /opt/gopcep/restapi/restapi.go:62 +0x46
github.com/gin-gonic/gin.(*Context).Next(...)
        /root/go/pkg/mod/github.com/gin-gonic/gin@v1.7.2/context.go:165
github.com/gin-gonic/gin.(*Engine).handleHTTPRequest(0xc000311860, 0xc00012ed00)
        /root/go/pkg/mod/github.com/gin-gonic/gin@v1.7.2/gin.go:489 +0x650
github.com/gin-gonic/gin.(*Engine).ServeHTTP(0xc000311860, {0x213cd30, 0xc0005a4350}, 0xc0004f6900)
        /root/go/pkg/mod/github.com/gin-gonic/gin@v1.7.2/gin.go:445 +0x198
net/http.serverHandler.ServeHTTP({0x0?}, {0x213cd30?, 0xc0005a4350?}, 0x0?)
        /usr/local/go/src/net/http/server.go:3137 +0x8e
net/http.initALPNRequest.ServeHTTP({{0x213dea8?, 0xc0001d9bf0?}, 0xc000005888?, {0xc0004d83c0?}}, {0x213cd30, 0xc0005a4350}, 0xc0004f6900)
        /usr/local/go/src/net/http/server.go:3745 +0x231
net/http.(*http2serverConn).runHandler(0x0?, 0x0?, 0x0?, 0x0?)
        /usr/local/go/src/net/http/h2_bundle.go:6153 +0xbb
created by net/http.(*http2serverConn).scheduleHandler in goroutine 48
        /usr/local/go/src/net/http/h2_bundle.go:6088 +0x21d
vasya4k commented 5 months ago

There is some kind of race condition here: the PCEP session no longer has that LSP, but the controller still does. I can add a check and a log message, but I really need to understand how this happened.

vasya4k commented 5 months ago

Added the check. You should see the log.

paddy01 commented 5 months ago

> Added the check. You should see the log.

Well, it's not crashing anymore. The only thing that pops up in the logs is:

time="2024-03-08T13:39:39.670833172+01:00" level=error client_ip="172.18.xx.xx:59059" duration=0.106676 method=DELETE path=/v1/lsp/lsp-num1_1 referrer="https://172.16.xx.xx1:1443/" request_id= status=500
vasya4k commented 5 months ago

This is how it should be, as I suspect the LSPs do not get created on the devices themselves, so when you try to delete there is nothing to delete. There are two DBs: the first is the controller DB and the second is the network DB. You create LSPs in the controller DB and they get pushed into the network. Once pushed, the devices report all of them back and they get saved into the network DB. We need to make sure they actually get created first.
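
The two-DB flow described above can be modelled roughly as follows. This is a sketch under assumptions, not gopcep's code: `Controller`, `Create`, `OnReport`, and `Unconfirmed` are hypothetical names, and the push to the device is left out. It only shows how an LSP can exist in the controller DB while never appearing in the network DB.

```go
package main

import (
	"fmt"
	"sort"
)

// LSP is a hypothetical, minimal record; the real one carries far more state.
type LSP struct {
	Name string
	Dst  string
}

// Controller models the two stores mentioned in the thread.
type Controller struct {
	controllerDB map[string]LSP // LSPs requested through the API (intent)
	networkDB    map[string]LSP // LSPs the devices reported back (actual state)
}

func NewController() *Controller {
	return &Controller{
		controllerDB: make(map[string]LSP),
		networkDB:    make(map[string]LSP),
	}
}

// Create records the intent. Pushing it to the device is omitted here.
func (c *Controller) Create(l LSP) { c.controllerDB[l.Name] = l }

// OnReport is called when a device reports an LSP it actually holds.
func (c *Controller) OnReport(l LSP) { c.networkDB[l.Name] = l }

// Unconfirmed lists LSPs present in the controller DB that no device
// has reported back, sorted by name.
func (c *Controller) Unconfirmed() []string {
	var names []string
	for name := range c.controllerDB {
		if _, ok := c.networkDB[name]; !ok {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	c := NewController()
	c.Create(LSP{Name: "lsp-a", Dst: "10.0.0.1"})
	c.Create(LSP{Name: "lsp-b", Dst: "10.0.0.2"})
	c.OnReport(LSP{Name: "lsp-a", Dst: "10.0.0.1"}) // only lsp-a was created on a device
	fmt.Println(c.Unconfirmed())
}
```

An LSP returned by `Unconfirmed` is the case discussed in this issue: it exists only as intent, so a delete that goes through the PCEP session finds nothing to remove.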

paddy01 commented 5 months ago

> This is how it should be as I suspect the LSPs do not get created on the devices themselves so when you try to delete there is nothing to delete.

It's hard to delete a non-functioning LSP if it has to be up before it can be removed. Even if the LSP is not up, for whatever reason, I must be able to remove it from the PCE, or at least put it in a delete state so it gets removed when the PCC connects (if it was disconnected for some reason), and that must be indicated in the web UI :)

> There are two DBs the first one is the controller DB and the second one is the network DB. You create LSPs in the controller DB and they get pushed into the network. Once pushed the devices report all of them back and they get saved into the network DB. We need to make sure they actually get created first.

Yeah, I know, but as stated above, if the information is wrong and the LSP doesn't get established, there must be a way to delete it and/or put it in a deleted state so it can later be removed from the local DB.
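
One possible shape for the requested behaviour is sketched below. It is purely illustrative and not part of gopcep: `Store`, `StatePendingDelete`, `Delete`, and `OnConnect` are hypothetical names. The idea is that a delete always succeeds locally: if the PCC session is up the LSP is removed right away (the PCEP signalling is omitted), otherwise it is marked pending and purged when the PCC reconnects.

```go
package main

import (
	"fmt"
	"sort"
)

// State is a hypothetical lifecycle marker a web UI could display.
type State int

const (
	StateActive State = iota
	StatePendingDelete
)

type LSP struct {
	Name  string
	PCC   string // address of the PCC that owns the LSP
	State State
}

type Store struct {
	lsps      map[string]*LSP
	connected map[string]bool // PCC address -> session currently up
}

// Delete removes the LSP if its PCC is connected; otherwise it marks the
// LSP as pending delete. It reports whether the LSP was removed right away.
func (s *Store) Delete(name string) (removed bool, err error) {
	l, ok := s.lsps[name]
	if !ok {
		return false, fmt.Errorf("LSP %q not found", name)
	}
	if s.connected[l.PCC] {
		delete(s.lsps, name) // real code would also signal the PCC here
		return true, nil
	}
	l.State = StatePendingDelete
	return false, nil
}

// OnConnect marks the PCC as connected and purges its pending deletes.
// It returns the names of the purged LSPs, sorted.
func (s *Store) OnConnect(pcc string) []string {
	s.connected[pcc] = true
	var purged []string
	for name, l := range s.lsps {
		if l.PCC == pcc && l.State == StatePendingDelete {
			delete(s.lsps, name) // deleting during range is safe in Go
			purged = append(purged, name)
		}
	}
	sort.Strings(purged)
	return purged
}

func main() {
	s := &Store{
		lsps:      map[string]*LSP{"lsp-a": &LSP{Name: "lsp-a", PCC: "10.0.0.1"}},
		connected: map[string]bool{},
	}
	fmt.Println(s.Delete("lsp-a"))       // PCC is down: marked pending delete
	fmt.Println(s.OnConnect("10.0.0.1")) // PCC returns: pending delete is purged
}
```

Keeping the pending state on the record itself makes it straightforward to show in a UI, at the cost of the store having to be scanned on every reconnect.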