osrg / gobgp

BGP implemented in the Go Programming Language
https://osrg.github.io/gobgp/
Apache License 2.0
3.59k stars 684 forks source link

SEGV when typing "gobgp global rib" #2233

Open bakulkhanna opened 4 years ago

bakulkhanna commented 4 years ago

Hello,

I'm running gobgp ver 2.10.1 on several servers. On a few of these servers I get the following SEGV - when I type in - gobgp global rib. On other servers this SEGV does not happen. Any suggestions on how to fix?

panic: runtime error: invalid memory address or nil pointer dereference [signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0xa608fe]

goroutine 1 [running]: main.showNeighborRib(0xc38c8c, 0x6, 0x0, 0x0, 0x1271ee0, 0x0, 0x0, 0xc000038098, 0xd1eea0) /home/bkhanna/go/src/github.com/osrg/gobgp/cmd/gobgp/neighbor.go:933 +0x87e main.showGlobalRib(0x1271ee0, 0x0, 0x0, 0x0, 0x0) /home/bkhanna/go/src/github.com/osrg/gobgp/cmd/gobgp/global.go:1341 +0x5c main.newGlobalCmd.func2(0xc0001d6c80, 0x1271ee0, 0x0, 0x0) /home/bkhanna/go/src/github.com/osrg/gobgp/cmd/gobgp/global.go:1623 +0x3f github.com/spf13/cobra.(Command).execute(0xc0001d6c80, 0x1271ee0, 0x0, 0x0, 0xc0001d6c80, 0x1271ee0) /home/bkhanna/go/src/github.com/spf13/cobra/command.go:833 +0x2cc github.com/spf13/cobra.(Command).ExecuteC(0xc0001d6780, 0xc0001ee000, 0xc0001ee280, 0xc0001eea00) /home/bkhanna/go/src/github.com/spf13/cobra/command.go:917 +0x2f8 github.com/spf13/cobra.(*Command).Execute(0xc0001d6780, 0xc000191f88, 0xc000191f88) /home/bkhanna/go/src/github.com/spf13/cobra/command.go:867 +0x2b main.main() /home/bkhanna/go/src/github.com/osrg/gobgp/cmd/gobgp/main.go:32 +0x63

Thanks.

fujita commented 4 years ago

Hmm, parsing the prefix might fail at the following place: https://github.com/osrg/gobgp/blob/master/cmd/gobgp/neighbor.go#L932

The servers that you hit the SEGV at receives the exact same routes that the servers without this bug receives?

bakulkhanna commented 4 years ago

The servers that I see the SEGV on receive the exact same set of routes. The servers that I don't see the SEGV on have a different set of routes.

fujita commented 4 years ago

Can you try the following fix? https://github.com/fujita/gobgp/commit/ed52f3a52e2f9f9628f77637ea8a0ae34ff8c0a9

bakulkhanna commented 4 years ago

I swapped the statements from what you had, i.e. I did the following - I'm guessing this is what you had intended. if _, p, err := net.ParseCIDR(prefix); err != nil { fmt.Println("can't perse prefix string ", err) } else { l = append(l, &d{prefix: p.IP, dst: dst}) } This code change worked, i.e. I don't get a SEGV when doing a "gobgp global rib" in the same machine that used to SEGV with the original code. Thanks.

fujita commented 4 years ago

oops. Can you print the prefix that the cli failed to parse? We need to fix the root problem.

bakulkhanna commented 4 years ago

This is what it prints - once - when I do a "gobgp global rib" can't perse prefix string invalid CIDR address: /32

If you want to add more debug info to the above message, I'll be happy to test it out.

fujita commented 4 years ago

Hmm, empty address string.. Any suspicious route from other peers or you inject locally?

bakulkhanna commented 4 years ago

The routes are injected from a goBGP-client via gRPC. I can try to compare the routes at the client with what goBGP reports. It'll take a couple of days.

fujita commented 4 years ago

Great, it should be very useful. Thanks a lot!

bakulkhanna commented 4 years ago

Incorrectly formatted routes of the following kind were being sent over gRPC to gobgp, i.e two prefixes were squished into a single prefix field. ..172.242.201.14710.75.14.25..

bakulkhanna commented 4 years ago

I can see that such incorrectly formatted routes, when received, are not added to the global rib.

Is there a log message printed when such incorrect prefixes are seen? I was running gobgp with log-level set to 'info' and I didn't see any log messages - but I thought I'd ask if any messages are printed at the debug level for such prefixes?

imcom commented 4 years ago

if the faulty prefix was not inserted into global rib, how did you meet the invalid ones and crashed gobgp ?

imcom commented 4 years ago

@bakulkhanna how did you encode the incorrect prefix ? via NlriBinary or Nlri ? Update: Must be NlriBinary right ? Using the binary format can bypass net.Parse ... and because the Length is fixed for IPv4 / v6 anyway, then the parse can work in

func (r *IPAddrPrefixDefault) decodePrefix(data []byte, bitlen uint8, addrlen uint8) error {
    b := make([]byte, addrlen)
    copy(b, data[:bytelen])
}

But then I actually do not see how it could end up invalid later on. Could you please show your client code on how you encode the prefix ? If you are still interested

imcom commented 4 years ago

Also I guess the format check should be applied to

func NewIPAddrPrefix(length uint8, prefix string) *IPAddrPrefix {
    p := &IPAddrPrefix{
        IPAddrPrefixDefault{
            Length: length,
        },
        4,
    }
    p.IPAddrPrefixDefault.decodePrefix(net.ParseIP(prefix).To4(), length, 4)
    return p
}

at least, or even somewhere earlier, otherwise json format output will run into nil deference there also