aregm / nff-go

NFF-Go - Network Function Framework for Go (formerly YANFF)
BSD 3-Clause "New" or "Revised" License

various panics or hangs on flow.SystemStop() #667

Closed darinkes closed 4 years ago

darinkes commented 4 years ago

I see various panics when calling flow.SystemStop():

DEBUG: System is using 4 cores now. 8 cores are left available.
DEBUG: Current speed of 0 instance of segment1 is 2 PKT/S, cloneNumber: 1 queue number: 16
^CStopping...
Received an interrupt, stopping everything
DEBUG: Stop instance for receiverPort1
DEBUG: Stop clone
DEBUG: Stop instance for segment1
DEBUG: Stop clone
DEBUG: Stop instance for receiverPort1
DEBUG: Stop instance for senderPortThread1
DEBUG: Stop clone
panic: close of closed channel

goroutine 19 [running, locked to thread]:
github.com/intel-go/nff-go/flow.(*flowFunction).stopInstance(0xc000082600, 0x0, 0xffffffffffffffff, 0xc0001d0000)
        /home/ubuntu/dpdk-playground/nff-go/flow/scheduler.go:333 +0x332
github.com/intel-go/nff-go/flow.(*scheduler).systemStop(0xc0001d0000)
        /home/ubuntu/dpdk-playground/nff-go/flow/scheduler.go:354 +0x7c
github.com/intel-go/nff-go/flow.SystemStop(0xc000061f60, 0x1)
        /home/ubuntu/dpdk-playground/nff-go/flow/flow.go:833 +0x31
github.com/intel-go/nff-go/flow.SystemStartScheduler(0x0, 0x0)
        /home/ubuntu/dpdk-playground/nff-go/flow/flow.go:807 +0x14a
github.com/intel-go/nff-go/flow.SystemStart(0x0, 0x0)
        /home/ubuntu/dpdk-playground/nff-go/flow/flow.go:822 +0x35
main.main.func1()
        /home/ubuntu/dpdk-playground/sniffer/sniffer.go:27 +0x22
created by main.main
        /home/ubuntu/dpdk-playground/sniffer/sniffer.go:26 +0xf2

Sometimes the program instead hangs at "DEBUG: Stop clone" until I kill it.

The code looks like this:

package main

import (
        "fmt"
        "os"
        "os/signal"

        "github.com/intel-go/nff-go/flow"
        "github.com/intel-go/nff-go/packet"

        "github.com/google/gopacket"
        "github.com/google/gopacket/layers"
)

func main() {
        port := uint16(0)

        flow.CheckFatal(flow.SystemInit(nil))

        mainFlow, err := flow.SetReceiver(port)
        flow.CheckFatal(err)

        flow.CheckFatal(flow.SetHandler(mainFlow, handler, nil))
        flow.CheckFatal(flow.SetSender(mainFlow, 0))

        go func() {
                flow.CheckFatal(flow.SystemStart())
        }()

        c := make(chan os.Signal, 1)
        signal.Notify(c, os.Interrupt)
        <-c

        fmt.Println("Stopping...")

        flow.CheckFatal(flow.SystemStop())
}

func handler(packet *packet.Packet, context flow.UserContext) {
        gopacket := gopacket.NewPacket(packet.GetRawPacketBytes(), layers.LayerTypeEthernet, gopacket.Default)
        fmt.Printf("Packet: %v", gopacket)
}

My Devices:

Network devices using DPDK-compatible driver
============================================
0000:02:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection 10fb' drv=igb_uio unused=ixgbe
0000:02:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection 10fb' drv=igb_uio unused=ixgbe

Any ideas?

darinkes commented 4 years ago

Ok, found it:

You have to disable the built-in signal handler:

        config := flow.Config{
                NoSetSIGINTHandler: true,
        }
        flow.CheckFatal(flow.SystemInit(&config))

gshimansky commented 4 years ago

Nice find. I forgot about this setting and couldn't guess the cause of the crash from the stack trace.

darinkes commented 4 years ago

The crash happens the second time SystemStop() gets called: once from the built-in SIGINT handler and once from my own code. Maybe there should be a check in SystemStop() to just return if the system is already stopped. But I haven't thought much about it yet.
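
To illustrate the idea of making a second stop call harmless, here is a minimal, self-contained sketch. It does not use NFF-Go and is not how the library is implemented: the `stopper` type, `newStopper`, and `unguardedDoubleClose` are hypothetical names invented for this example. It only shows that closing a Go channel twice panics with "close of closed channel", and that guarding the close with `sync.Once` turns repeated stop calls into no-ops.

```go
package main

import (
	"fmt"
	"sync"
)

// stopper is a hypothetical stand-in for a component that is stopped by
// closing a channel. It is NOT part of NFF-Go.
type stopper struct {
	once sync.Once
	done chan struct{}
}

func newStopper() *stopper {
	return &stopper{done: make(chan struct{})}
}

// Stop closes the done channel at most once. It returns true only for the
// call that actually performed the close; later calls do nothing.
func (s *stopper) Stop() bool {
	first := false
	s.once.Do(func() {
		close(s.done)
		first = true
	})
	return first
}

// unguardedDoubleClose closes a channel twice without any guard and reports
// whether the second close panicked.
func unguardedDoubleClose() (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	ch := make(chan struct{})
	close(ch)
	close(ch) // panics: close of closed channel
	return false
}

func main() {
	fmt.Println("unguarded double close panicked:", unguardedDoubleClose())

	s := newStopper()
	fmt.Println("first Stop closed the channel:", s.Stop())
	fmt.Println("second Stop closed the channel:", s.Stop())
}
```

Whether a guard like this belongs inside SystemStop() itself is a design question for the maintainers; the sketch only shows that the pattern is cheap and safe to call from both a signal handler and user code.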

aregm commented 4 years ago

@darinkes Stefan, can you please send some details about the project that you are working on with NFF-Go?

darinkes commented 4 years ago

@aregm Thanks for asking. Currently I'm just starting to learn more about DPDK in general and network functions in cloud environments. Since I'm not a hardcore C hacker, this library gives me a nice way to work with DPDK.