sttp / goapi

Golang API for STTP
https://sttp.github.io/goapi/
MIT License
3 stars 3 forks source link

Race conditions in connection state handling #4

Closed majiru closed 2 years ago

majiru commented 2 years ago

We are seeing a race condition crop up with the connection state handling when we see attempts to reestablish connections for a active subscription. This can happen particularly often if we create a lot of subscriptions at once, causing the upstream OpenPDC to reject at least a couple of them. It seems particularly common in cases in which the connection is reset before the code finishes the 'connection setup'. I have isolated a test case for reproducing this bug:

package repro

import (
    "log"
    "sync"
    "testing"

    "github.com/sttp/goapi/sttp"
)

func run(dst string, query string, count int) {
    sub := sttp.NewSubscriber()
    conf := sttp.NewConfig()

    sub.SetStatusMessageLogger(func(msg string) {
        log.Println("STATUS:", msg)
    })
    sub.SetErrorMessageLogger(func(msg string) {
        log.Println("ERROR:", msg)
    })
    conf.AutoRequestMetadata = false
    sub.Subscribe(query, nil)
    if err := sub.Dial(dst, conf); err != nil {
        log.Fatal(err)
    }
    reader := sub.ReadMeasurements()
    for sub.IsConnected() {
        _ = reader.NextMeasurement()
        count--
        if count <= 0 {
            break
        }
    }
    sub.Close()
}

func TestRepro(t *testing.T) {
    const (
        query  = "FILTER TOP 20 ActiveMeasurements WHERE SignalType = 'ALOG'"
        target = "PDC_ADDR"
    )
    wg := sync.WaitGroup{}
    wg.Add(100)
    for i := 0; i < 100; i++ {
        go func() {
            run(target, query, 1000)
            wg.Done()
        }()
    }
    wg.Wait()
}

This is formatted as a unit test to make it easier to run against the go race detector by doing go test -race

majiru commented 2 years ago

Any updates on this?

ritchiecarroll commented 2 years ago

Not yet - this is on my schedule for next week

ritchiecarroll commented 2 years ago

OK - I've checked in a fix for this. FYI, this is the first time I've used "go test -race", neat little tool.

Although I recognized most of these race conditions it found in the original code, I had deemed them safe or non-consequential for typical use cases.

For now, in this Go code update, I removed all feasible race conditions, including ones the tool did not find. This could have some impact on performance, but hopefully not too much. If it is noticed we can move back to the safe race scenarios.