z5labs / gogm

Golang Object Graph Mapper for Neo4j
MIT License
140 stars 31 forks source link

`gogm: data not found` with intense loading #68

Open lomoval opened 3 years ago

lomoval commented 3 years ago

Hello. I got a strange situation with GOGM and intense loading of objects. Sometimes GOGM returns "gogm: data not found" instead of loaded objects or timeout-error.

Here's some code example. It creates go-routines that loads objects and outputs the result (error or not).

func gogmTest() {
    config := &gogm.Config{
        Host:          "127.0.0.1",
        Port:          7687,
        Username:      "neo4j",
        Password:      "neo4j",
        PoolSize:      100,
        IndexStrategy: gogm.IGNORE_INDEX,
        DefaultTransactionTimeout: time.Second * 10,
    }

    _gogm, _ := gogm.New(config, IdPolicy(), globalTypes...)
    var wg sync.WaitGroup
    for i := 1; i < 70; i++ {
        wg.Add(1)
        go func(i int) {
                        sensors := []Sensor{}
                        sess, _ := _gogm.NewSessionV2(gogm.SessionConfig{AccessMode: gogm.AccessModeRead})
            defer wg.Done()
            defer sess.Close()
            whereFilter := gocypherdsl.C(&gocypherdsl.ConditionConfig{Name: "n", Field: "name", ConditionOperator: gocypherdsl.EqualToOperator, Check: "CupWndSpdMeasVirtual"})
            err := sess.LoadAllDepthFilter(context.Background(), &sensors, 0, whereFilter, nil)

            if err != nil && errors.Is(err, gogm.ErrNotFound) {
                fmt.Printf("%d - Not found - %s\n", i, err.Error())
            } else  if err != nil {
                fmt.Printf("%d - Other error - %s\n", i, err.Error())
            } else {
                fmt.Printf("%d - OK - %s\n", i, sensors[0].Name)
            }
        }(i)
    }
    wg.Wait()
    fmt.Printf("DONE\n")
}

And the output. Timeout errors are OK, but why does GOGM return gogm: data not found?

42 - OK - CupWndSpdMeasVirtual
23 - OK - CupWndSpdMeasVirtual
15 - Not found - failed auto read tx, no primary nodes to return, gogm: data not found
33 - Other error - failed auto read tx, Neo4jError: Neo.ClientError.Transaction.TransactionTimedOut (...)
3 - Not found - failed auto read tx, no primary nodes to return, gogm: data not found
13 - Not found - failed auto read tx, no primary nodes to return, gogm: data not found
12 - Not found - failed auto read tx, no primary nodes to return, gogm: data not found
21 - Other error - failed auto read tx, Neo4jError: Neo.ClientError.Transaction.TransactionTimedOut (...)
...
53 - Not found - failed auto read tx, no primary nodes to return, gogm: data not found
56 - Not found - failed auto read tx, no primary nodes to return, gogm: data not found
57 - Not found - failed auto read tx, no primary nodes to return, gogm: data not found
69 - Not found - failed auto read tx, no primary nodes to return, gogm: data not found
52 - Not found - failed auto read tx, no primary nodes to return, gogm: data not found
19 - Not found - failed auto read tx, no primary nodes to return, gogm: data not found
...
61 - OK - CupWndSpdMeasVirtual
63 - OK - CupWndSpdMeasVirtual
67 - OK - CupWndSpdMeasVirtual
DONE
lomoval commented 5 months ago

Some additions.

I think the problem is not the intense load itself, but the transaction timeout and processing of the results.

  1. If timeout occurs before any results come to gogm you will get failed auto read tx, Neo4jError: Neo.ClientError.Transaction.TransactionTimedOut (...)
  2. If a timeout occurs during result processing (gogm starts processing query results using result.Next), you will get automatic tx reading failed, no primary nodes to return, gogm: no data found and GOGM wraps non found error with transaction error.

Example (try to get nodes with enough timeout and too small):

func EmbedType() {
        // In NEO 
    // match (a:KnowledgePage) where a.name contains 'e' return count(a)
    //
    // count(a)
    // 27

    conf := gogm.Config{
        Host:                      "127.0.0.1",
        Port:                      7687,
        Username:                  "neo4j",
        Password:                  "neo4j",
        PoolSize:                  50,
        IndexStrategy:             gogm.IGNORE_INDEX,
        LoadStrategy:              gogm.SCHEMA_LOAD_STRATEGY,
        DefaultTransactionTimeout: time.Millisecond,
    }

    confLongTimeout := conf
    confLongTimeout.DefaultTransactionTimeout = 5000 * time.Millisecond

    checkTimeout(conf)
    checkTimeout(confLongTimeout)
    return
}

func checkTimeout(conf gogm.Config) {
    g, err := gogm.New(&conf, gogm.DefaultPrimaryKeyStrategy, &model.KnowledgePage{})
    if err != nil {
        log.Err(err).Msg("")
        return
    }

    s, err := g.NewSessionV2(gogm.SessionConfig{AccessMode: gogm.AccessModeRead})
    if err != nil {
        log.Err(err).Msg("")
        return
    }

    var pages []*model.KnowledgePage
    err = s.LoadAllDepthFilter(context.Background(), &pages, 0, cypherdsl.C(&cypherdsl.ConditionConfig{
        ConditionOperator: cypherdsl.ContainsOperator,
        Name:              "n",
        Field:             "name",
        Check:             "e",
    }), nil)

    if err != nil {
        log.Err(err).Msg("")
        return
    }
    log.Debug().Msgf("Nodes count: %d", len(pages))
    return
}

Output:

...
{"level":"error","error":"failed auto read tx, no primary nodes to return, gogm: data not found","time":"2024-06-05T18:35:48+03:00"}
...
{"level":"debug","time":"2024-06-05T18:35:54+03:00","message":"Nodes count: 27"}