nspcc-dev / neofs-node

NeoFS is a decentralized distributed object storage integrated with the Neo blockchain
https://fs.neo.org
GNU General Public License v3.0

Panic in binary replication #2978

Open carpawell opened 4 weeks ago

carpawell commented 4 weeks ago
окт 23 14:47:46 metis3 neofs-node[2911]: 2024/10/23 14:47:46.517936 [ants]: worker exits from panic: runtime error: slice bounds out of range [109:0]
окт 23 14:47:46 metis3 neofs-node[2911]: goroutine 340953791 [running]:
окт 23 14:47:46 metis3 neofs-node[2911]: runtime/debug.Stack()
окт 23 14:47:46 metis3 neofs-node[2911]:         runtime/debug/stack.go:24 +0x5e
окт 23 14:47:46 metis3 neofs-node[2911]: github.com/panjf2000/ants/v2.(*goWorker).run.func1.1()
окт 23 14:47:46 metis3 neofs-node[2911]:         github.com/panjf2000/ants/v2@v2.9.0/worker.go:56 +0x85
окт 23 14:47:46 metis3 neofs-node[2911]: panic({0x1102a00?, 0xc02d60d9c8?})
окт 23 14:47:46 metis3 neofs-node[2911]:         runtime/panic.go:770 +0x132
окт 23 14:47:46 metis3 neofs-node[2911]: github.com/nspcc-dev/neofs-node/pkg/services/object/put.putObjectLocally({0x13fc108, 0xc00033e930}, 0xc072bb18c0, {0x12827d10?, {0x0?, 0xc00d581d30?, 0xc>
окт 23 14:47:46 metis3 neofs-node[2911]:         github.com/nspcc-dev/neofs-node/pkg/services/object/put/local.go:80 +0x34d
окт 23 14:47:46 metis3 neofs-node[2911]: github.com/nspcc-dev/neofs-node/pkg/services/object/put.(*localTarget).Close(0xc0331a59d0)
окт 23 14:47:46 metis3 neofs-node[2911]:         github.com/nspcc-dev/neofs-node/pkg/services/object/put/local.go:51 +0x4f
окт 23 14:47:46 metis3 neofs-node[2911]: github.com/nspcc-dev/neofs-node/pkg/services/object/put.(*distributedTarget).sendObject(0xc03357a800, {0x1, {{0xc032dfb7b0, 0x1, 0x1}, {0x0, 0x0, 0x0}, {>
окт 23 14:47:46 metis3 neofs-node[2911]:         github.com/nspcc-dev/neofs-node/pkg/services/object/put/distributed.go:208 +0x162
окт 23 14:47:46 metis3 neofs-node[2911]: github.com/nspcc-dev/neofs-node/pkg/services/object/put.(*distributedTarget).iteratePlacement.func1()
окт 23 14:47:46 metis3 neofs-node[2911]:         github.com/nspcc-dev/neofs-node/pkg/services/object/put/distributed.go:250 +0x11c
окт 23 14:47:46 metis3 neofs-node[2911]: github.com/panjf2000/ants/v2.(*goWorker).run.func1()
окт 23 14:47:46 metis3 neofs-node[2911]:         github.com/panjf2000/ants/v2@v2.9.0/worker.go:67 +0x8d
окт 23 14:47:46 metis3 neofs-node[2911]: created by github.com/panjf2000/ants/v2.(*goWorker).run in goroutine 340954422
окт 23 14:47:46 metis3 neofs-node[2911]:         github.com/panjf2000/ants/v2@v2.9.0/worker.go:48 +0x5c

Expected Behavior

No panic

Current Behavior

Panic

Possible Solution

Do not slice out of range?

Steps to Reproduce (for bugs)

Not sure. It shows up under the kind of high load with blocks that @AnnaShaleva and @AliceInHunterland usually generate.

Context

Loading neo-go blocks to NeoFS.

Your Environment

NeoFS Storage node Version: 0.43.0 GoVersion: go1.22.6

carpawell commented 4 weeks ago

The panic was always preceded by this log entry: util/log.go:19 could not push task to worker pool {"request": "PUT", "error": "too many goroutines blocked on submit or Nonblocking is set"}

roman-khimov commented 2 weeks ago

  1. We have a lot (thousands) of these on all machines.
  2. Happens during load.
  3. Not related to network map changes.
  4. Offset is always 109:0.
  5. Tends to come in series of 3-5 events at sub-second level on a single machine.

carpawell commented 1 week ago

@roman-khimov, I think we found the root cause of it? We also know that it will not be possible in 0.44.

roman-khimov commented 1 week ago

It will still be possible if you don't fix it. master can still have this problem.