Crashloop with `panic: assignment to entry in nil map`

etiennedi commented 9 months ago

How to reproduce this bug?

Run a k8s cluster on small nodes with tight memory limits.
Use many pods and a high replication factor (for example 6).
Set GOMEMLIMIT and GOGC badly, so that OOMKills are likely.
Make sure there is constant import load, for example with this script.
For additional chaos, constantly resize the cluster. This leads to a lot of pods being evicted and rescheduled on other nodes.
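
To double-check which GOMEMLIMIT and GOGC values a Go process actually ended up with, something like the sketch below could be run with the same environment as the pod. It only uses `runtime/debug` from the standard library. The helper names and the `limitExceedsContainer` check are hypothetical and not part of Weaviate. The idea that a soft limit at or above the container limit counts as "badly set" is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"math"
	"runtime/debug"
)

// effectiveMemoryLimit returns the runtime's current soft memory limit in
// bytes. A negative argument leaves the limit unchanged and only reads it.
func effectiveMemoryLimit() int64 {
	return debug.SetMemoryLimit(-1)
}

// effectiveGCPercent returns the current GC percentage. There is no plain
// getter, so it sets a temporary value and restores the previous one.
func effectiveGCPercent() int {
	prev := debug.SetGCPercent(100)
	debug.SetGCPercent(prev)
	return prev
}

// limitExceedsContainer is a hypothetical check: a Go soft limit at or above
// the container memory limit cannot keep the process below that limit.
func limitExceedsContainer(goLimit, containerLimit int64) bool {
	return goLimit >= containerLimit
}

func main() {
	limit := effectiveMemoryLimit()
	if limit == math.MaxInt64 {
		fmt.Println("GOMEMLIMIT: not set (no soft limit)")
	} else {
		fmt.Printf("GOMEMLIMIT: %d bytes\n", limit)
	}
	fmt.Printf("GOGC: %d\n", effectiveGCPercent())

	// Assumed container limit of 256 MiB, purely for illustration.
	const containerLimit = 256 << 20
	fmt.Println("limit too high for container:", limitExceedsContainer(limit, containerLimit))
}
```

A GOGC value of -1 here means the collector is switched off (`GOGC=off`).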

What is the expected behavior?

Everything is crash-resistant

What is the actual behavior?

After a while at least one of the pods is going to be in a crash loop logging something like the following:

panic: assignment to entry in nil map

goroutine 600 [running]:
github.com/weaviate/weaviate/adapters/repos/db/inverted.(*JsonPropertyLengthTracker).TrackProperty(0xc004b64240, {0xc003a6c290, 0x5}, 0x40a00000)
        /go/src/github.com/weaviate/weaviate/adapters/repos/db/inverted/new_prop_length_tracker.go:181 +0x3c8
github.com/weaviate/weaviate/adapters/repos/db.(*Shard).SetPropertyLengths(0xc003724b40, {0xc0063f1a40?, 0x4, 0x29cace0?})
        /go/src/github.com/weaviate/weaviate/adapters/repos/db/shard_write_inverted_lsm.go:217 +0x8f
github.com/weaviate/weaviate/adapters/repos/db.(*Shard).updateInvertedIndexLSM(0xc003724b40, 0xc0063f1880, {0xc0063a5310?, 0xe0?, 0x10?}, {0x0, 0x0, 0x0})
        /go/src/github.com/weaviate/weaviate/adapters/repos/db/shard_write_put.go:299 +0x5ea
github.com/weaviate/weaviate/adapters/repos/db.(*Shard).putObjectLSM(0xc003724b40, 0xc0063f1880, {0xc0063a5310, 0x10, 0x10})
        /go/src/github.com/weaviate/weaviate/adapters/repos/db/shard_write_put.go:164 +0x725
github.com/weaviate/weaviate/adapters/repos/db.(*objectsBatcher).storeObjectOfBatchInLSM(0xc005405780, {0x1da6900, 0x29feb80}, 0x0?, 0xc0063f1880)
        /go/src/github.com/weaviate/weaviate/adapters/repos/db/shard_write_batch_objects.go:201 +0xd1
github.com/weaviate/weaviate/adapters/repos/db.(*objectsBatcher).storeSingleBatchInLSM.func1(0xf9, 0x0?)
        /go/src/github.com/weaviate/weaviate/adapters/repos/db/shard_write_batch_objects.go:173 +0x115
created by github.com/weaviate/weaviate/adapters/repos/db.(*objectsBatcher).storeSingleBatchInLSM in goroutine 283
        /go/src/github.com/weaviate/weaviate/adapters/repos/db/shard_write_batch_objects.go:159 +0xc5

Supporting information

Discovered as part of #4125 investigations.

Server Version

v1.23.7

Code of Conduct

[X] I have read and agree to Weaviate's Contributor Guide and Code of Conduct

etiennedi commented 9 months ago

Keep open until all #4125 changes are released, then re-evaluate. This should be fixed by those changes, too.

dirkkul commented 3 weeks ago

I think this is resolved

weaviate / weaviate

Crashloop with `panic: assignment to entry in nil map` #4128