milvus-io / milvus

[Bug]: Milvus encountered a panic while running full-text search test cases, with an error message `panic: runtime error: invalid memory address or nil pointer dereference` #36949

Closed by zhuwenxing 1 month ago

zhuwenxing commented 1 month ago

Environment

- Milvus version: master-4d08eec-20241017
- Deployment mode(standalone or cluster):
- MQ type(rocksmq, pulsar or kafka):
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS):
- CPU/Memory:
- GPU:
- Others:

Current Behavior

[2024/10/17 06:12:14.781 +00:00] [INFO] [segments/segment_loader.go:815] ["Finish loading segment"] [traceID=d85c94c5f4f000c675e7999f7dadccb6] [segmentID=453282341957843091] [loadFieldsIndexSpan=10.632µs] [complementScalarDataSpan=2.016µs] [loadRawDataSpan=34.795128ms] [patchEntryNumberSpan=540ns] [loadTextIndexesSpan=326ns]
I20241017 06:12:14.784330   688 SegmentSealedImpl.cpp:286] [SERVER][LoadFieldData][milvus] segment 453282341957842175 loads field 103 mmap false done
I20241017 06:12:14.784876   693 SegmentSealedImpl.cpp:286] [SERVER][LoadFieldData][milvus] segment 453282341957843999 loads field 102 mmap false done
github.com/milvus-io/milvus/pkg/util/typeutil.(*ConcurrentMap[go.shape.int64,go.shape.map[int64]*github.com/milvus-io/milvus/internal/storage.BM25Stats]).Range
    /go/src/github.com/milvus-io/milvus/pkg/util/typeutil/map.go:51 pc=0x5bb02b8
github.com/milvus-io/milvus/internal/querynodev2/delegator.(*shardDelegator).loadStreamDelete
    /go/src/github.com/milvus-io/milvus/internal/querynodev2/delegator/delegator_data.go:694 pc=0x5bb02b8

[2024/10/17 06:12:16.415 +00:00] [INFO] [segments/segment.go:965] ["load field done"] [traceID=d85c94c5f4f000c675e7999f7dadccb6] [collectionID=453282341956800481] [partitionID=453282341956800493] [segmentID=453282341957842175] [fieldID=103] [rowCount=241]
[2024/10/17 06:12:16.415 +00:00] [INFO] [segments/segment_loader.go:989] ["load field binlogs done for sealed segment"] [traceID=d85c94c5f4f000c675e7999f7dadccb6] [collection=453282341956800481] [segment=453282341957842175] [len(field)=9] [segmentType=Sealed]
[2024/10/17 06:12:16.415 +00:00] [INFO] [segments/segment_loader.go:815] ["Finish loading segment"] [traceID=d85c94c5f4f000c675e7999f7dadccb6] [segmentID=453282341957842175] [loadFieldsIndexSpan=13.41µs] [complementScalarDataSpan=4.101µs] [loadRawDataSpan=1.671292536s] [patchEntryNumberSpan=454ns] [loadTextIndexesSpan=388ns]
[2024/10/17 06:12:16.415 +00:00] [INFO] [segments/segment.go:965] ["load field done"] [traceID=d85c94c5f4f000c675e7999f7dadccb6] [collectionID=453282341956800481] [partitionID=453282341956800488] [segmentID=453282341957843999] [fieldID=102] [rowCount=208]
[2024/10/17 06:12:16.415 +00:00] [INFO] [segments/segment_loader.go:989] ["load field binlogs done for sealed segment"] [traceID=d85c94c5f4f000c675e7999f7dadccb6] [collection=453282341956800481] [segment=453282341957843999] [len(field)=9] [segmentType=Sealed]
[2024/10/17 06:12:16.415 +00:00] [INFO] [segments/segment_loader.go:815] ["Finish loading segment"] [traceID=d85c94c5f4f000c675e7999f7dadccb6] [segmentID=453282341957843999] [loadFieldsIndexSpan=12.082µs] [complementScalarDataSpan=1.603µs] [loadRawDataSpan=1.670747956s] [patchEntryNumberSpan=521ns] [loadTextIndexesSpan=2.129µs]
[2024/10/17 06:12:16.429 +00:00] [DEBUG] [delegator/delegator_data.go:465] ["work loads segments done"] [traceID=d85c94c5f4f000c675e7999f7dadccb6] [collectionID=453282341956800481] [channel=full-text-search-test-master-rootcoord-dml_11_453282341956800481v0] [replicaID=453282342328336394] [workID=33] [segments="[453282341957842168]"]
[2024/10/17 06:12:16.429 +00:00] [INFO] [segments/segment_loader.go:558] ["start loading bm25 stats for remote..."] [collectionID=453282341956800481] [segmentIDs="[453282341957842168]"] [segmentNum=1]
[2024/10/17 06:12:16.429 +00:00] [INFO] [segments/segment_loader.go:566] ["loading bm25 stats for remote..."] [collectionID=453282341956800481] [segment=453282341957842168]
[2024/10/17 06:12:16.430 +00:00] [INFO] [querynodev2/services.go:203] ["received watch channel request"] [traceID=cc71179ce80324c77bb08fa800b9ec29] [collectionID=453237890346671313] [channel=full-text-search-test-master-rootcoord-dml_12_453237890346671313v2] [currentNodeID=43] [version=1729145535343371244]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x5bb02b8]

goroutine 52224 gp=0xc0017e6fc0 m=42 mp=0xc00185e808 [running]:
panic({0x628bd00?, 0x9babfe0?})
    /go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.0.linux-amd64/src/runtime/panic.go:779 +0x158 fp=0xc00129e750 sp=0xc00129e6a0 pc=0x1f82ad8
runtime.panicmem(...)
    /go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.0.linux-amd64/src/runtime/panic.go:261
runtime.sigpanic()
    /go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.0.linux-amd64/src/runtime/signal_unix.go:881 +0x378 fp=0xc00129e7b0 sp=0xc00129e750 pc=0x1f9d7d8
github.com/milvus-io/milvus/pkg/util/typeutil.(*ConcurrentMap[...]).Range(...)
    /go/src/github.com/milvus-io/milvus/pkg/util/typeutil/map.go:51
github.com/milvus-io/milvus/internal/querynodev2/delegator.(*shardDelegator).loadStreamDelete(0xc00269cea0, {0x71cdee8, 0xc003c3c420}, {0x0, 0x0, 0x0}, 0x0, {0x9ec62e0, 0x0, 0x0}, ...)
    /go/src/github.com/milvus-io/milvus/internal/querynodev2/delegator/delegator_data.go:694 +0xdb8 fp=0xc00129ec68 sp=0xc00129e7b0 pc=0x5bb02b8
github.com/milvus-io/milvus/internal/querynodev2/delegator.(*shardDelegator).LoadSegments(0xc00269cea0, {0x71cdee8, 0xc003c3c420}, 0xc0013300c0)
    /go/src/github.com/milvus-io/milvus/internal/querynodev2/delegator/delegator_data.go:505 +0xc05 fp=0xc00129f1b8 sp=0xc00129ec68 pc=0x5badbe5
github.com/milvus-io/milvus/internal/querynodev2.(*QueryNode).LoadSegments(0xc0016dc000, {0x71cdee8, 0xc003c3c420}, 0xc0013300c0)
    /go/src/github.com/milvus-io/milvus/internal/querynodev2/services.go:459 +0xbde fp=0xc00129f700 sp=0xc00129f1b8 pc=0x5bef87e
github.com/milvus-io/milvus/internal/distributed/querynode.(*Server).LoadSegments(0x9bb7980?, {0x71cdee8?, 0xc003c3c420?}, 0xc001143408?)
    /go/src/github.com/milvus-io/milvus/internal/distributed/querynode/service.go:301 +0x25 fp=0xc00129f730 sp=0xc00129f700 pc=0x5c08465
github.com/milvus-io/milvus/internal/proto/querypb._QueryNode_LoadSegments_Handler.func1({0x71cdee8?, 0xc003c3c420?}, {0x67f5d20?, 0xc0013300c0?})
    /go/src/github.com/milvus-io/milvus/internal/proto/querypb/query_coord_grpc.pb.go:2037 +0xcb fp=0xc00129f768 sp=0xc00129f730 pc=0x2ea3deb
github.com/milvus-io/milvus/internal/distributed/querynode.(*Server).startGrpcLoop.ServerIDValidationUnaryServerInterceptor.func7({0x71cdee8, 0xc003c3c420}, {0x67f5d20, 0xc0013300c0}, 0x2e0df34?, 0xc00302e090)
    /go/src/github.com/milvus-io/milvus/pkg/util/interceptor/server_id_interceptor.go:54 +0xe5 fp=0xc00129f7b0 sp=0xc00129f768 pc=0x5c07925
github.com/milvus-io/milvus/internal/distributed/querynode.(*Server).startGrpcLoop.ChainUnaryServer.func8.1.1({0x71cdee8?, 0xc003c3c420?}, {0x67f5d20?, 0xc0013300c0?})
    /go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@v1.3.0/chain.go:25 +0x34 fp=0xc00129f7f0 sp=0xc00129f7b0 pc=0x5c077f4
github.com/milvus-io/milvus/internal/distributed/querynode.(*Server).startGrpcLoop.ClusterValidationUnaryServerInterceptor.func6({0x71cdee8, 0xc003c3c420}, {0x67f5d20, 0xc0013300c0}, 0x20?, 0xc0001808e0)
    /go/src/github.com/milvus-io/milvus/pkg/util/interceptor/cluster_interceptor.go:48 +0xc8 fp=0xc00129f848 sp=0xc00129f7f0 pc=0x5c091e8
github.com/milvus-io/milvus/internal/distributed/querynode.(*Server).startGrpcLoop.ChainUnaryServer.func8.1.1({0x71cdee8?, 0xc003c3c420?}, {0x67f5d20?, 0xc0013300c0?})
    /go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@v1.3.0/chain.go:25 +0x34 fp=0xc00129f888 sp=0xc00129f848 pc=0x5c077f4
github.com/milvus-io/milvus/pkg/util/logutil.UnaryTraceLoggerInterceptor({0x71cdee8?, 0xc003c3c390?}, {0x67f5d20, 0xc0013300c0}, 0x18?, 0xc000180900)
    /go/src/github.com/milvus-io/milvus/pkg/util/logutil/grpc_interceptor.go:23 +0x43 fp=0xc00129f8b8 sp=0xc00129f888 pc=0x4256bc3
github.com/milvus-io/milvus/internal/distributed/querynode.(*Server).startGrpcLoop.ChainUnaryServer.func8.1.1({0x71cdee8?, 0xc003c3c390?}, {0x67f5d20?, 0xc0013300c0?})
    /go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@v1.3.0/chain.go:25 +0x34 fp=0xc00129f8f8 sp=0xc00129f8b8 pc=0x5c077f4
github.com/milvus-io/milvus/internal/distributed/querynode.(*Server).startGrpcLoop.ChainUnaryServer.func8({0x71cdee8, 0xc003c3c390}, {0x67f5d20, 0xc0013300c0}, 0x1f54525?, 0x80?)
    /go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@v1.3.0/chain.go:34 +0xb8 fp=0xc00129f950 sp=0xc00129f8f8 pc=0x5c07698
github.com/milvus-io/milvus/internal/proto/querypb._QueryNode_LoadSegments_Handler({0x6895700, 0xc00088ed80}, {0x71cdee8, 0xc003c3c390}, 0xc00355a480, 0xc001e75c50)
    /go/src/github.com/milvus-io/milvus/internal/proto/querypb/query_coord_grpc.pb.go:2039 +0x143 fp=0xc00129f9a0 sp=0xc00129f950 pc=0x2ea3c43
google.golang.org/grpc.(*Server).processUnaryRPC(0xc001f1c200, {0x71cdee8, 0xc003c3c210}, {0x71f4160, 0xc0018a4480}, 0xc002d84fc0, 0xc001e75dd0, 0x9c1aad8, 0x0)
    /go/pkg/mod/google.golang.org/grpc@v1.65.0/server.go:1379 +0xdf8 fp=0xc00129fda0 sp=0xc00129f9a0 pc=0x2741098
google.golang.org/grpc.(*Server).handleStream(0xc001f1c200, {0x71f4160, 0xc0018a4480}, 0xc002d84fc0)
    /go/pkg/mod/google.golang.org/grpc@v1.65.0/server.go:1790 +0xe8b fp=0xc00129ff78 sp=0xc00129fda0 pc=0x2745f6b
google.golang.org/grpc.(*Server).serveStreams.func2.1()
    /go/pkg/mod/google.golang.org/grpc@v1.65.0/server.go:1029 +0x8b fp=0xc00129ffe0 sp=0xc00129ff78 pc=0x273f10b
runtime.goexit({})
    /go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.0.linux-amd64/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc00129ffe8 sp=0xc00129ffe0 pc=0x1fc1941
created by google.golang.org/grpc.(*Server).serveStreams.func2 in goroutine 500
    /go/pkg/mod/google.golang.org/grpc@v1.65.0/server.go:1040 +0x125

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

log.log

Anything else?

This panic occurs when running all full-text search cases, and it's difficult to pinpoint which specific step triggers the issue.

zhuwenxing commented 1 month ago

/assign @zhengbuqian PTAL

aoiasd commented 1 month ago

This is caused by loading a segment that already exists in pkoracle and idforacle. When there is no info to load, we don't init the bm25stats map, so it is still nil when `loadStreamDelete` ranges over it. Some sync segment or load index actions may trigger this case. Will fix in https://github.com/milvus-io/milvus/pull/36959.
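
For readers unfamiliar with this failure mode, here is a minimal Go sketch of the pattern described above. It is not Milvus code and may not match the fix in the linked PR. `statsMap`, `loader`, `sumUnsafe`, `sumSafe`, and `didPanic` are hypothetical names made up for illustration. The idea is that a map wrapper allocated only when there is data to load stays nil otherwise, and calling a method that dereferences it then panics.

```go
package main

import (
	"fmt"
	"sync"
)

// statsMap is a hypothetical stand-in for a concurrent map wrapper.
// It is NOT Milvus's typeutil.ConcurrentMap.
type statsMap struct {
	mu   sync.RWMutex
	data map[int64]float64
}

// Range touches m.mu, so calling it on a nil *statsMap panics with
// "invalid memory address or nil pointer dereference".
func (m *statsMap) Range(fn func(k int64, v float64) bool) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	for k, v := range m.data {
		if !fn(k, v) {
			return
		}
	}
}

// loader is a hypothetical holder for lazily created stats.
type loader struct {
	stats *statsMap
}

// load allocates stats only when there is something to load,
// mirroring the behaviour described in the comment above.
func (l *loader) load(infos map[int64]float64) {
	if len(infos) == 0 {
		return // stats stays nil
	}
	l.stats = &statsMap{data: make(map[int64]float64, len(infos))}
	for k, v := range infos {
		l.stats.data[k] = v
	}
}

// sumUnsafe ranges without checking for nil and panics if nothing was loaded.
func sumUnsafe(l *loader) float64 {
	total := 0.0
	l.stats.Range(func(_ int64, v float64) bool { total += v; return true })
	return total
}

// sumSafe guards against the uninitialized case before ranging.
func sumSafe(l *loader) float64 {
	if l.stats == nil {
		return 0
	}
	return sumUnsafe(l)
}

// didPanic reports whether f panicked.
func didPanic(f func()) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	f()
	return false
}

func main() {
	l := &loader{}
	l.load(nil) // nothing to load, so stats is never initialized
	fmt.Println("unsafe range panicked:", didPanic(func() { sumUnsafe(l) }))
	fmt.Println("safe sum:", sumSafe(l))
}
```

Either a nil check at the call site (as in `sumSafe`) or always initializing the map in `load` avoids the panic; which approach the actual fix takes is not stated in this thread.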

zhengbuqian commented 1 month ago

please verify, thanks!

zhuwenxing commented 1 month ago

Verification is blocked by another panic, https://github.com/milvus-io/milvus/issues/36992. Waiting for that fix.

zhuwenxing commented 1 month ago

not reproduced in master-346510e-20241021