milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

[Bug]: can not load collection #33181

Open yuege613 opened 5 months ago

yuege613 commented 5 months ago

Environment

- Milvus version: 2.4.1
- Deployment mode(standalone or cluster): standalone
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2): 2.4.1
- OS(Ubuntu or CentOS): ubuntu
- CPU/Memory: 
- GPU: 3090
- Others:

Current Behavior

[2024/05/20 07:51:04.845 +00:00] [WARN] [rootcoord/quota_center.go:1327] ["cannot find db id for collection"] [collection=449779225766320681]
[2024/05/20 07:51:04.845 +00:00] [WARN] [rootcoord/quota_center.go:1327] ["cannot find db id for collection"] [collection=449779225760902044]
[2024/05/20 07:51:04.845 +00:00] [WARN] [rootcoord/quota_center.go:1327] ["cannot find db id for collection"] [collection=449779225760904122]

File ~/miniconda3/lib/python3.12/site-packages/pymilvus/client/utils.py:62, in check_status(status)
     60 def check_status(status: Status):
     61     if status.code != 0 or status.error_code != 0:
---> 62         raise MilvusException(status.code, status.reason, status.error_code)

MilvusException: <MilvusException: (code=101, message=collection not loaded[collection=449888229865560632])>

Expected Behavior

sess.load()

Steps To Reproduce

1. Insert 4 GB or more of data.
2. Restart the server (a sudden restart).
3. Restart the milvus-standalone Docker container.
4. Connect to the collection.

Milvus Log

[2024/05/20 07:51:04.845 +00:00] [WARN] [rootcoord/quota_center.go:1327] ["cannot find db id for collection"] [collection=449779225766320681]
[2024/05/20 07:51:04.845 +00:00] [WARN] [rootcoord/quota_center.go:1327] ["cannot find db id for collection"] [collection=449779225760902044]
[2024/05/20 07:51:04.845 +00:00] [WARN] [rootcoord/quota_center.go:1327] ["cannot find db id for collection"] [collection=449779225760904122]
[2024/05/20 07:51:04.845 +00:00] [WARN] [rootcoord/quota_center.go:1327] ["cannot find db id for collection"] [collection=449779225760912000]
[2024/05/20 07:51:04.845 +00:00] [WARN] [rootcoord/quota_center.go:1327] ["cannot find db id for collection"] [collection=449779225760930336]
[2024/05/20 07:51:04.845 +00:00] [WARN] [rootcoord/quota_center.go:1327] ["cannot find db id for collection"] [collection=449779225769201068]
[2024/05/20 07:51:04.845 +00:00] [WARN] [rootcoord/quota_center.go:1327] ["cannot find db id for collection"] [collection=449816892734519500]
[2024/05/20 07:51:04.845 +00:00] [WARN] [rootcoord/quota_center.go:1327] ["cannot find db id for collection"] [collection=449779225767861501]
[2024/05/20 07:51:04.845 +00:00] [WARN] [rootcoord/quota_center.go:1327] ["cannot find db id for collection"] [collection=449779225760900331]
[2024/05/20 07:51:04.845 +00:00] [WARN] [rootcoord/quota_center.go:1327] ["cannot find db id for collection"] [collection=449779225760907368]
[2024/05/20 07:51:04.845 +00:00] [WARN] [rootcoord/quota_center.go:1327] ["cannot find db id for collection"] [collection=449779225760924209]
[2024/05/20 07:51:04.845 +00:00] [WARN] [rootcoord/quota_center.go:1327] ["cannot find db id for collection"] [collection=449779225760933362]
[2024/05/20 07:51:04.845 +00:00] [WARN] [rootcoord/quota_center.go:1327] ["cannot find db id for collection"] [collection=449779225766325291]
[2024/05/20 07:51:04.845 +00:00] [WARN] [rootcoord/quota_center.go:1327] ["cannot find db id for collection"] [collection=449779225769200446]
[2024/05/20 07:51:04.845 +00:00] [WARN] [rootcoord/quota_center.go:1327] ["cannot find db id for collection"] [collection=449779225760892437]
[2024/05/20 07:51:04.846 +00:00] [WARN] [rootcoord/quota_center.go:1327] ["cannot find db id for collection"] [collection=449779225760949676]
[2024/05/20 07:51:04.846 +00:00] [WARN] [rootcoord/quota_center.go:1327] ["cannot find db id for collection"] [collection=449779225760975108]
[2024/05/20 07:51:04.846 +00:00] [WARN] [rootcoord/quota_center.go:1327] ["cannot find db id for collection"] [collection=449779225771645770]
[2024/05/20 07:51:04.846 +00:00] [WARN] [rootcoord/quota_center.go:1327] ["cannot find db id for collection"] [collection=449779225760951457]
[2024/05/20 07:51:04.846 +00:00] [WARN] [rootcoord/quota_center.go:1327] ["cannot find db id for collection"] [collection=449779225760909599]
[2024/05/20 07:51:04.846 +00:00] [WARN] [rootcoord/quota_center.go:1327] ["cannot find db id for collection"] [collection=449779225760937087]

File ~/miniconda3/lib/python3.12/site-packages/pymilvus/client/utils.py:62, in check_status(status)
     60 def check_status(status: Status):
     61     if status.code != 0 or status.error_code != 0:
---> 62         raise MilvusException(status.code, status.reason, status.error_code)

MilvusException: <MilvusException: (code=101, message=collection not loaded[collection=449888229865560632])>

Anything else?

no

xiaofan-luan commented 5 months ago

/assign @SimFG

SimFG commented 5 months ago

@yuege613 What python milvus api did you use to get this error? Is it client v2 or something else?

yuege613 commented 5 months ago

@SimFG

def initialize(self):
    try:
        connections.connect(**self.conn_config, timeout=self.client_timeout)  # timeout=3 [cannot set]
        if utility.has_collection(self.kb_name, timeout=self.client_timeout):
            self.sess = Collection(self.kb_name)
            logger.info(f'collection {self.kb_name} exists')
        else:
            schema = CollectionSchema(self.fields)
            logger.info(f'create collection {self.kb_name} {schema}')
            self.sess = Collection(self.kb_name, schema)
            self.sess.create_index(field_name="embedding", index_params=self.create_params)
            logger.info(f"create index for {self.kb_name} done.")
        self.sess.load()
        logger.info(f"Milvus for collection {self.kb_name} initialize done.")
    except Exception as e:
        logger.error(f"Milvus client initialize error: {e}")

The call to self.sess.load() hangs without any response for a long time (maybe 30 minutes or more) and then raises this error:

File ~/miniconda3/lib/python3.12/site-packages/pymilvus/client/utils.py:62, in check_status(status)
     60 def check_status(status: Status):
     61     if status.code != 0 or status.error_code != 0:
---> 62         raise MilvusException(status.code, status.reason, status.error_code)

MilvusException: <MilvusException: (code=101, message=collection not loaded[collection=449888229865560632])>
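
Since load() is reported to block for 30 minutes or more before failing, one option is to start the load without blocking and poll the load state against a deadline, so the client fails fast and can log what state it last saw. The following is a minimal sketch, not the reporter's code: wait_until_loaded and its parameters are hypothetical names, and the pymilvus wiring shown in the trailing comment (load with _async=True, utility.load_state) is an assumption about how it could be connected and is not executed here.

```python
import time
from typing import Callable


def wait_until_loaded(
    get_state: Callable[[], str],
    timeout_s: float = 120.0,
    poll_interval_s: float = 2.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll get_state() until it returns "Loaded" or the deadline passes.

    Returns True if the state became "Loaded" in time, False on timeout.
    clock and sleep are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if get_state() == "Loaded":
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)


# Hypothetical wiring to pymilvus (assumption; needs a running Milvus server):
#   from pymilvus import Collection, utility
#   coll = Collection(kb_name)
#   coll.load(_async=True)  # request the load without blocking the caller
#   ok = wait_until_loaded(lambda: utility.load_state(kb_name).name)
#   if not ok:
#       print("load did not finish:", utility.loading_progress(kb_name))
```

Injecting the state getter keeps the polling logic independent of the SDK, so it can be exercised with a fake before pointing it at a real server.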

SimFG commented 5 months ago

@yuege613 Can you provide a complete log? The log lines quoted in the issue do not show the root cause of the problem; those warnings are a side effect of some earlier issue.

yuege613 commented 5 months ago

@SimFG

--> 182 return func(self, *args, **kwargs)

File ~/miniconda3/lib/python3.12/site-packages/pymilvus/decorators.py:122, in retry_on_rpc_failure.<locals>.wrapper.<locals>.handler(*args, **kwargs)
    120         back_off = min(back_off * back_off_multiplier, max_back_off)
    121     else:
--> 122         raise e from e
    123 except Exception as e:
    124     raise e from e

File ~/miniconda3/lib/python3.12/site-packages/pymilvus/decorators.py:87, in retry_on_rpc_failure.<locals>.wrapper.<locals>.handler(*args, **kwargs)
     85 while True:
     86     try:
---> 87         return func(*args, **kwargs)
     88     except grpc.RpcError as e:
     89         # Do not retry on these codes
     90         if e.code() in IGNORE_RETRY_CODES:

File ~/miniconda3/lib/python3.12/site-packages/pymilvus/client/grpc_handler.py:1267, in GrpcHandler.get_loading_progress(self, collection_name, partition_names, timeout, is_refresh)
   1265 request = Prepare.get_loading_progress(collection_name, partition_names)
   1266 response = self._stub.GetLoadingProgress.future(request, timeout=timeout).result()
-> 1267 check_status(response.status)
   1268 if is_refresh:
   1269     return response.refresh_progress

File ~/miniconda3/lib/python3.12/site-packages/pymilvus/client/utils.py:62, in check_status(status)
     60 def check_status(status: Status):
     61     if status.code != 0 or status.error_code != 0:
---> 62         raise MilvusException(status.code, status.reason, status.error_code)

MilvusException: <MilvusException: (code=101, message=collection not loaded[collection=449888229865560632])>

I don’t have the complete error log, as I only copied this much. When I reran it last night, it kept hanging there without responding.

SimFG commented 5 months ago

@yuege613 How did you deploy milvus, docker or helm? You can try to upload the log file.

yuege613 commented 5 months ago

@SimFG Docker. Log file: milvus_part.log

SimFG commented 5 months ago

@yuege613 OK, I will try to analyze it. Thanks for the log file.

SimFG commented 5 months ago

@yuege613 How much data does your milvus service contain? Through the log provided, I found that there are 742 collections. What is your milvus running configuration like? I suspect that the machine resources for the docker container are insufficient.
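
The thread does not say how the collection count was derived from the log. As one possible approach, the distinct collection IDs named in the "cannot find db id for collection" warnings can be counted with a regular expression. This is a rough sketch using only the warning format quoted in this issue; count_collections_in_log is a hypothetical helper, and the count it gives covers only collections that appear in such warnings.

```python
import re
from collections import Counter

# Captures the collection ID from warnings shaped like the ones in this issue:
# ... ["cannot find db id for collection"] [collection=449779225766320681]
WARNING_RE = re.compile(
    r'\["cannot find db id for collection"\] \[collection=(\d+)\]'
)


def count_collections_in_log(log_text: str) -> Counter:
    """Return how many times each collection ID appears in the warnings."""
    return Counter(WARNING_RE.findall(log_text))


# Two entries copied from the log above, run together on one line as pasted.
sample = (
    '[2024/05/20 07:51:04.845 +00:00] [WARN] [rootcoord/quota_center.go:1327] '
    '["cannot find db id for collection"] [collection=449779225766320681] '
    '[2024/05/20 07:51:04.845 +00:00] [WARN] [rootcoord/quota_center.go:1327] '
    '["cannot find db id for collection"] [collection=449779225760902044]'
)
counts = count_collections_in_log(sample)
print(len(counts), "distinct collection IDs")
```

Because the pattern does not depend on line breaks, it works on logs where several entries were pasted onto one line.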

yuege613 commented 5 months ago

@SimFG The size of the volumes is only 2.6G (many collections hold relatively small amounts of data). Milvus uses the default settings from the standalone_embed.sh script, except that quota-backend-bytes has been increased to 94294967296.

yuege613 commented 5 months ago

@SimFG Does Milvus have any limitations on the number of collections?

SimFG commented 5 months ago

The current number of collections does not exceed the limit. When you start Milvus, are the computer's memory and CPU usage at a high level?

What is your computer configuration roughly like, and what is the estimated number of data rows across all collections? You can try deleting most of the collections, or starting an empty Milvus instance, and then inserting data. Note the number of rows inserted and see whether the collection can be loaded. If possible, create a new collection, insert a large amount of data, and try loading again.
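
To answer the row-count question with several hundred collections, the per-collection counts can be gathered in a loop and totaled. The sketch below is illustrative only: summarize_row_counts is a hypothetical helper, the demo uses made-up names and numbers, and the pymilvus wiring in the trailing comment (utility.list_collections, Collection.num_entities, a default localhost:19530 address) is an assumption that is not executed here.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def summarize_row_counts(
    names: Iterable[str],
    count_rows: Callable[[str], int],
    top_n: int = 5,
) -> Tuple[int, List[Tuple[str, int]]]:
    """Return (total rows, the top_n largest collections by row count)."""
    counts: Dict[str, int] = {name: count_rows(name) for name in names}
    largest = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
    return sum(counts.values()), largest


# Demo with made-up data standing in for a live server.
fake_counts = {"kb_a": 1200, "kb_b": 30, "kb_c": 500000}
total, largest = summarize_row_counts(fake_counts, fake_counts.__getitem__, top_n=2)
print("total rows:", total)
print("largest:", largest)

# Hypothetical wiring to pymilvus (assumption; needs a running Milvus server):
#   from pymilvus import Collection, connections, utility
#   connections.connect(host="localhost", port="19530")
#   total, largest = summarize_row_counts(
#       utility.list_collections(),
#       lambda name: Collection(name).num_entities,
#   )
```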