milvus-io / pymilvus

Python SDK for Milvus.
Apache License 2.0

[Bug]: [benchmark] release collection stuck #1183

Closed wangting0128 closed 1 year ago

wangting0128 commented 1 year ago

Is there an existing issue for this?

Describe the bug

argo task: fouramf-cron-1665676800
test case: test_search_time

pymilvus:2.2.0.dev32

server:

NAME                                                              READY   STATUS      RESTARTS        AGE     IP             NODE         NOMINATED NODE   READINESS GATES
fouramf-cron-1665676800-84-1156-etcd-0                            1/1     Running     0               15h     10.104.9.107   4am-node14   <none>           <none>
fouramf-cron-1665676800-84-1156-milvus-standalone-59c655bflgsf4   1/1     Running     1 (14h ago)     15h     10.104.1.90    4am-node10   <none>           <none>
fouramf-cron-1665676800-84-1156-minio-65dcccc8cf-74vbp            1/1     Running     0               15h     10.104.9.98    4am-node14   <none>           <none>

client pod: fouramf-cron-1665676800
client log:

[2022-10-13 16:48:25,679 -  INFO - fouram]: [Base] Params of search: nq:1200, anns_field:float_vector, param:{'metric_type': 'L2', 'params': {'nprobe': 32}}, limit:1000, expr:"None" (base.py:261)
[2022-10-13 16:48:25,878 - ERROR - fouram]: Traceback (most recent call last):
  File "/src/fouram/client/util/api_request.py", line 21, in inner_wrapper
    res = func(*args, **kwargs)
  File "/src/fouram/client/util/api_request.py", line 57, in api_request
    return func(*arg, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pymilvus/orm/collection.py", line 717, in search
    res = conn.search(self._name, data, anns_field, param, limit, expr,
  File "/usr/local/lib/python3.8/dist-packages/pymilvus/decorators.py", line 113, in handler
    raise e
  File "/usr/local/lib/python3.8/dist-packages/pymilvus/decorators.py", line 109, in handler
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pymilvus/decorators.py", line 139, in handler
    ret = func(self, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pymilvus/decorators.py", line 89, in handler
    raise e
  File "/usr/local/lib/python3.8/dist-packages/pymilvus/decorators.py", line 51, in handler
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pymilvus/client/grpc_handler.py", line 472, in search
    return self._execute_search_requests(requests, timeout, **_kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pymilvus/client/grpc_handler.py", line 436, in _execute_search_requests
    raise pre_err
  File "/usr/local/lib/python3.8/dist-packages/pymilvus/client/grpc_handler.py", line 427, in _execute_search_requests
    raise MilvusException(response.status.error_code, response.status.reason)
pymilvus.exceptions.MilvusException: <MilvusException: (code=1, message=collection:fouram_JIpc7h9n or partition:[] not loaded into memory when search)>
 (api_request.py:35)
[2022-10-13 16:48:25,878 - ERROR - fouram]: (api_response) : <MilvusException: (code=1, message=collection:fouram_JIpc7h9n or partition:[] not loaded into memory when search)> (api_request.py:36)
[2022-10-13 16:48:25,878 - ERROR - fouram]: [CheckFunc] search request check failed, response:<MilvusException: (code=1, message=collection:fouram_JIpc7h9n or partition:[] not loaded into memory when search)> (func_check.py:40)
[2022-10-13 16:48:25,878 - ERROR - fouram]: [Search] Search raise error:  (common_cases.py:401)
[2022-10-13 16:48:25,878 -  INFO - fouram]: [Base] Start clear collections (base.py:83)

gdb:

(gdb) py-bt
Traceback (most recent call first):
  File "/usr/lib/python3.8/threading.py", line 306, in wait
    gotit = waiter.acquire(True, timeout)
  File "/usr/local/lib/python3.8/dist-packages/grpc/_common.py", line 106, in _wait_once
    wait_fn(timeout=timeout)
  File "/usr/local/lib/python3.8/dist-packages/grpc/_common.py", line 141, in wait
    _wait_once(wait_fn, MAXIMUM_WAIT_TIMEOUT, spin_cb)
  File "/usr/local/lib/python3.8/dist-packages/grpc/_channel.py", line 989, in result
    self._request_serializer = request_serializer
  File "/usr/local/lib/python3.8/dist-packages/pymilvus/client/grpc_handler.py", line 714, in release_collection
    response = rf.result()
  File "/usr/local/lib/python3.8/dist-packages/pymilvus/decorators.py", line 51, in handler
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pymilvus/decorators.py", line 139, in handler
    ret = func(self, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pymilvus/decorators.py", line 109, in handler
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pymilvus/orm/collection.py", line 501, in release
    conn.release_collection(self._name, timeout=timeout, **kwargs)
  File "/src/fouram/client/util/api_request.py", line 313, in api_request
  File "/src/fouram/client/util/api_request.py", line 21, in inner_wrapper
    res = func(*args, **kwargs)
  File "/src/fouram/client/client_base/collection_wrapper.py", line 97, in release
    res, check = api_request([self.collection.release], **kwargs)
  File "/src/fouram/client/cases/base.py", line 84, in clear_collections
    self.collection_wrap.release()
  File "/src/fouram/client/cases/common_cases.py", line 1959, in scene_search
  <built-in method next of module object at remote 0x7f3c3ae990e0>
  File "/src/fouram/workflow/performance_template.py", line 1608, in serial_template
  File "/src/fouram/testcases/benchmark/performance.py", line 722, in test_search_time
  File "/usr/local/lib/python3.8/dist-packages/_pytest/python.py", line 183, in pytest_pyfunc_call
    result = testfunction(**testargs)
  File "/usr/local/lib/python3.8/dist-packages/pluggy/callers.py", line 443, in _multicall
  File "/usr/local/lib/python3.8/dist-packages/pluggy/manager.py", line 340, in <lambda>

  File "/usr/local/lib/python3.8/dist-packages/pluggy/manager.py", line 93, in _hookexec
    return self._inner_hookexec(hook, methods, kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pluggy/hooks.py", line 1054, in __call__
  File "/usr/local/lib/python3.8/dist-packages/_pytest/python.py", line 1641, in runtest
    self.ihook.pytest_pyfunc_call(pyfuncitem=self)
  File "/usr/local/lib/python3.8/dist-packages/_pytest/runner.py", line 162, in pytest_runtest_call
    item.runtest()
  File "/usr/local/lib/python3.8/dist-packages/pluggy/callers.py", line 443, in _multicall
  File "/usr/local/lib/python3.8/dist-packages/pluggy/manager.py", line 340, in <lambda>

  File "/usr/local/lib/python3.8/dist-packages/pluggy/manager.py", line 93, in _hookexec
    return self._inner_hookexec(hook, methods, kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pluggy/hooks.py", line 1054, in __call__
  File "/usr/local/lib/python3.8/dist-packages/_pytest/runner.py", line 255, in <lambda>
    lambda: ihook(item=item, **kwds), when=when, reraise=reraise
  File "/usr/local/lib/python3.8/dist-packages/_pytest/runner.py", line 311, in from_call
    result: Optional[TResult] = func()
  File "/usr/local/lib/python3.8/dist-packages/_pytest/runner.py", line 510, in call_runtest_hook
  File "/usr/local/lib/python3.8/dist-packages/_pytest/runner.py", line 215, in call_and_report
    call = call_runtest_hook(item, when, **kwds)
  File "/usr/local/lib/python3.8/dist-packages/_pytest/runner.py", line 126, in runtestprotocol
    reports.append(call_and_report(item, "call", log))
  File "/usr/local/lib/python3.8/dist-packages/_pytest/runner.py", line 109, in pytest_runtest_protocol
    runtestprotocol(item, nextitem=nextitem)
  File "/usr/local/lib/python3.8/dist-packages/pluggy/callers.py", line 443, in _multicall
  File "/usr/local/lib/python3.8/dist-packages/pluggy/manager.py", line 340, in <lambda>

  File "/usr/local/lib/python3.8/dist-packages/pluggy/manager.py", line 93, in _hookexec
    return self._inner_hookexec(hook, methods, kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pluggy/hooks.py", line 1054, in __call__
  File "/usr/local/lib/python3.8/dist-packages/_pytest/main.py", line 860, in pytest_runtestloop
    fspath = invocation_path / strpath
  File "/usr/local/lib/python3.8/dist-packages/pluggy/callers.py", line 443, in _multicall
  File "/usr/local/lib/python3.8/dist-packages/pluggy/manager.py", line 340, in <lambda>

  File "/usr/local/lib/python3.8/dist-packages/pluggy/manager.py", line 93, in _hookexec
    return self._inner_hookexec(hook, methods, kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pluggy/hooks.py", line 1054, in __call__
  File "/usr/local/lib/python3.8/dist-packages/_pytest/main.py", line 323, in _main
    config.hook.pytest_runtestloop(session=session)
  File "/usr/local/lib/python3.8/dist-packages/_pytest/main.py", line 269, in wrap_session
    session.exitstatus = doit(config, session) or 0
  File "/usr/local/lib/python3.8/dist-packages/_pytest/main.py", line 316, in pytest_cmdline_main
    return wrap_session(config, _main)
  File "/usr/local/lib/python3.8/dist-packages/pluggy/callers.py", line 443, in _multicall
  File "/usr/local/lib/python3.8/dist-packages/pluggy/manager.py", line 340, in <lambda>

  File "/usr/local/lib/python3.8/dist-packages/pluggy/manager.py", line 93, in _hookexec
    return self._inner_hookexec(hook, methods, kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pluggy/hooks.py", line 1054, in __call__
  File "/usr/local/lib/python3.8/dist-packages/_pytest/config/__init__.py", line 1186, in main

  File "/usr/local/lib/python3.8/dist-packages/_pytest/config/__init__.py", line 185, in console_main
    code = main()
  File "/usr/local/lib/python3.8/dist-packages/pytest/__main__.py", line 5, in <module>
    raise SystemExit(pytest.console_main())
  <built-in method exec of module object at remote 0x7f3c3ae990e0>
  File "/usr/lib/python3.8/runpy.py", line 341, in _run_code
  File "/usr/lib/python3.8/runpy.py", line 448, in _run_module_as_main
(gdb) 
(gdb) py-list
 301                if timeout is None:
 302                    waiter.acquire()
 303                    gotit = True
 304                else:
 305                    if timeout > 0:
>306                        gotit = waiter.acquire(True, timeout)
 307                    else:
 308                        gotit = waiter.acquire(False)
 309                return gotit
 310            finally:
 311                self._acquire_restore(saved_state)
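The backtrace shows the main thread parked in `rf.result()` inside `release_collection`, and the `MAXIMUM_WAIT_TIMEOUT` frame in `grpc/_common.py` suggests the wait is running without a deadline. Until the root cause is found, a benchmark client could guard such calls with a watchdog so the test fails fast instead of hanging. The following is a minimal sketch, not part of pymilvus or fouram; `call_with_deadline` and its parameters are hypothetical names introduced for illustration.

```python
import threading


def call_with_deadline(fn, deadline_s, *args, **kwargs):
    """Run fn(*args, **kwargs) in a daemon thread and give up after deadline_s seconds.

    Hypothetical helper: the blocked call itself is not cancelled, it is only
    abandoned so the caller can report the hang and move on.
    """
    outcome = {}

    def _runner():
        try:
            outcome["value"] = fn(*args, **kwargs)
        except BaseException as exc:  # hand any failure back to the caller
            outcome["error"] = exc

    # daemon=True so a stuck worker does not keep the interpreter alive at exit
    worker = threading.Thread(target=_runner, daemon=True)
    worker.start()
    worker.join(deadline_s)

    if worker.is_alive():
        raise TimeoutError(f"call did not return within {deadline_s}s")
    if "error" in outcome:
        raise outcome["error"]
    return outcome.get("value")
```

In the test harness this could wrap the release step, for example `call_with_deadline(collection.release, 120)`, where the 120-second budget is an arbitrary assumption.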

Expected Behavior

No response

Steps/Code To Reproduce behavior

1. Create collection
2. Build an IVF_FLAT index
3. Insert 50m vectors
4. Flush collection
5. Build the index again with the same params
6. Load collection
7. Search raises an error
8. Release collection <- stuck
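The steps above can be sketched roughly as follows, as a function over an already-created pymilvus `Collection` (step 1). The field name `float_vector`, the `L2` metric, `nprobe: 32` and `limit=1000` are taken from the client log; the function name `run_repro_steps`, the `nlist` value, the vector dimension, the batch sizes, and a schema with an auto-ID primary key plus a single vector field are assumptions, since the report does not state them. Passing `timeout` to `release` is meant to bound the wait in step 8; whether it does so may depend on the pymilvus version.

```python
import random


def run_repro_steps(collection, dim=128, total=1000, batch=500, nq=1200,
                    release_timeout=60):
    """Run steps 2-8 on a pymilvus Collection-like object; return the search error or None."""
    index_params = {
        "index_type": "IVF_FLAT",
        "metric_type": "L2",
        "params": {"nlist": 1024},  # assumption: nlist is not given in the report
    }
    collection.create_index("float_vector", index_params)    # step 2

    for _ in range(0, total, batch):                          # step 3 (report: 50m vectors)
        rows = [[random.random() for _ in range(dim)] for _ in range(batch)]
        # assumption: auto-ID primary key, so only the vector column is supplied
        collection.insert([rows])

    collection.flush()                                        # step 4
    collection.create_index("float_vector", index_params)    # step 5, same params
    collection.load()                                         # step 6

    search_param = {"metric_type": "L2", "params": {"nprobe": 32}}
    queries = [[random.random() for _ in range(dim)] for _ in range(nq)]
    search_error = None
    try:
        collection.search(queries, "float_vector", search_param, limit=1000)  # step 7
    except Exception as exc:
        search_error = exc
    finally:
        # step 8: the call that hung in this report; pass a timeout so it can give up
        collection.release(timeout=release_timeout)
    return search_error
```

Against a real server this would be called with a `Collection` obtained after `connections.connect(...)`; the defaults here are deliberately far smaller than the 50m vectors used in the benchmark.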

Environment details

- Hardware/Software conditions (OS, CPU, GPU, Memory):
- Method of installation (Docker, or from source):
- Milvus version (v0.3.1, or v0.4.0):
- Milvus configuration (Settings you made in `server_config.yaml`):

Anything else?

No response

XuanYang-cn commented 1 year ago

reopen if reproduced