Closed liuchuan01 closed 3 months ago
This problem is rather odd. Outside of LangChain-chatchat, the following works fine and GPU memory is more than sufficient:
emb_model = HuggingFaceEmbeddings(model_name=hf_model_path, model_kwargs=model_kwargs)
vectorstore = FAISS.from_documents(documents, emb_model)
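While watching for the point where the exception above occurs, nvidia-smi showed GPU memory holding steady at around 4 GB, then suddenly spiking to 10 GB and dropping back to 3 GB. Right after that, the error above appeared.
I have confirmed that the framework is not the cause. With BGE-m3, GPU memory usage grows with the length of the input text, and one abnormal chunk in my data was more than 13,000 characters long, so....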
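For anyone hitting the same thing, here is a minimal sketch of checking chunk lengths before embedding, assuming the chunks are available as plain strings. The names `MAX_CHARS`, `find_oversized`, and `split_oversized` are hypothetical, and the 2000-character cap is an arbitrary illustration, not a value recommended for BGE-m3 or used by LangChain-chatchat.

```python
# Hypothetical helpers: the names and the limit below are illustrative only,
# they are not part of LangChain-chatchat or BGE-m3.
MAX_CHARS = 2000  # assumed cap per chunk; tune for your GPU and model


def find_oversized(texts, max_chars=MAX_CHARS):
    """Return (index, length) for every chunk longer than max_chars."""
    return [(i, len(t)) for i, t in enumerate(texts) if len(t) > max_chars]


def split_oversized(texts, max_chars=MAX_CHARS):
    """Cut any chunk longer than max_chars into fixed-size pieces."""
    out = []
    for t in texts:
        if len(t) <= max_chars:
            out.append(t)
        else:
            # Naive fixed-width split; a real splitter would respect
            # sentence or paragraph boundaries.
            out.extend(t[i:i + max_chars] for i in range(0, len(t), max_chars))
    return out
```

With LangChain documents, the input list could be built as `[d.page_content for d in documents]`. Running `find_oversized` on it first would have flagged the 13,000-character chunk before it reached the GPU.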
Problem Description
bge-m3 incompatibility / abnormal GPU memory allocation
Similar to issue #4101.
Steps to Reproduce
Expected Result
Completes normally.
Actual Result
2024-08-06 13:58:26,462 - embeddings_api.py[line:40] - ERROR: CUDA out of memory. Tried to allocate 18.88 GiB. GPU 0 has a total capacty of 23.48 GiB of which 1.53 GiB is free. Including non-PyTorch memory, this process has 21.92 GiB memory in use. Of the allocated memory 21.61 GiB is allocated by PyTorch, and 14.47 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
ERROR: Exception in ASGI application
Traceback (most recent call last):
  File "/home/star/miniconda3/envs/langchain-test/lib/python3.11/site-packages/sse_starlette/sse.py", line 269, in __call__
    await wrap(partial(self.listen_for_disconnect, receive))
  File "/home/star/miniconda3/envs/langchain-test/lib/python3.11/site-packages/sse_starlette/sse.py", line 258, in wrap
    await func()
  File "/home/star/miniconda3/envs/langchain-test/lib/python3.11/site-packages/sse_starlette/sse.py", line 215, in listen_for_disconnect
    message = await receive()
              ^^^^^^^^^^^^^^^
  File "/home/star/miniconda3/envs/langchain-test/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py", line 568, in receive
    await self.message_event.wait()
  File "/home/star/miniconda3/envs/langchain-test/lib/python3.11/asyncio/locks.py", line 213, in wait
    await fut
asyncio.exceptions.CancelledError: Cancelled by cancel scope 713285512450
During handling of the above exception, another exception occurred:
Environment Information