Closed ANemo-yj closed 4 months ago
Which vector database are you using? Each vector database has its own deletion method, for example the one for faiss.
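I set it up by following "Windows本地搭建语义检索系统" (building a semantic search system locally on Windows), so it should be the faiss database. After deletion, can the documents still be found by search?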
Once a document is deleted it can no longer be found by search. You can give it a try.
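That is not what I see. I deleted the article from the local upload folder, but it can still be retrieved. Am I looking in the wrong folder, or was it not deleted completely?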
There is a remove operation here. Are you sure?
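Whenever I call this interface it keeps reporting an error, so I deleted the documents by hand instead. How is it supposed to be used? How do I find parameters such as the ID?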
Which version of pipelines are you using?
Hello, my paddle information is as follows:
>>> document_store.get_embedding_count()
2
>>> document_store.delete_all_documents()
WARNING - pipelines.document_stores.faiss - DEPRECATION WARNINGS:
1. delete_all_documents() method is deprecated, please use delete_documents method
>>> document_store.delete_documents()
>>> document_store.get_embedding_count()
0
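The session above clears the whole store. The situation described earlier in the thread (a file removed from the upload folder is still retrievable) and the question about finding IDs can be illustrated with a toy model. This is only a sketch: `ToyDocumentStore`, its methods, and the `ids` parameter are hypothetical stand-ins written for illustration, not the paddle-pipelines API, so whether the real `delete_documents` accepts a list of IDs in version 0.6.2 should be checked against the installed source.

```python
import hashlib


class ToyDocumentStore:
    """Hypothetical in-memory stand-in for a document store (not the pipelines API)."""

    def __init__(self):
        self._docs = {}  # document ID -> text copied in at indexing time

    def write_document(self, text):
        # Derive a stable ID from the content (an assumption made for this sketch).
        doc_id = hashlib.md5(text.encode("utf-8")).hexdigest()
        self._docs[doc_id] = text
        return doc_id

    def get_all_documents(self):
        # Listing the store is how the IDs are discovered.
        return dict(self._docs)

    def search(self, keyword):
        # Naive keyword match standing in for vector retrieval.
        return [doc_id for doc_id, text in self._docs.items() if keyword in text]

    def delete_documents(self, ids=None):
        # With no IDs, clear everything; otherwise remove only the listed IDs.
        if ids is None:
            self._docs.clear()
        else:
            for doc_id in ids:
                self._docs.pop(doc_id, None)

    def get_document_count(self):
        return len(self._docs)


# "Upload folder": the source files the index was built from.
source_folder = {"a.txt": "faiss stores vectors", "b.txt": "paddle pipelines demo"}

store = ToyDocumentStore()
ids_by_file = {name: store.write_document(text) for name, text in source_folder.items()}

# Deleting the source file does not touch the copy already held by the store.
del source_folder["a.txt"]
still_found = store.search("faiss")

# Deleting by ID from the store is what removes the document from search.
store.delete_documents(ids=[ids_by_file["a.txt"]])
after_delete = store.search("faiss")
remaining = store.get_document_count()
```

In this model the store keeps its own copy of each document, so removing the file from the folder changes nothing until the matching ID is deleted from the store itself.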
Here is my environment for reference:
absl-py==2.1.0
accessible-pygments==0.0.4
addict==2.4.0
aiofiles==23.2.1
aiohttp==3.8.4
aiosignal==1.3.1
aistudio-sdk==0.1.7
alabaster==0.7.16
aliyun-python-sdk-core==2.15.0
aliyun-python-sdk-kms==2.16.2
altair==5.2.0
annotated-types==0.6.0
anyio==3.7.1
appdirs==1.4.4
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
astor==0.8.1
asttokens==2.4.1
async-timeout==4.0.3
asyncio-atexit==1.0.1
attrdict==2.0.1
attrs==23.2.0
audioread==3.0.1
azure-core==1.29.1
azure-storage-blob==12.19.1
Babel==2.14.0
bce-python-sdk==0.9.4
beautifulsoup4==4.12.3
black==23.3.0
blinker==1.7.0
blis==0.7.11
boilerpy3==1.0.7
Brotli==1.1.0
cachetools==5.3.3
cairocffi==1.6.1
CairoSVG==2.7.1
catalogue==2.0.10
certifi==2024.2.2
cffi==1.16.0
cfgv==3.4.0
charset-normalizer==3.3.2
click==8.0.0
cloudpathlib==0.16.0
cloudpickle==3.0.0
colorama==0.4.6
colorlog==6.8.2
commonmark==0.9.1
confection==0.1.4
contourpy==1.2.0
coverage==7.4.4
crcmod==1.7
cryptography==42.0.5
cssselect==1.2.0
cssselect2==0.7.0
cssutils==2.10.2
cuda-python==12.4.0
cupy-cuda116==10.6.0
cycler==0.12.1
cymem==2.0.8
Cython==3.0.10
data==0.4
dataclasses-json==0.6.4
datasets==2.18.0
decorator==5.1.1
defusedxml==0.7.1
deploy==1.9.1
dill==0.3.4
distlib==0.3.8
docker-pycreds==0.4.0
docutils==0.20.1
einops==0.7.0
elasticsearch==7.11.0
emoji==2.11.0
environs==9.5.0
erniebot==0.5.2
erniebot_agent==0.5.0
et-xmlfile==1.1.0
Events==0.5
exceptiongroup==1.2.0
execnet==2.0.2
executing==2.0.1
faiss-cpu==1.7.4
fast-tokenizer-python==1.0.2
fastapi==0.103.2
fastrlock==0.8.2
ffmpy==0.3.2
filelock==3.13.1
fire==0.6.0
flake8==5.0.4
Flask==2.2.5
flask-babel==4.0.0
fonttools==4.49.0
frozenlist==1.4.1
fsspec==2024.2.0
ftfy==6.2.0
funcsigs==1.0.2
future==1.0.0
gast==0.5.4
gevent==24.2.1
geventhttpclient==2.0.2
gitdb==4.0.11
GitPython==3.1.43
gradio==4.19.2
gradio_client==0.10.1
greenlet==3.0.3
grpcio==1.51.3
h11==0.12.0
h5py==3.10.0
html5lib==1.1
httpcore==0.15.0
httpx==0.25.1
huggingface-hub==0.21.1
hyperopt==0.2.7
identify==2.5.35
idna==3.6
imageio==2.34.0
imagesize==1.4.1
imgaug==0.4.0
importlib_metadata==7.1.0
importlib_resources==6.1.2
iniconfig==2.0.0
ipython==8.22.1
isodate==0.6.1
isort==5.11.5
itsdangerous==2.1.2
jedi==0.19.1
jieba==0.42.1
Jinja2==3.1.3
jmespath==0.10.0
joblib==1.3.2
jsonlines==4.0.0
jsonpatch==1.33
jsonpointer==2.4
jsonschema==4.21.1
jsonschema-path==0.3.2
jsonschema-specifications==2023.12.1
kiwisolver==1.4.5
langchain==0.1.9
langchain-community==0.0.24
langchain-core==0.1.27
langcodes==3.3.0
langdetect==1.0.9
langsmith==0.1.10
lazy-object-proxy==1.10.0
lazy_loader==0.3
librosa==0.10.1
llvmlite==0.42.0
lmdb==1.4.1
loguru==0.5.3
lxml==5.2.0
Markdown==3.5.2
markdown-it-py==3.0.0
MarkupSafe==2.1.5
marshmallow==3.21.0
matplotlib==3.8.3
matplotlib-inline==0.1.6
mccabe==0.7.0
mdurl==0.1.2
minio==7.2.5
mmh3==4.1.0
modelscope==1.13.3
more-itertools==10.2.0
mpmath==1.3.0
msgpack==1.0.8
multidict==6.0.5
multiprocess==0.70.12.2
murmurhash==1.0.10
mypy==1.6.1
mypy-extensions==1.0.0
networkx==3.2.1
nltk==3.8.1
nodeenv==1.8.0
numba==0.59.1
numpy==1.23.5
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.19.3
nvidia-nvjitlink-cu12==12.4.127
nvidia-nvtx-cu12==12.1.105
onnx==1.16.0
openapi-schema-validator==0.6.2
openapi-spec-validator==0.7.1
opencv-contrib-python==4.6.0.66
opencv-contrib-python-headless==4.9.0.80
opencv-python==4.6.0.66
opencv-python-headless==4.9.0.80
openpyxl==3.1.2
opt-einsum==3.3.0
orjson==3.9.15
oss2==2.18.4
packaging==23.2
paddle-pipelines==0.6.2
paddle2onnx==1.1.0
paddlefsl==1.1.0
paddlenlp==2.6.1
paddleocr==2.6.1.3
paddlepaddle==2.6.1
pandas==2.2.1
parameterized==0.9.0
parso==0.8.3
pathable==0.4.3
pathspec==0.12.1
pdf2docx==0.5.8
pdf2image==1.17.0
pdfminer.six==20231228
pdfplumber==0.11.0
pexpect==4.9.0
pillow==10.2.0
platformdirs==4.2.0
pluggy==1.4.0
pooch==1.8.1
pre-commit==3.7.0
premailer==3.10.0
preshed==3.0.9
prompt-toolkit==3.0.43
protobuf==3.20.2
psutil==5.9.8
ptyprocess==0.7.0
pure-eval==0.2.2
py4j==0.10.9.7
pyarrow==15.0.2
pyarrow-hotfix==0.6
pybind11==2.12.0
pyclipper==1.3.0.post5
pycodestyle==2.9.1
pycparser==2.21
pycryptodome==3.20.0
pydantic==1.10.11
pydantic_core==2.16.3
pydata-sphinx-theme==0.15.2
pydub==0.25.1
pyflakes==2.5.0
Pygments==2.17.2
pymilvus==2.4.0
PyMuPDF==1.20.2
pyparsing==3.1.1
pypdf==4.2.0
pypdfium2==4.29.0
pyphen==0.14.0
pytest==8.1.1
pytest-cov==5.0.0
pytest-timeout==2.3.1
pytest-xdist==3.5.0
python-dateutil==2.8.2
python-docx==1.1.0
python-dotenv==1.0.1
python-multipart==0.0.9
python-rapidjson==1.16
pytz==2024.1
PyYAML==6.0.1
rapidfuzz==3.7.0
rarfile==4.1
ray==2.5.1
readthedocs-sphinx-search==0.3.2
recommonmark==0.7.1
referencing==0.31.1
regex==2024.4.16
requests==2.31.0
rfc3339-validator==0.1.4
rich==13.7.0
rouge==1.0.1
rpds-py==0.18.0
ruff==0.2.2
sacremoses==0.1.1
safetensors==0.4.2
scikit-image==0.22.0
scikit-learn==1.4.1.post1
scipy==1.9.1
semantic-version==2.10.0
sentencepiece==0.2.0
sentry-sdk==1.44.0
seqeval==1.2.2
setproctitle==1.3.3
shapely==2.0.3
shellingham==1.5.4
simplejson==3.19.2
six==1.16.0
smart-open==6.4.0
smmap==5.0.1
sniffio==1.3.1
snowballstemmer==2.2.0
sortedcontainers==2.4.0
soundfile==0.12.1
soupsieve==2.5
soxr==0.3.7
spacy==3.7.4
spacy-legacy==3.0.12
spacy-loggers==1.0.5
Sphinx==7.2.6
sphinx-book-theme==1.1.2
sphinx-copybutton==0.5.2
sphinx-markdown-tables==0.0.17
sphinx-rtd-theme==2.0.0
sphinxcontrib-applehelp==1.0.8
sphinxcontrib-devhelp==1.0.6
sphinxcontrib-htmlhelp==2.0.5
sphinxcontrib-jquery==4.1
sphinxcontrib-jsmath==1.0.1
sphinxcontrib-qthelp==1.0.7
sphinxcontrib-serializinghtml==1.1.10
SQLAlchemy==1.4.52
SQLAlchemy-Utils==0.41.2
srsly==2.4.8
sseclient-py==1.7.2
stack-data==0.6.3
starlette==0.27.0
sympy==1.12
tenacity==8.2.3
tensorboard==2.16.2
tensorboard-data-server==0.7.2
tensorboardX==2.6.2.2
termcolor==2.4.0
thinc==8.2.3
threadpoolctl==3.3.0
tifffile==2024.2.12
tiktoken==0.6.0
tinycss2==1.2.1
tokenize-rt==5.2.0
tokenizers==0.15.2
tomli==2.0.1
tomlkit==0.12.0
tool-helpers==0.1.1
toolz==0.12.1
torch==2.2.2
tqdm==4.66.2
traitlets==5.14.1
transformers==4.39.3
triton==2.2.0
tritonclient==2.41.1
typer==0.9.0
types-beautifulsoup4==4.12.0.20240106
types-html5lib==1.1.11.20240228
types-PyYAML==6.0.12.12
types-requests==2.31.0.2
types-urllib3==1.26.25.14
typing-inspect==0.9.0
typing_extensions==4.5.0
tzdata==2024.1
ujson==5.9.0
Unidecode==1.3.8
urllib3==1.26.2
uvicorn==0.27.1
virtualenv==20.25.1
visualdl==2.5.3
wandb==0.16.5
wasabi==1.1.2
wcwidth==0.2.13
weasel==0.3.4
WeasyPrint==52.5
webencodings==0.5.1
websockets==11.0.3
Werkzeug==3.0.1
wget==3.2
wordcloud==1.8.2.2
xxhash==3.4.1
yacs==0.1.8
yapf==0.40.2
yarl==1.9.4
zhon==2.0.2
zipp==3.18.1
zope.event==5.0
zope.interface==6.3
OK, thanks. I will try again. Some parts have no documentation, so I could not follow them.
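The documentation now exists. A PR is being submitted and it is still being improved; contributions from developers are welcome: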
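Please ask your question
After building semantic search with pipelines, how do I delete documents that have already been uploaded? Is there a dedicated interface for this, and how is it used?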