zylon-ai / private-gpt

Interact with your documents using the power of GPT, 100% privately, no data leaks
https://privategpt.dev
Apache License 2.0
53.77k stars, 7.22k forks

How to update pdf and remove pdf? #1124

Status: Closed (KarlTheforest closed this issue 2 months ago)

KarlTheforest commented 11 months ago

As the title says: how do I remove a PDF that has been ingested? Where is it stored? Inside the privateGPT directory?

pabloogc commented 11 months ago

We didn't have time to implement individual document deletion yet. As a workaround you can wipe your local_data folder.
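For anyone who wants to script the workaround, here is a minimal sketch of wiping the folder's contents. It is not part of privateGPT; `wipe_local_data` and its `keep` parameter are hypothetical names, and preserving a `.gitignore` placeholder is an assumption about the checkout, so adjust it to match yours. Stop the application before running it, since everything ingested so far is discarded.

```python
import shutil
from pathlib import Path


def wipe_local_data(
    folder: Path, keep: frozenset[str] = frozenset({".gitignore"})
) -> list[str]:
    """Delete everything inside `folder` except entries named in `keep`.

    The folder itself is left in place. Returns the sorted names of the
    top-level entries that were removed.
    """
    removed: list[str] = []
    if not folder.is_dir():
        return removed  # nothing to wipe
    for entry in folder.iterdir():
        if entry.name in keep:
            continue  # assumption: placeholder files should survive the wipe
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # remove a whole subdirectory tree
        else:
            entry.unlink()  # remove a regular file or a symlink
        removed.append(entry.name)
    return sorted(removed)
```

Calling `wipe_local_data(Path("local_data"))` from the project root would then clear all stored data, after which the documents you still want have to be ingested again.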

I will keep the issue open, but keep an eye on the releases: we will most likely add this feature soon.

However, if you feel like contributing to the project you can do so:

namp commented 11 months ago

Another problem is that if something goes wrong during a folder ingestion (scripts/ingest_folder.py), for example if parsing of an individual document fails, then running ingest_folder.py again does not check which documents were already processed. It ingests everything again from the beginning, so the already processed documents are probably inserted twice.

imartinez commented 11 months ago

Good point, that is also in the roadmap. Feel free to propose the improvement as a PR.

maozdemir commented 11 months ago

> Another problem is that if something goes wrong during a folder ingestion (scripts/ingest_folder.py), for example if parsing of an individual document fails, then running ingest_folder.py again does not check which documents were already processed. It ingests everything again from the beginning, so the already processed documents are probably inserted twice.

The original implementation in LangChain is supposed to handle that for you: it only stores a document if the source and the stored vectors are not the same. As far as I know this leads to keeping out-of-date information, though I'd not be surprised if that's handled too.
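To illustrate the idea of skipping documents that were already processed, here is a rough sketch that records a content hash per file in a JSON manifest and only ingests files whose hash is new or changed. This is not how privateGPT or LangChain implement it; `ingest_new_files`, the manifest file, and the `ingest` callback are hypothetical names used only for illustration.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def ingest_new_files(
    folder: Path, manifest_path: Path, ingest: Callable[[Path], None]
) -> list[str]:
    """Call `ingest` only for files whose digest is not in the manifest.

    `ingest` is a hypothetical callback standing in for the real ingestion
    step. Returns the relative paths ingested during this run.
    """
    manifest = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    ingested: list[str] = []
    for path in sorted(p for p in folder.rglob("*") if p.is_file()):
        key = str(path.relative_to(folder))
        digest = file_digest(path)
        if manifest.get(key) == digest:
            continue  # unchanged since the last successful ingestion
        try:
            ingest(path)
        except Exception as exc:
            # One bad document should not abort the whole run.
            print(f"skipping {key}: {exc}")
            continue
        manifest[key] = digest
        # Persist after every file so an interrupted run can resume.
        manifest_path.write_text(json.dumps(manifest, indent=2))
        ingested.append(key)
    return ingested
```

Keep the manifest outside the ingested folder so it is not picked up as a document. Note that a changed file is ingested again but its old vectors are not removed here, which is exactly the out-of-date information concern; handling that needs per-document deletion in the store.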

imartinez commented 11 months ago

@lopagela is working on this at the moment

lopagela commented 11 months ago

The PR has been merged: https://github.com/imartinez/privateGPT/pull/1163

lee-jian-hui commented 10 months ago

Please close this, thanks.