opensemanticsearch / open-semantic-search

Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (Integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, search user interface & search apps for fulltext search, faceted search & knowledge graph)
https://opensemanticsearch.org
GNU General Public License v3.0
979 stars 169 forks

Bug in indexing after file deletion #270

Open EvgeniaPatsoni opened 4 years ago

EvgeniaPatsoni commented 4 years ago

I am using Open Semantic Desktop Search and have mounted a remote directory, which is indexed correctly. However, each time I delete a file in the remote directory, it keeps showing up in the index, even after re-indexing or restarting the VM. Of course, if I click on the file I get a Not Found error. How can I resolve this issue?

Thanks

mosea3 commented 4 years ago

Hi Evgenia,

I back you up on this, +1. I use a much earlier version of OSS and I see the same behaviour. OSS does additive reindexing only, not a completely differential one, so entries for deleted files stay in the index.

As I'm not into Python but web development, I implemented a delete button (an HTTP API call, as documented) to delete the entry.

To keep the index consistently up to date, you can work around it like this: schedule a job that empties the index and then reindexes everything.
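For readers who want to script the single-entry delete, here is a minimal sketch. It does not reproduce the delete button mentioned above. Instead it talks to the underlying Apache Solr index through Solr's JSON update handler. The Solr address, the core name `opensemanticsearch`, and the idea that a document's id is its file URI are all assumptions, so check them against your own installation first.

```python
import json
import urllib.request

# Assumptions (not confirmed in this thread): Solr address and core name.
SOLR_URL = "http://localhost:8983/solr"
CORE = "opensemanticsearch"


def build_delete_request(doc_id, solr_url=SOLR_URL, core=CORE):
    """Build a Solr JSON update request that deletes one document by id."""
    body = json.dumps({"delete": {"id": doc_id}}).encode("utf-8")
    return urllib.request.Request(
        f"{solr_url}/{core}/update?commit=true",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def delete_entry(doc_id):
    """Send the delete request; raises urllib.error.URLError on failure."""
    with urllib.request.urlopen(build_delete_request(doc_id)) as response:
        return response.status
```

Calling `delete_entry("file:///mnt/remote/report.pdf")` (a hypothetical id) would remove that one entry and commit the change, provided the id matches what is stored in the index.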

Best regards

Andy

rusty9283 commented 4 years ago

Hello, I have the same problem. I mentioned it in this thread: https://github.com/opensemanticsearch/open-semantic-search/issues/197

But I can't clear the index and reindex everything because we have thousands of files ...
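When a full reindex is too expensive, one possible alternative is to prune only the stale entries: take the ids already in the index, check which ones point to files that no longer exist, and delete just those. The sketch below covers the checking step only. It assumes that indexed ids are `file://` URIs, which is not confirmed in this thread, and the helper names are hypothetical.

```python
import os
from urllib.parse import unquote, urlparse


def uri_to_path(doc_id):
    """Return the local path for a file:// URI, or None for any other id."""
    parsed = urlparse(doc_id)
    if parsed.scheme != "file":
        return None
    return unquote(parsed.path)


def find_stale_ids(doc_ids, exists=os.path.exists):
    """Return the ids whose file:// path no longer exists on disk."""
    stale = []
    for doc_id in doc_ids:
        path = uri_to_path(doc_id)
        # Ids that are not file URIs (e.g. web pages) are left alone.
        if path is not None and not exists(path):
            stale.append(doc_id)
    return stale
```

The resulting list could then be sent to Solr's JSON update handler as `{"delete": [...]}`. One caution: if the remote directory is not mounted when this runs, every file will look deleted, so it is worth checking that the mount point is present before pruning anything.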

Thanks!

danielrosero commented 2 years ago

Hi @rusty9283, did you manage to solve this issue? Regards!