Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (Integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, search user interface & search apps for fulltext search, faceted search & knowledge graph)
Hi guys,
I seem to have issues with the latest OSS release (04.2024): it doesn't seem to do real full-text search.
My use case: index PDF documents with an OCR text layer into OSS. Tika extracts the content and the ETL Python program sends it to Solr, into the document field content_txt.
Sadly, searching for part of the text in content_txt doesn't return the document, even for a literal match. Searching for main document attributes (title) works.
Is it possible to modify the Solr request so that I can also find documents by any string match? Right now, by default, the request seems to be just the bare content of $_GET['q'], i.e. the q value of the search field.
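To illustrate the kind of request meant here, a minimal sketch in Python that builds a Solr select URL querying content_txt explicitly as a quoted phrase. It assumes the edismax query parser is available; the base URL, the core name, the field name title_txt and the function name build_solr_query_url are placeholders for illustration, not taken from the OSS code.

```python
from urllib.parse import urlencode


def build_solr_query_url(
    user_query,
    # Hypothetical Solr endpoint and core name; adjust to the actual installation.
    base_url="http://localhost:8983/solr/opensemanticsearch/select",
):
    """Build a Solr select URL that searches the user input as a literal phrase."""
    # Escape backslashes and double quotes so the input stays inside one phrase.
    escaped = user_query.replace("\\", "\\\\").replace('"', '\\"')
    params = {
        "defType": "edismax",          # extended DisMax query parser
        "q": f'"{escaped}"',           # quoted, so Solr treats it as a phrase query
        "qf": "title_txt content_txt"  # fields to search; title_txt is an assumed name
    }
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_solr_query_url("some text from the OCR layer"))
```

Note that a phrase query like this matches whole tokens in sequence. Matching an arbitrary substring inside a word would probably need wildcards or an n-gram field type in the schema instead.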
Thanks for your help!