opensemanticsearch / open-semantic-search

Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (Integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, search user interface & search apps for fulltext search, faceted search & knowledge graph)
https://opensemanticsearch.org
GNU General Public License v3.0

Working example for importing and applying an ontology #131

Open user121216 opened 5 years ago

user121216 commented 5 years ago

Hello,

I'm trying to load a small ontology and apply it to a prepared text file, but it's not working. The text file is indexed and searchable by full text, but no interactive facet appears.

Could someone provide a working example that shows what is possible with the current implementation, e.g. a pizza with a title and some credentials?

Thanks in advance!

PS: I want to annotate some HTML files to make them searchable by facets, e.g. an <h1> tag with a defined OWL class. Later it should be possible to search by text, e.g. pizzas with onions. :)

YoannMR commented 5 years ago

Hello,

I'd be curious to know if you are able to make it work with an ontology.

We tried using ontologies but had to give up (it only looks for a subset of the synonym keywords, and the hierarchy is somehow flattened).

Instead, we created a list of keywords from an ontology and imported it as a text file to tag documents. If you care about synonyms, that can also be done with a text file, with some minor modifications to OSS's code.
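To illustrate the keyword-list approach, here is a minimal Python sketch that pulls labels out of an ontology and formats them as a plain text list. It is not the code we used. It assumes the ontology is serialized as RDF/XML, that the keywords live in rdfs:label elements, and that the text file should contain one keyword per line. The function names and the sample ontology are made up for the example.

```python
import xml.etree.ElementTree as ET

# Fully qualified tag name of rdfs:label as ElementTree reports it.
RDFS_LABEL = "{http://www.w3.org/2000/01/rdf-schema#}label"


def extract_labels(rdf_xml: str) -> list[str]:
    """Return the sorted, de-duplicated rdfs:label values found in an RDF/XML string."""
    root = ET.fromstring(rdf_xml)
    labels = {
        el.text.strip()
        for el in root.iter(RDFS_LABEL)
        if el.text and el.text.strip()
    }
    return sorted(labels)


def keyword_list_text(labels: list[str]) -> str:
    """Format labels as a plain text list, one keyword per line (assumed file format)."""
    return "".join(label + "\n" for label in labels)


# Hypothetical sample ontology, only for demonstration.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/greek#Alpha">
    <rdfs:label>alpha</rdfs:label>
  </owl:Class>
  <owl:Class rdf:about="http://example.org/greek#Beta">
    <rdfs:label>beta</rdfs:label>
  </owl:Class>
</rdf:RDF>"""

if __name__ == "__main__":
    print(keyword_list_text(extract_labels(SAMPLE)), end="")
```

The resulting text can be saved to a file and imported as a keyword list. Note that this flattens the ontology on purpose: class hierarchy and synonyms are dropped.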

Let me know how it goes for you. Thanks

user121216 commented 5 years ago

So it's not yet possible to add custom ontology classes as facets, like the existing ones (Person, Organizations, Content type)?

I tried adding a list of keywords, but my tag list is still empty. Another hint would be appreciated.

Thanks so far.

YoannMR commented 5 years ago

Here is what worked for me on an Ubuntu 18.04 EC2 instance:

Adding an ontology that way should automatically tag your documents, and you should now see a facet named "greek_letter.txt" with the tagged keywords appearing there. You can also add a snippet to see the tagged words under each document (see screenshot below).

If your documents are not tagged, you could try reindexing them from the command line (-f forces reindexing, -v enables verbose output so you can see what's going on): opensemanticsearch-index-file /test_folder -f -v

image

I hope that helps!

user121216 commented 5 years ago

Thank you! It works.

I made a fundamental mistake. :) I added the keywords to the HTML tags instead of to the readable text.
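For anyone hitting the same problem, here is a small Python sketch for checking which keywords actually appear in the readable text of an HTML snippet, as opposed to tag names or attributes. It only uses the standard library and does not reproduce how Open Semantic Search extracts text. The helper names are made up, and the matching is a simple case-insensitive substring test.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def visible_text(html: str) -> str:
    """Return the readable text of an HTML string with whitespace normalized."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def keywords_in_text(html: str, keywords: list[str]) -> list[str]:
    """Return the keywords that occur in the readable text (case-insensitive substring match)."""
    text = visible_text(html).lower()
    return [k for k in keywords if k.lower() in text]


if __name__ == "__main__":
    # "alpha" is only an attribute value here, so it is not part of the readable text.
    page = '<h1 class="alpha">Pizza with beta</h1>'
    print(keywords_in_text(page, ["alpha", "beta"]))
```

A keyword that only shows up in an attribute or a tag name is not returned, which matches the mistake described above.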

Now I will play around a little bit.

It would be nice if somebody could help me with real ontologies. Otherwise the issue can be closed.