opensemanticsearch / open-semantic-search

Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (Integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, search user interface & search apps for fulltext search, faceted search & knowledge graph)
https://opensemanticsearch.org
GNU General Public License v3.0
968 stars 169 forks

Procedure to add a regex and facet #339

Open LoZio opened 3 years ago

LoZio commented 3 years ago

Hi, I'm trying to add regex parsing and a facet, but I found no documentation on how to do it. I only found #248 and tried to infer the procedure from that, but I can't get it working.

First, I created my own TSV regex file, put it into regex\myregex.tsv, and referenced the file in etl-custom. I created a simple credit card regex, [0-9]{4}\-?[0-9]{4}\-?[0-9]{4}\-?[0-9]{4}, and a test file that matches it in every possible way. I assumed this was enough, since it is the only part I found documented.

However, the issue above says its author needed to create facets and groups using the Django interface. I created a new facet identical to iban_ss, called credicard_ss. I also modified the facets file with my facet:

config['facets']['creditcard_ss'] = {'label': 'Credit card', 'uri': '', 'facet_limit': '20', 'snippets_limit': '10',}

I restarted the service and indexed a document that contains credit card numbers. The verbose log contains:

Checking regex [0-9]{4}\-?[0-9]{4}\-?[0-9]{4}\-?[0-9]{4} creditcard_ss for facet tag_ss

so I'm sure my regex is getting picked up, BUT:

1) creditcard_ss is not found/tied to the document (the number is seen as a phone number)
2) the facet selector does not appear at all

What am I missing here? Is there a complete document describing the procedure? Also, #248 mentions groups, but any attempt to create a group leads to an error:

AttributeError at /admin/thesaurus/group/add/
'Group' object has no attribute 'title'

I need to create dozens of regexes... I'm using the deb install on Ubuntu 20.04. Thank you
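For anyone reproducing this, the regex and the rule line can be checked outside the indexer with a few lines of Python. The sketch below is not Open Semantic Search code: `parse_rule` and `find_matches` are hypothetical helpers, and the rule format (`regex<TAB>facet`, with `tag_ss` as the fallback facet) is an assumption inferred from the log line in the post, where the pattern and `creditcard_ss` appear together and the facet is reported as `tag_ss`. One possible reading, unconfirmed in this thread, is that the two columns in the file were separated by spaces rather than a real tab character.

```python
import re

# The credit card pattern from the post, unchanged.
CREDIT_CARD = r"[0-9]{4}\-?[0-9]{4}\-?[0-9]{4}\-?[0-9]{4}"

# Assumption: fallback facet name, taken from the log line in the post.
DEFAULT_FACET = "tag_ss"


def parse_rule(line, default_facet=DEFAULT_FACET):
    """Split one rule line into (regex, facet) on tab characters.

    Hypothetical helper: the tab-separated layout is an assumption,
    not the confirmed format of the project's regex list files.
    """
    columns = line.rstrip("\n").split("\t")
    regex = columns[0]
    facet = columns[1] if len(columns) > 1 and columns[1] else default_facet
    return regex, facet


def find_matches(regex, text):
    """Return every substring of text matched by regex, in order."""
    return [m.group(0) for m in re.finditer(regex, text)]


# The same rule written with a real tab and with a plain space.
tab_rule = CREDIT_CARD + "\t" + "creditcard_ss"
space_rule = CREDIT_CARD + " " + "creditcard_ss"

sample = "Card 1234-5678-9012-3456 and 1234567890123456"

for rule in (tab_rule, space_rule):
    regex, facet = parse_rule(rule)
    print(repr(regex), "->", facet, find_matches(regex, sample))
```

With a real tab, the pattern and the facet name separate cleanly and both sample numbers match. With a space, the facet name becomes part of the pattern, the fallback facet is used, and nothing matches. Opening the TSV file in an editor that shows whitespace characters is a cheap way to rule this out.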

schneipk commented 3 years ago

I've got a similar (at least how it manifests) problem using the "group manager" in the django admin interface. Trying to add a group results in the same exception

AttributeError at /admin/thesaurus/group/add/

'Group' object has no attribute 'title'

urosch commented 3 years ago

Same problem here. Is there any solution so far?

chrbratt commented 3 years ago

Same here. AttributeError 'Group' object has no attribute 'title'

urosch commented 2 years ago

Still no development? The same problem persists... Is there any solution? Does anyone know what causes this error?