blazegraph / database

Blazegraph High Performance Graph Database
GNU General Public License v2.0

Full text search using approximate string matching #184

Open fbuciuni90 opened 3 years ago

fbuciuni90 commented 3 years ago

I'm using Blazegraph 2.1.5 and I'd like to perform full-text search using the http://www.bigdata.com/rdf/search# service.

Considering the following triple

<http://example.org/data/1> rdfs:label  "Available" .

I do the query

prefix bds: <http://www.bigdata.com/rdf/search#>
select ?s ?p ?o
where {
  ?o bds:search "avalable" .
  ?s ?p ?o .
}

expecting a result containing the triple. However, I get no results. Basically, I would like to enable something like fuzzy string searching. I guess it could be related to the analyzer configuration, but I haven't been able to figure it out. Do you have any suggestions?

Thanks a lot.
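
To make "approximate string matching" concrete, here is a minimal Python sketch of the behaviour being asked for. It is not Blazegraph code and says nothing about how bds:search is implemented; the helper names `levenshtein` and `fuzzy_match` and the one-edit threshold are assumptions chosen for illustration.

```python
def levenshtein(a, b):
    """Return the edit distance between a and b (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca from a
                current[j - 1] + 1,      # insert cb into a
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def fuzzy_match(query, literal, max_distance=1):
    """Hypothetical helper: case-insensitive match within max_distance edits."""
    return levenshtein(query.lower(), literal.lower()) <= max_distance


# The misspelled query is one edit (a missing "i") away from the stored label.
print(levenshtein("avalable", "available"))
print(fuzzy_match("avalable", "Available"))
```

An exact comparison of "avalable" with "available" fails, while a tolerance of one edit accepts it, which is the result the query above was hoping for.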

beebs-systap commented 3 years ago

Can you post your namespace configuration? You need to make sure com.bigdata.rdf.store.AbstractTripleStore.textIndex=true is set for text indexing to be enabled.

https://github.com/blazegraph/database/wiki/FullTextSearch
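
One way to check this on an exported namespace configuration is sketched below in Python. The parsing is a simplified reading of the key=value properties format, and the function names are hypothetical; only the property key comes from the comment above. A missing key is reported as "not explicitly enabled", since the sketch makes no assumption about the server default.

```python
TEXT_INDEX_KEY = "com.bigdata.rdf.store.AbstractTripleStore.textIndex"


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # or ! comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def text_index_explicitly_enabled(props):
    """True only if the textIndex key is present and set to true."""
    return props.get(TEXT_INDEX_KEY, "").lower() == "true"


sample = """
com.bigdata.rdf.store.AbstractTripleStore.textIndex=true
com.bigdata.rdf.store.AbstractTripleStore.quads=true
"""
print(text_index_explicitly_enabled(parse_properties(sample)))
```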

fbuciuni90 commented 3 years ago

My namespace configuration is the following:

com.bigdata.rdf.store.AbstractTripleStore.textIndex=true
com.bigdata.rdf.store.AbstractTripleStore.axiomsClass=com.bigdata.rdf.axioms.NoAxioms
com.bigdata.rdf.sail.isolatableIndices=true
com.bigdata.rdf.sail.truthMaintenance=false
com.bigdata.rdf.store.AbstractTripleStore.justify=false
com.bigdata.rdf.sail.namespace=fst-test
com.bigdata.rdf.store.AbstractTripleStore.quads=true
com.bigdata.namespace.fst-test.lex.com.bigdata.btree.BTree.branchingFactor=400
com.bigdata.journal.Journal.groupCommit=false
com.bigdata.namespace.fst-test.spo.com.bigdata.btree.BTree.branchingFactor=1024
com.bigdata.rdf.store.AbstractTripleStore.geoSpatial=true
com.bigdata.rdf.store.AbstractTripleStore.statementIdentifiers=false

I also tried this configuration:

com.bigdata.namespace.fst-test.lex.com.bigdata.btree.BTree.branchingFactor=400
com.bigdata.rdf.store.AbstractTripleStore.textIndex=true
com.bigdata.namespace.fst-test.spo.com.bigdata.btree.BTree.branchingFactor=1024
com.bigdata.rdf.store.AbstractTripleStore.axiomsClass=com.bigdata.rdf.axioms.NoAxioms
com.bigdata.rdf.sail.isolatableIndices=true
com.bigdata.rdf.sail.truthMaintenance=false
com.bigdata.rdf.store.AbstractTripleStore.justify=false
com.bigdata.rdf.sail.namespace=fst-test
com.bigdata.rdf.store.AbstractTripleStore.quads=true
com.bigdata.journal.Journal.groupCommit=false
com.bigdata.rdf.store.AbstractTripleStore.geoSpatial=true
com.bigdata.rdf.store.AbstractTripleStore.statementIdentifiers=false
com.bigdata.search.FullTextIndex.analyzerFactoryClass=com.bigdata.search.ConfigurableAnalyzerFactory
com.bigdata.search.ConfigurableAnalyzerFactory.analyzers._.analyzer=org.apache.lucene.analysis.miscellaneous.PatternAnalyzer
com.bigdata.search.ConfigurableAnalyzerFactory.analyzers._.pattern="."
com.bigdata.search.ConfigurableAnalyzerFactory.analyzers._.stopwords=none

but it seems to have the same effect as the first one.

Thanks.
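
For readers comparing the two configurations, the following Python sketch lists which property keys the second one adds. The variable and function names are illustrative, and the key extraction is a simplification that only splits each line on the first "=".

```python
def property_keys(text):
    """Return the set of keys from simple key=value lines."""
    return {line.split("=", 1)[0].strip()
            for line in text.splitlines() if "=" in line}


first = """
com.bigdata.rdf.store.AbstractTripleStore.textIndex=true
com.bigdata.rdf.store.AbstractTripleStore.axiomsClass=com.bigdata.rdf.axioms.NoAxioms
com.bigdata.rdf.sail.isolatableIndices=true
com.bigdata.rdf.sail.truthMaintenance=false
com.bigdata.rdf.store.AbstractTripleStore.justify=false
com.bigdata.rdf.sail.namespace=fst-test
com.bigdata.rdf.store.AbstractTripleStore.quads=true
com.bigdata.namespace.fst-test.lex.com.bigdata.btree.BTree.branchingFactor=400
com.bigdata.journal.Journal.groupCommit=false
com.bigdata.namespace.fst-test.spo.com.bigdata.btree.BTree.branchingFactor=1024
com.bigdata.rdf.store.AbstractTripleStore.geoSpatial=true
com.bigdata.rdf.store.AbstractTripleStore.statementIdentifiers=false
"""

# The second configuration repeats the first and appends the analyzer settings.
second = first + """
com.bigdata.search.FullTextIndex.analyzerFactoryClass=com.bigdata.search.ConfigurableAnalyzerFactory
com.bigdata.search.ConfigurableAnalyzerFactory.analyzers._.analyzer=org.apache.lucene.analysis.miscellaneous.PatternAnalyzer
com.bigdata.search.ConfigurableAnalyzerFactory.analyzers._.pattern="."
com.bigdata.search.ConfigurableAnalyzerFactory.analyzers._.stopwords=none
"""

added = sorted(property_keys(second) - property_keys(first))
removed = sorted(property_keys(first) - property_keys(second))
for key in added:
    print("added:", key)
```

The only difference is the four com.bigdata.search.* keys that select and configure the analyzer; no key from the first configuration is dropped.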