chenejac / VIVOTestMigration


VIVO-1423: Consider ElasticSearch as a means to improve search and faceted search #1311

Open chenejac opened 6 years ago

chenejac commented 6 years ago

Mike Conlon (Migrated from VIVO-1423) said:

Based on the work of Colorado, consider how/if ElasticSearch can be used to improve search in Vitro and VIVO.

chenejac commented 6 years ago

Don Elsborg said:

Notes provided by Jim Blake:

There's a collection of interfaces (and two exception classes) here:

https://github.com/vivo-project/Vitro/tree/develop/api/src/main/java/edu/cornell/mannlib/vitro/webapp/modules/searchEngine

The Solr implementation is here: https://github.com/vivo-project/Vitro/tree/develop/api/src/main/java/edu/cornell/mannlib/vitro/webapp/searchengine/solr

It relies heavily on the "base" implementation, and perhaps the Elastic implementation could rely on it also. The "base" is here: https://github.com/vivo-project/Vitro/tree/develop/api/src/main/java/edu/cornell/mannlib/vitro/webapp/searchengine/base

chenejac commented 6 years ago

Don Elsborg said:

The goal is to create a document that is useful for front-end developers.

At Colorado we use:

It's a nested document that should contain most of the summary information a developer needs to consume. New work would be to use JSON-LD and semantic keys.

So doi would become BIBO:doi, and so on. For the complex relationships that VIVO-ISF represents, I suggest we use schema.org; hence author would become schema:author.
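A minimal sketch of what a nested document with semantic keys might look like. This is not the Colorado document format: the `build_publication_doc` helper, the `schema:name` key, and all example values are assumptions made for illustration. Only the DOI and author keys come from the comment above, written here with a lowercase `bibo` prefix.

```python
import json

# Namespace IRIs for the prefixes used as keys below.
CONTEXT = {
    "bibo": "http://purl.org/ontology/bibo/",
    "schema": "http://schema.org/",
}


def build_publication_doc(uri, title, doi, authors):
    """Build a nested, JSON-LD-style summary document (hypothetical helper)."""
    return {
        "@context": CONTEXT,
        "@id": uri,
        "schema:name": title,  # assumed key, not named in the thread
        "bibo:doi": doi,
        # Authors are nested objects rather than flattened label strings.
        "schema:author": [
            {"@id": author_uri, "schema:name": name}
            for author_uri, name in authors
        ],
    }


# Example values are made up for illustration.
doc = build_publication_doc(
    "http://example.org/individual/pub1",
    "An Example Publication",
    "10.0000/example",
    [("http://example.org/individual/per1", "Example, Author")],
)
print(json.dumps(doc, indent=2))
```

Keeping each author as an object with its own `@id` lets a front end link to the person's profile without a second lookup, which a flat list of name strings would not allow.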

chenejac commented 6 years ago

Don Elsborg said:

VIVO already has a config file that maps SPARQL query results to index fields, see:

So, here's an example from the config file:

# When indexing a grant, add administrating org.
#
:vivodocumentModifier_addAdminToGrant
    a <java:edu.cornell.mannlib.vitro.webapp.searchindex.documentBuilding.SelectQueryDocumentModifier> ,
      <java:edu.cornell.mannlib.vitro.webapp.searchindex.documentBuilding.DocumentModifier> ;
    rdfs:label "Add Administrator to Grant" ;
    :hasTypeRestriction "http://vivoweb.org/ontology/core#Grant" ;
    :hasTypeRestriction "http://vivoweb.org/ontology/core#Contract" ;
    :hasTypeRestriction "http://scholars.cornell.edu/ontology/ospcu.owl#CooperativeAgreement" ;
    :hasTargetField "administrator_txt" ;
    :hasTargetField "administrator_ss" ;
    :hasSelectQuery """
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        PREFIX foaf: <http://xmlns.com/foaf/0.1/>
        PREFIX vivo: <http://vivoweb.org/ontology/core#>
        SELECT DISTINCT ?admLabel
        WHERE
        {
            ?uri vivo:relates ?adm .
            ?adm a foaf:Organization .
            ?adm rdfs:label ?admLabel .
        }
        """ .

 

Hence a middleware mechanism needs to be designed that maps each field to the appropriate place in the document structure. This can be a challenge, since Elasticsearch documents are hierarchical and Solr documents aren't.
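One way such a mapping layer could work, sketched under stated assumptions: flat target fields like those in the config example are folded into a nested document using a lookup table of dotted paths. The `FIELD_PATHS` table, the `to_nested` function, the `administrator.label` path, and the `title_txt` field are all hypothetical; nothing here is an existing Vitro API.

```python
# Hypothetical mapping from flat Solr-style target fields to dotted paths
# in a nested document; the paths are assumptions, not an agreed schema.
FIELD_PATHS = {
    "administrator_txt": "administrator.label",
    "administrator_ss": "administrator.label",
}


def to_nested(flat_fields, field_paths=FIELD_PATHS):
    """Fold flat field/value lists into one nested dict."""
    doc = {}
    for field, values in flat_fields.items():
        path = field_paths.get(field)
        if path is None:
            # Unmapped fields stay at the top level under their own name.
            doc.setdefault(field, []).extend(values)
            continue
        *parents, leaf = path.split(".")
        node = doc
        for key in parents:
            # Walk down the path, creating intermediate objects as needed.
            node = node.setdefault(key, {})
        bucket = node.setdefault(leaf, [])
        for value in values:
            # Fields mapped to the same path may repeat values; keep one copy.
            if value not in bucket:
                bucket.append(value)
    return doc


# Example input is made up; "title_txt" is an unmapped, hypothetical field.
flat = {
    "administrator_txt": ["Example Department"],
    "administrator_ss": ["Example Department"],
    "title_txt": ["Example Grant"],
}
nested = to_nested(flat)
print(nested)
```

In this sketch the two Solr fields collapse into a single nested value, because the `_txt`/`_ss` split expresses analyzed versus exact-match indexing, which Elasticsearch would more likely handle in the index mapping than in the document itself. That choice is an assumption about the target design.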