Open chenejac opened 6 years ago
Don Elsborg said:
Notes provided by Jim Blake:
There's a collection of interfaces (and two exception classes) here:
The Solr implementation is here: https://github.com/vivo-project/Vitro/tree/develop/api/src/main/java/edu/cornell/mannlib/vitro/webapp/searchengine/solr
It relies heavily on the "base" implementation, and perhaps the Elastic implementation could rely on it also. The "base" is here: https://github.com/vivo-project/Vitro/tree/develop/api/src/main/java/edu/cornell/mannlib/vitro/webapp/searchengine/base
Don Elsborg said:
The goal is to create an index document that is useful for front-end developers.
At Colorado we use:
{
  _index: "webex-rc1",
  _type: "publication",
  _id: "27040",
  _score: 1,
  _source: {
    publicationYear: "2012",
    doi: "10.1111/j.1467-9361.2012.00673.x",
    name: "Climate Change and Roads: A Dynamic Stressor-Response Model",
    amscore: 3,
    authors: [
      {
        orcid: "0000-0003-2786-6675",
        organization: [
          { uri: "https://experts.colorado.edu/individual/deptid_10036",
            name: "Undergraduate Education" },
          { uri: "https://experts.colorado.edu/individual/deptid_10331",
            name: "Civil, Environmental and Architectural Engineering" }
        ],
        researchArea: [
          { uri: "https://experts.colorado.edu/individual/spinId_1001008",
            name: "Climate Change" },
          { uri: "https://experts.colorado.edu/individual/spinId_0608005",
            name: "Engineering Project Management" },
          { uri: "https://experts.colorado.edu/individual/spinId_0606000",
            name: "Civil Engineering" }
        ],
        uri: "https://experts.colorado.edu/individual/fisid_125496",
        name: "Chinowsky, Paul"
      }
    ],
    mostSpecificType: "Journal Article",
    pubId: "27040",
    publishedIn: {
      uri: "https://experts.colorado.edu/individual/journal_185432",
      name: "Review of Development Economics"
    },
    uri: "https://experts.colorado.edu/individual/pubid_27040",
    publicationDate: "2012-08-01"
  }
}
It's a nested document that should contain most of the information a developer needs for summary displays. New work would be to use JSON-LD and semantic keys.
So doi would become BIBO:doi, and so on. For the complex relationships that VIVO-ISF represents, I suggest we use schema.org. Hence author would become schema:author.
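To make the JSON-LD idea concrete, here is a minimal sketch in Python of how an existing index document could be rewritten with a context and semantic keys. Only the two renames mentioned above (doi and author) are applied; the `KEY_MAP` and `CONTEXT` tables, the function name `to_jsonld`, the lowercase `bibo` prefix, and the choice to reuse `uri` as the JSON-LD `@id` are assumptions made for illustration, not an agreed design.

```python
import json

# Assumed prefix declarations for the JSON-LD context.
CONTEXT = {
    "bibo": "http://purl.org/ontology/bibo/",
    "schema": "http://schema.org/",
}

# Hypothetical key mapping: only the two renames suggested in this thread.
KEY_MAP = {
    "doi": "bibo:doi",
    "authors": "schema:author",
}


def to_jsonld(source):
    """Return a copy of an index document with an @context and semantic keys."""
    # Assumption: the document's own URI serves as its JSON-LD identifier.
    doc = {"@context": CONTEXT, "@id": source["uri"]}
    for key, value in source.items():
        if key == "uri":
            continue  # already emitted as @id
        # Keys without a mapping are copied unchanged in this sketch.
        doc[KEY_MAP.get(key, key)] = value
    return doc


# A trimmed copy of the Colorado publication document shown above.
publication = {
    "uri": "https://experts.colorado.edu/individual/pubid_27040",
    "doi": "10.1111/j.1467-9361.2012.00673.x",
    "name": "Climate Change and Roads: A Dynamic Stressor-Response Model",
    "authors": [
        {
            "uri": "https://experts.colorado.edu/individual/fisid_125496",
            "name": "Chinowsky, Paul",
        }
    ],
}

print(json.dumps(to_jsonld(publication), indent=2))
```

Keys that are left unmapped (name, for example) have no definition in the context, so a JSON-LD processor would ignore them until they are given semantic names as well.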
Don Elsborg said:
VIVO has a config file already that maps sparql query results to index fields, see:
So, here's an example from the config file:
#
# When indexing a grant, add administrating org.
#
:vivodocumentModifier_addAdminToGrant
    a   <java:edu.cornell.mannlib.vitro.webapp.searchindex.documentBuilding.SelectQueryDocumentModifier> ,
        <java:edu.cornell.mannlib.vitro.webapp.searchindex.documentBuilding.DocumentModifier> ;
    rdfs:label "Add Administrator to Grant" ;
    :hasTypeRestriction "http://vivoweb.org/ontology/core#Grant" ;
    :hasTypeRestriction "http://vivoweb.org/ontology/core#Contract" ;
    :hasTypeRestriction "http://scholars.cornell.edu/ontology/ospcu.owl#CooperativeAgreement" ;
    :hasTargetField "administrator_txt" ;
    :hasTargetField "administrator_ss" ;
    :hasSelectQuery """
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        PREFIX foaf: <http://xmlns.com/foaf/0.1/>
        PREFIX vivo: <http://vivoweb.org/ontology/core#>
        SELECT DISTINCT ?admLabel
        WHERE
        {
            ?uri vivo:relates ?adm .
            ?adm a foaf:Organization .
            ?adm rdfs:label ?admLabel .
        }
        """ .
Hence a middleware mechanism needs to be designed that maps each target field to the appropriate place in the document structure. This can be a challenge, since Elasticsearch documents are hierarchical and Solr documents are flat.
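One possible shape for such a mapping layer, sketched in Python under stated assumptions: a table that assigns each flat Solr-style field name a dotted path in the nested document. The table `FIELD_PATHS`, the function `nest`, the path `administrator.name`, and the sample value are all hypothetical; only the field names `administrator_ss` and `administrator_txt` come from the config above.

```python
# Hypothetical table: flat index field name -> dotted path in the nested document.
FIELD_PATHS = {
    "administrator_ss": "administrator.name",
}


def nest(flat_doc, field_paths):
    """Build a nested document from a flat one using a field-to-path table."""
    nested = {}
    for field, value in flat_doc.items():
        path = field_paths.get(field)
        if path is None:
            continue  # fields without a mapping are dropped in this sketch
        *parents, leaf = path.split(".")
        node = nested
        for part in parents:
            # Create intermediate objects on demand.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


# Made-up sample values; a real flat document would come from the document modifiers.
flat = {
    "administrator_ss": ["Example Organization"],
    "administrator_txt": ["Example Organization"],
}

print(nest(flat, FIELD_PATHS))
```

A table like this keeps the existing SPARQL-to-field config unchanged and confines the Solr/Elasticsearch difference to one place. It does not address values that must stay grouped (for example, pairing each organization's uri with its name), which would need a richer mapping than a single path per field.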
This issue addresses #732.
This issue is related to #749.
Mike Conlon (Migrated from VIVO-1423) said:
Based on the work at Colorado, consider whether and how Elasticsearch can be used to improve search in Vitro and VIVO.