psychoinformatics-de / shacl-vue

https://psychoinformatics-de.github.io/shacl-vue/
MIT License

Find a general way to find graph nodes of a certain "type" #32

Open jsheunis opened 3 days ago

jsheunis commented 3 days ago

The context

In a form, there will be fields that need to present a list of existing "objects" that have been saved before (by the same user in the same browser session, or by other users), and that should allow a user to select a particular object from the list. Think of authors of a publication. If a given author on a publication already exists in the database, there is no need to enter the details again and the user can select the author from a list in order to add the author to the publication currently being entered.

The problem

https://github.com/psychoinformatics-de/shacl-vue/commit/043439ee01f8cc52f472937f5d6474fe5d0d3a6c introduces the use of grapoi (from within the rdf-ext scope) for graph traversal. See https://github.com/psychoinformatics-de/shacl-vue/issues/30 for a list of functionality related to rdf-ext and grapoi that I am currently aware of. Apart from known challenges with specifying the search terms correctly (which I'm hoping is purely because of my current unfamiliarity with the tool), there is also the question of which search terms to specify when trying to isolate a set of nodes of the same class. The current use case, implemented in the InstancesSelectEditor component, needs to find all instances in the graph database whose "type" (used here in an abstract sense) matches the class specified by the sh:class field of the property shape being rendered by the InstancesSelectEditor. Some representative code:

const literalNodes = rdf.grapoi({ dataset: graphData })
  .hasOut(predicateSelector, rdf.literal(String(propClassCurie), XSD.anyURI))
  .quads();

Here, predicateSelector is an argument to the traversal code because it is uncertain how the "type" should be specified. Setting aside the bias that comes with how dlco defines its terms, my intuition was that the traversal code should look for all nodes that have the predicate rdf:type and an object equal to the class of interest. For example, finding all schema:Person nodes in a dataset would be:

const results = rdf.grapoi({ dataset: graphData })
  .hasOut(ns.rdf.type, ns.schema.Person)
  .quads();

But dlco has meta_type (see https://concepts.datalad.org/s/thing/unreleased/, and https://concepts.datalad.org/s/thing/unreleased/meta_type/), which will be specified in data that needs to comply with a given dlco-based schema. If such data are contained in the graph dataset, they will, for example, be of the form:

https://example.org/ns/dataset/#ahill - https://concepts.datalad.org/s/thing/unreleased/meta_type - dldist:Person

and not

https://example.org/ns/dataset/#ahill - http://www.w3.org/1999/02/22-rdf-syntax-ns#type - dldist:Person

so the predicateSelector argument needs to be https://concepts.datalad.org/s/thing/unreleased/meta_type in that case, otherwise the relevant nodes (here https://example.org/ns/dataset/#ahill) won't be found and won't be listed in the InstancesSelectEditor component, which would make the component non-functional.
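To make the mismatch concrete, here is a minimal, dependency-free sketch. It does not use rdf-ext or grapoi: the graph is modelled as an array of plain triple objects, and the helper name subjectsWithType, the second example node and the ex:OtherClass value are all made up for illustration. It only shows that the same lookup returns nothing with rdf:type and finds the node with meta_type.

```javascript
const RDF_TYPE = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type';
const META_TYPE = 'https://concepts.datalad.org/s/thing/unreleased/meta_type';

// Hypothetical stand-in for the graph data: plain triples, not an rdf-ext dataset.
const triples = [
  // The triple from the example above.
  { subject: 'https://example.org/ns/dataset/#ahill', predicate: META_TYPE, object: 'dldist:Person' },
  // An invented node of another class, to show that it is filtered out.
  { subject: 'https://example.org/ns/dataset/#other', predicate: META_TYPE, object: 'ex:OtherClass' },
];

// Return the distinct subjects that point at `classId` via `predicate`.
function subjectsWithType(triples, predicate, classId) {
  const subjects = new Set();
  for (const t of triples) {
    if (t.predicate === predicate && t.object === classId) {
      subjects.add(t.subject);
    }
  }
  return [...subjects];
}

const viaRdfType = subjectsWithType(triples, RDF_TYPE, 'dldist:Person');
const viaMetaType = subjectsWithType(triples, META_TYPE, 'dldist:Person');
console.log(viaRdfType.length);  // no matches: the data never uses rdf:type
console.log(viaMetaType);        // contains the #ahill node
```

With real data, the same comparison would go through the grapoi traversal shown earlier, with predicateSelector set to one term or the other.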

The question

So the question is, how do we know when to use the default rdf:type vs any other term as predicate? Is this something that needs to be specified on the app level, e.g. via configuration? Is it something that could somehow be determined from the SHACL shapes (node and property shapes) that are used to autogenerate the form fields (of which the InstancesSelectEditor is but one)? The goal is to make shacl-vue general enough so that it doesn't need context-specific implementations based on specific vocabularies/schemas.
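As one possible shape for the app-level configuration option, the sketch below falls back to rdf:type unless the configuration lists other predicates. This is an assumption for discussion only: typePredicates is a hypothetical option name, not an existing shacl-vue setting, and the graph is again modelled as plain triple objects rather than an rdf-ext dataset.

```javascript
// Default predicate used when the app configuration says nothing else.
const DEFAULT_TYPE_PREDICATE = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type';

// `config.typePredicates` is a hypothetical option: a list of predicate IRIs
// that should all be treated as "is an instance of".
function resolveTypePredicates(config = {}) {
  const configured = config.typePredicates;
  if (Array.isArray(configured) && configured.length > 0) {
    return [...new Set(configured)]; // drop accidental duplicates
  }
  return [DEFAULT_TYPE_PREDICATE];
}

// Collect distinct subjects typed as `classId` through any accepted predicate.
function findInstances(triples, classId, config = {}) {
  const accepted = new Set(resolveTypePredicates(config));
  const subjects = new Set();
  for (const { subject, predicate, object } of triples) {
    if (accepted.has(predicate) && object === classId) {
      subjects.add(subject);
    }
  }
  return [...subjects];
}

// Usage with the example triple from this issue.
const exampleTriples = [
  {
    subject: 'https://example.org/ns/dataset/#ahill',
    predicate: 'https://concepts.datalad.org/s/thing/unreleased/meta_type',
    object: 'dldist:Person',
  },
];
const dlcoConfig = {
  typePredicates: [
    DEFAULT_TYPE_PREDICATE,
    'https://concepts.datalad.org/s/thing/unreleased/meta_type',
  ],
};
console.log(findInstances(exampleTriples, 'dldist:Person'));             // default config: no match
console.log(findInstances(exampleTriples, 'dldist:Person', dlcoConfig)); // finds #ahill
```

Accepting a list rather than a single predicate would let one deployment mix data that uses rdf:type with data that uses a schema-specific term. Whether the list could instead be derived from the SHACL shapes remains the open question.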