archesproject / arches

Arches is a web platform for creating, managing, & visualizing geospatial data. Arches was inspired by the needs of the Cultural Heritage community, particularly the widespread need of organizations to build & manage cultural heritage inventories
GNU Affero General Public License v3.0

Access to inverse relationship values #5290

Closed azaroth42 closed 4 years ago

azaroth42 commented 5 years ago

Background

When dividing up the overall knowledge into resource models and then linking between instances of those models, there is always a design choice as to when to split and how to link. In particular, one choice is whether to link from A to B, from B to A, or try to keep both directions synchronized.

For example, a Group of people and the individual people within the group are separate resource instances of the Group and Person models respectively. They are related by either has_member or is_member_of relationships. Thus the person can have an is_member_of link to the group, or the group can have a has_member link to the person, or both.

This design choice creates several challenges. If only a single direction is materialized in the data (e.g. is_member_of from person to group), then looking at the range resource (the group) will not reveal who its members are. However, without additional functions to manage the synchronization of the properties, materializing both directions is a lot of overhead for editors entering data, and the two are very likely to get out of sync.

Alternatives

There are thus two possible solutions:

  1. Functions to manage setting and unsetting properties automatically in a different model, based on the data entered. This would involve a good deal of work (it seems to me) in terms of configuring which node in the range model should be manipulated both for configuration of the models, and initially for implementing a UI to do that. Even if this was implemented, for long lists (a Page instance being part_of a Book) this would likely overwhelm the report for the range instance.
  2. A separate JSON-LD API and human-intended UI that automatically generates all of the inverse relationships, based on configuring them in the ontology rather than per model. is_member_of is always the inverse of has_member, so there's no need to have to assert this within the resource model. This also solves the long list issue, in that the list of pages of a book (or members in a group) would be not part of the basic UI, but instead could be requested. The challenge that it does introduce is a lack of the card to handle the report, but I think this can be mitigated, as these will always be resource-instance data types. Perhaps the outbound card could be extended to have an inverse card label that would be used.
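A minimal sketch of the idea behind the second option, assuming the inverse pairs live in a single ontology-level table rather than in each resource model. The table, the `derive_inverses` function, and the instance identifiers are hypothetical and not part of Arches; only the property names come from the discussion above.

```python
# Hypothetical ontology-level inverse table. Only the property names
# (has_member / is_member_of) are taken from the discussion above.
INVERSE_OF = {"has_member": "is_member_of"}

# Make the table symmetric so a lookup works in either direction.
INVERSE_OF.update({v: k for k, v in list(INVERSE_OF.items())})


def derive_inverses(links):
    """Return the inverse links implied by (source, property, target) links.

    Links whose property has no known inverse are skipped.
    """
    inverses = []
    for source, prop, target in links:
        inverse_prop = INVERSE_OF.get(prop)
        if inverse_prop is not None:
            inverses.append((target, inverse_prop, source))
    return inverses


# Only the person -> group direction is stored (illustrative ids).
links = [
    ("person-1", "is_member_of", "group-1"),
    ("person-2", "is_member_of", "group-1"),
]
group_view = derive_inverses(links)
```

Because the pairing is declared once, nothing has to be asserted per resource model; the group's members are computed on request from the links that people already carry.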

Preferred Solution

I prefer the second option above, where a separate call automatically retrieves the inbound links.

The call could be handled by appending /inverses to the end of the report or json-ld URL: http://arches.org/resources/uuid-of-instance/inverses would then list all of the inverses.

This would involve having a version of the ontology that manages inverseOf, such as the linked.art version of CIDOC-CRM. The list of inverses for 6.2 is also easy to generate, and can be provided. Those inverses would need to be added to the ontology management in postgres.

Then elastic might need to be updated slightly to record the inbound relationships. Plus the code to then interact with elastic to retrieve the inbound relationships for a given resource to be handed off to either a report or the json-ld API.
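To illustrate recording and retrieving inbound relationships, here is a rough in-memory sketch. It stands in for whatever the search index would actually store; the `InboundIndex` class, its method names, and the paging parameters are assumptions for illustration only, not Arches or Elasticsearch APIs.

```python
from collections import defaultdict


class InboundIndex:
    """Hypothetical stand-in for an index of inbound resource links."""

    def __init__(self):
        # target resource id -> list of (property, source resource id)
        self._inbound = defaultdict(list)

    def record_link(self, source_id, prop, target_id):
        """Record an outbound link so it can later be found from its target."""
        self._inbound[target_id].append((prop, source_id))

    def inbound(self, target_id, offset=0, limit=10):
        """Return one page of inbound links for a resource.

        Paging keeps long lists (e.g. the pages of a book) out of the
        basic report unless they are explicitly requested.
        """
        return self._inbound.get(target_id, [])[offset:offset + limit]


index = InboundIndex()
for n in range(1, 4):
    index.record_link(f"page-{n}", "part_of", "book-1")

first_page = index.inbound("book-1", limit=2)
```

A report or a JSON-LD endpoint could then map each stored property to its inverse label before presenting the result.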

apeters commented 5 years ago

@azaroth42 Thanks for the very detailed write up, but I feel like we've jumped right to a solution without defining the use case first. I think we need to have a very clear definition of the use case(s) first before we can implement a solution.

Just for complete clarity, as far as I know, this is in relation to the resource-instance datatype. The problem currently with the resource-instance datatype as I understand it is that it's unidirectional.

For example, if I have a Page model with a resource-instance datatype that points to a Book model, when I look at a report for that Book instance I can't currently see that a Page instance points to it.

That's the problem we want to solve, correct? That in reports and presumably elsewhere (search, exports, etc.) we want to see the other resources that point to the resource we're interested in. Additionally, we want to do this with the correct semantics. That is, when we view our Book instance we want to see is_composed_of references to its Pages.

Additionally, we don't want to have to manually enter the inverse relationships as they can all be inferred.

Finally, is the resource-instance datatype really at issue here? All the examples I've seen seem to relate WHOLE instances to each other (as if relating the root node of each instance to each other).

azaroth42 commented 5 years ago

[...]

That is the problem to solve, yes.

Two responses for the datatype question...

  1. The resource-instance datatype is the method for linking resources, with ontology managed relationships. The "related resource" pattern doesn't have a relationship type to have an inverse.
  2. There are also use cases for resource-instance in branches, for example an Object is_produced_by a Production that is carried_out_by an Actor -- we would want to see all the activities that the actor has carried out. Equally, a Person is identified_by a Name, which is referred_to_by a Linguistic Object, where the Linguistic Object is a resource model instance that represents a particular textual work.

HTH

apeters commented 5 years ago

So, just to clarify:

> The resource-instance datatype is the method for linking resources, with ontology managed relationships. The "related resource" pattern doesn't have a relationship type to have an inverse.

Our "related resource" pattern actually does employ ontology-derived relationship types.

[Screenshot: Screen Shot 2019-09-19 at 3 53 09 PM]

mradamcox commented 5 years ago

This is something I proposed a long time ago: https://github.com/archesproject/arches/issues/3154

I have thoughts on this, as I am working on handling it in my own way for a current project, but we can't have two tickets open discussing the same thing. I'll leave it up to you all to close one, etc. and will comment when that is handled.

apeters commented 5 years ago

closing #3154 in favor of this ticket

azaroth42 commented 5 years ago

Oh, apologies, indeed it does have an ontology reflection! I think a consolidation of resource-instance and related resource is the topic of another ticket, but it would also be valuable to consider the effects here (and generally for import/export).

apeters commented 5 years ago

I think the solution here would necessitate the consolidation of the resource-instance node functionality and related resources functionality.

mradamcox commented 5 years ago

Ok, so the direction I am currently going on this front is much less intense, even than what I proposed in that last ticket.

My current use case: I need these "inverse relationships" to be reflected in reports. In other words, I have a resource model for "Scout Report" which has a resource-instance node that links the report to an instance of the "Archaeological Site" resource model. But in the Archaeological Site resource report, I want to see all of the Scout Reports that reference that site. (There is nothing in the Archaeological Site resource model itself that has anything to do with the Scout Report resource model.)

Very simply: I'm adding a section in settings.py called REPORT_INLINES (a reference to the way Django handles this stuff in the admin interface). Defining inlines for a resource looks approximately (this is in development) like this:

REPORT_INLINES = {
    "Archaeological Site" : [
        {
            "title":"Scout Reports",
            "inline_model":"Scout Report",
            "node_to_look_in":"FMSF Site ID"
        }
    ]
}

You can probably figure out from my summary above, but here we have a resource model "Archaeological Site" for which a list of "inlines" can be defined. There is one defined, which has a title (to use in the report), a resource model, and a node name. Some logic in the report view is added which furnishes a list of resources to the report itself.
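The report-view logic is not shown in this thread, so the following is only a guess at its general shape. `get_inlines`, the `resources` list, and the dict layout of each resource are all hypothetical; only the REPORT_INLINES structure is taken from the snippet above.

```python
REPORT_INLINES = {
    "Archaeological Site": [
        {
            "title": "Scout Reports",
            "inline_model": "Scout Report",
            "node_to_look_in": "FMSF Site ID",
        }
    ]
}


def get_inlines(model_name, resource_id, resources, report_inlines=REPORT_INLINES):
    """Collect, per configured inline, the resources that point at resource_id.

    `resources` is a hypothetical list of dicts of the form
    {"id": ..., "model": ..., "nodes": {node name: [linked resource ids]}}.
    """
    sections = []
    for inline in report_inlines.get(model_name, []):
        matches = [
            r["id"]
            for r in resources
            if r["model"] == inline["inline_model"]
            and resource_id in r["nodes"].get(inline["node_to_look_in"], [])
        ]
        sections.append({"title": inline["title"], "resources": matches})
    return sections


# Illustrative data only.
resources = [
    {"id": "report-1", "model": "Scout Report", "nodes": {"FMSF Site ID": ["site-1"]}},
    {"id": "report-2", "model": "Scout Report", "nodes": {"FMSF Site ID": ["site-2"]}},
]
sections = get_inlines("Archaeological Site", "site-1", resources)
```

The report template would then render one section per entry, using the configured title as its heading.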

It would certainly be more slick to have this inline definition in a function that is attached to the Archaeological Site resource model; I'm just starting simple with settings.py for now.

@azaroth42 I like your idea of an inverse view... I essentially have made that but in a more baked-in, use-case specific way. I'll keep it in mind as I move forward.

@apeters I agree with you that consolidating the resource instance node and resource to resource relationships should happen as well. In my opinion, the first step is to make an effort to create better visualization options for resource-instance nodes (because, when it comes down to it, the node-style "graph" visualization is the only thing one really gets out of related resources anyway; there's no other use for them). This view that Rob describes would be a good way to pull the information needed to create the d3 graph and apply that to resource-instance nodes, for example.

chiatt commented 4 years ago

@azaroth42 Is this still relevant now that the new resource x resource UI is completed?

azaroth42 commented 4 years ago

I think we could close this issue and make a more specific one about how to get access to the inverses via JSON-LD, but I think only the human UI (for setting and viewing) is done right now.

azaroth42 commented 4 years ago

Closing, fixed, and will make the JSON-LD issue