siuc-nate closed this 2 weeks ago
@siuc-nate, you state with regard to the need for "cataloguing guidance/best practices" like the best practices with DataOne:
"May be somewhat redundant with the information in the handbooks (or it may replace some of that information)"
I don't agree. Nowhere does the handbook get down in the weeds with best practices. The Handbook and a Best Practices Guide are complementary. Look at the 10 entries here: https://github.com/CredentialEngine/CatalogingGuidance. Check out Issue 6.
The Handbook should not be filled with such advice. The Best Practices Guide could/should be populated with advice in all areas of creating data and picking properties where discretion is exercised.
@siuc-nate, you state with regard to the need for a means to "comment" on each term table (could also be in the release history for terms in pending):
"Would need some basic anti-spam prevention and some kind of a display page"
As for the display page, there would be no need for any sort of public display. Comments that lead to issues can be raised as GitHub issues by those monitoring incoming comments.
We currently have no simple, single, easily discovered, open mechanism for commenting on terms: nothing that isn't constraining for some (e.g., GitHub), and nothing that is easily found near the points in the term tables and other documentation where comments might arise. I've been around this project for a while, and I don't know where comments on terms should be registered except on GitHub. Perhaps I SHOULD know and don't... so consider me the canary in the coal mine: others probably haven't got a clue either.
@stuartasutton
"I don't agree. Nowhere does the handbook get down in the weeds with best practices. The Handbook and a Best Practices Guide are complementary. Look at the 10 entries here: https://github.com/CredentialEngine/CatalogingGuidance. Check out Issue 6."
Got it, updated the original post
As for #638 Enable inverse property declarations, the issue is not so much whether we can include the inverseOf property in terms of CTDL declarations, but rather how those declarations are handled by the Registry. Does the registry automatically create the inverse data when it encounters actual data for an inverseOf property; e.g.
Schema declaration: "husband" inverseOf "wife"
Data in DB: "Shakespeare wife Hathaway"
Query 1: "Who is Shakespeare's wife?"
Query 2: "Who is Hathaway's husband?" (inverse)
Will we be able to do Query 2?
It does not, mostly because the data comes from many sources and our policy requires primary-source (directly or by proxy) information. Allowing automatic inverse connections would conflict with that policy (e.g., a (credential)-[accreditedBy]->(some org) assertion would auto-generate a (some org)-[accredits]->(credential) connection that no primary source asserted).
We can still accommodate your queries though, since the search API enables crawling connections in reverse:
// Given this data
(person:Shakespeare)-[hasWife]->(person:Hathaway)

// Find Shakespeare's wife
// (literally: return all objects where the "hasWife" connection originates from "person:Shakespeare")
{
  "^hasWife": {
    "@id": "person:Shakespeare"
  }
}

// Find Hathaway's husband
// (literally: return all objects where "hasWife" references "person:Hathaway")
{
  "hasWife": {
    "@id": "person:Hathaway"
  }
}
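For illustration, the two lookups above can be modeled over an in-memory list of triples. This is only a minimal sketch of the forward and reverse ("^") matching described in the comments, not the Registry's search implementation; the `search` function and the `TRIPLES` list are hypothetical names introduced here.

```python
# Hypothetical in-memory model of the example data; NOT the Registry's search code.
TRIPLES = [("person:Shakespeare", "hasWife", "person:Hathaway")]


def search(triples, query):
    """Return the sorted IDs of resources that match every clause in the query.

    A plain key such as "hasWife" matches resources whose own property
    references the given @id. A key prefixed with "^" matches resources that
    the given @id points at through that property (the reverse crawl).
    """
    results = None
    for key, value in query.items():
        target = value["@id"]
        if key.startswith("^"):
            prop = key[1:]
            matches = {o for s, p, o in triples if p == prop and s == target}
        else:
            matches = {s for s, p, o in triples if p == key and o == target}
        # Every clause must match, so intersect with earlier clauses.
        results = matches if results is None else results & matches
    return sorted(results or [])


wife = search(TRIPLES, {"^hasWife": {"@id": "person:Shakespeare"}})
husband = search(TRIPLES, {"hasWife": {"@id": "person:Hathaway"}})
```

Both lookups read the single asserted `hasWife` triple; no `hasHusband` data is created or consulted.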
Nate, but let's be clear, you are talking about Registry policy. Declaring inverse properties in CTDL has no such policy constraints.
As for the API solving the problem, I don't think your result does what the following intends:
{
  "hasHusband": {
    "@id": "person:Hathaway"
  }
}
In other words, there is no `Hathaway hasHusband Shakespeare` triple directly added to the database when the triple `Shakespeare hasWife Hathaway` is added (as could be in a triplestore) or handled at the time of query. While the inverse can be inferred by humans from your result, there's nothing definitive. There is no policy constraint on Shakespeare asserting that Hathaway married him.
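To make the "directly added to the database" option concrete, here is a minimal sketch of what materializing inverses at load time could look like. It assumes a hypothetical `INVERSE_OF` mapping standing in for the schema declaration and a hypothetical `materialize_inverses` helper; it does not describe how the Registry or any particular triplestore behaves.

```python
# Hypothetical stand-in for a schema that declares hasWife inverseOf hasHusband.
INVERSE_OF = {"hasWife": "hasHusband", "hasHusband": "hasWife"}


def materialize_inverses(triples, inverse_of):
    """Return the asserted triples plus one generated inverse triple for each
    asserted triple whose property has a declared inverse."""
    closed = set(triples)
    for s, p, o in triples:
        if p in inverse_of:
            # Swap subject and object under the inverse property.
            closed.add((o, inverse_of[p], s))
    return closed


asserted = {("person:Shakespeare", "hasWife", "person:Hathaway")}
stored = materialize_inverses(asserted, INVERSE_OF)
```

After this step `stored` holds both the asserted `hasWife` triple and a generated `hasHusband` triple, so a direct `hasHusband` lookup would succeed without anyone having asserted it.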
"Nate, but let's be clear, you are talking about Registry policy. Declaring inverse properties in CTDL has no such policy constraints."
True, but your question was specifically about the registry:
"Does the registry automatically create the inverse data when it encounters actual data for an inverseOf property; e.g. [...]"
"In other words, there is no Hathaway hasHusband Shakespeare triple directly added to the database (as could be in a triplestore) or handled at the time of query."
Correct, because none was asserted in the source/first-party data.
In someone else's implementation that doesn't care about that, they could turn on the automatic inverse calculations and have the generated hasHusband property; but again, you asked about the Registry. In any event, you're right that CTDL doesn't disallow inverse properties, but the schema manager doesn't currently have a means of supporting them (hence the bullet point in the original post above).
@siuc-nate, there is a difference between being able to handle inverse properties and having a policy that says "no". I think we are on the same page there. But there is a layer below that in terms of the Registry where we can do it but have a blanket policy (at the moment) that says we don't.
I think I am just reiterating what Stuart has said (and maybe what Nate knows), but it's important to remember that declaring hasHusband as an inverse of hasWife does not mean that you have to add a complementary hasWife property every time someone adds a hasHusband property. In fact, it means that you do not have to add one (if you're willing to trust inferences).
Declaring inverse properties would embed the potential of reverse searches into CTDL. It's actually no different to other term-to-term or concept-to-concept relationships like rdfs:subPropertyOf, rdfs:subClassOf, owl:equivalentClass, owl:equivalentProperty, skos:exactMatch, skos:narrower and so on: it's just another relationship that may be used when broadening searches to include results inferred from the schema rather than directly asserted in the data.
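As a rough sketch of that point, a consumer could leave the stored data untouched and only consult the declared inverse when it chooses to broaden a search. The `INVERSE_OF` mapping, the `subjects_of` function and its `infer_inverses` flag are all hypothetical names for illustration, not part of CTDL or the Registry API.

```python
# Hypothetical stand-in for a schema that declares hasWife inverseOf hasHusband.
INVERSE_OF = {"hasWife": "hasHusband", "hasHusband": "hasWife"}

# Only the directly asserted data; nothing is added to it.
DATA = [("person:Shakespeare", "hasWife", "person:Hathaway")]


def subjects_of(triples, prop, obj, infer_inverses=False):
    """Return sorted subjects s where (s, prop, obj) is asserted.

    With infer_inverses=True, also return s where (obj, inverse(prop), s)
    is asserted, i.e. results inferred from the schema declaration.
    """
    found = {s for s, p, o in triples if p == prop and o == obj}
    inverse = INVERSE_OF.get(prop)
    if infer_inverses and inverse is not None:
        found |= {o for s, p, o in triples if p == inverse and s == obj}
    return sorted(found)


asserted_only = subjects_of(DATA, "hasHusband", "person:Shakespeare")
with_inference = subjects_of(DATA, "hasHusband", "person:Shakespeare", infer_inverses=True)
```

With the flag off, only asserted data is returned; with it on, the inverse declaration broadens the result. Whether to switch it on is an implementation choice, separate from whether the schema declares the inverse.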
@stuartasutton Yes, I think we're saying the same thing.
@philbarker Whether the data is actually present, or appears to be present because it's inferenced, the result is the same (at least as far as the search API goes) - it will appear that there are inverse connections where none exist in the real data. That leads to a (QA Org)-[accredits]->(credential) appearing to be there any time a credential (perhaps incorrectly) asserts (credential)-[accreditedBy]->(QA Org). We don't want false/unconfirmed inverse assertions to be part of the data set, inferenced or not.
@siuc-nate
"@philbarker Whether the data is actually present, or appears to be present because it's inferenced, the result is the same (at least as far as the search API goes) - it will appear that there are inverse connections where none exist in the real data. That leads to a (QA Org)-[accredits]->(credential) appearing to be there any time a credential (perhaps incorrectly) asserts (credential)-[accreditedBy]->(QA Org)."
That would only happen if you choose (or chose) to implement the search API that way. If it's not the behavior you want why would you choose to do it?
We wouldn't; that's the point I was making.
So I don't see the problem. It's a change to CTDL that won't affect the registry.
I think it was a mistake to merge the discussion of over a dozen different issues into one thread. Projects and tags would be a better way of keeping track of several issues that relate to the same component.
"It's a change to CTDL that won't affect the registry."
Agreed. But Stuart asked me about the registry, which is why it came up.
"I think it was a mistake to merge the discussion of over a dozen different issues into one thread."
I disagree. The majority of these had only one post in their respective threads and are small enough items that I don't see a problem aggregating them together. Projects and tags mean we'd still have a dozen extra issues scattered throughout our issue list. We can reopen a closed issue if it becomes significant enough.
Archiving this to reduce clutter. It will still be a very useful checklist of things for the new schema manager to be able to do, once I have time to resume working on it.
This is to consolidate the various other issues regarding limitations of the schema management system, and track any new ones that come up.
General:
Editor improvements
Infrastructure improvements
#749 Enable getting the schema serialization without the concept schemes/concepts
#763 Enable export of SHACL for the schema/policy
Support the use of meta:TermStatusType throughout the schema manager
Issues affecting the schema directly:
#638 Enable inverse property declarations
#675 (Comment) Enable skos:broadMatch/skos:narrowMatch for concepts
#537 Enable better history/change tracking
#807 Enable skos:relatedMatch
Accommodate the terms/relationships to external schemas identified in this document
Enable annotating a borrowed term using skos:scopeNote
Content management enhancements:
#562 Enable public comments on term tables
#584 Create a means of cataloguing guidance/best practices using a schema such as Data One
Updates to the schema.org mapping for Embeddable Credentials (EOCreds):
#681 Update mapping for ceterms:identifier
#686 Update mapping for credential type
#687 Update mapping for @type
#688 Update mapping for cost profile
Other Ideas