RDFLib / rdflib

RDFLib is a Python library for working with RDF, a simple yet powerful language for representing information.
https://rdflib.readthedocs.org
BSD 3-Clause "New" or "Revised" License

Submit your DefinedNamespaces #1415

Closed nicholascar closed 2 years ago

nicholascar commented 3 years ago

As of rdflib 6.0.0, we have DefinedNamespaces which contain the elements of a namespace (ontology/vocabulary/profile etc) and make them available for type hinting and warning-based validation.

If people would like to create DefinedNamespaces for interesting/important/their favourite namespaces, we can include them in rdflib's rdflib.namespace package. This will make your namespaces more visible and less prone to error in use.

If you are interested, have a look at those already there and put a PR in to add yours.

dwinston commented 3 years ago

I would be interested in a DefinedNamespace for QUDT. @nicholascar you mentioned on the rdflib-dev list that you are on the Tech Advisory Board of QUDT, so I presume you would be interested in this inclusion as well. Are you planning to do this soon-ish anyway? I don't have much free time to start a PR at the moment, but I can see myself attempting one at some point. It is quite a large namespace / set of namespaces, but I would find its inclusion highly useful!

dwinston commented 3 years ago

Also, uhh...I'm seeing a default "If you're seeing this, you've successfully installed Tomcat. Congratulations!" page at qudt.org right now, so to be clear, I mean this: https://lov.linkeddata.es/dataset/lov/vocabs/qudt, or equivalently, the family of qudt*.ttl files I have stashed here.


nicholascar commented 3 years ago

Looks like QUDT is back now! I suspect they updated their server to the latest version of TopQuadrant's EDG and there was just a momentary boo boo while they did.

I would be interested in a DefinedNamespace for QUDT. @nicholascar you mentioned on the rdflib-dev list that you are on the Tech Advisory Board of QUDT, so I presume you would be interested in this inclusion as well.

Great, yes I am and yes I am! Silly I didn't think of that namespace before...

We have used a script to auto-process ontologies into DefinedNamespace but in this case there are multiple QUDT files so it might be easier just to SPARQL query them. I'll have a go.

westurner commented 3 years ago

I share the same concerns about inlining vocabs into rdflib core; even for autocompletion in an IDE.

nicholascar commented 3 years ago

Hi @westurner yes, there are concerns here, I don't mean to indicate there are not! We've been a bit lucky with easy - pretty static - namespaces to date. But it's really helpful to have the ability to have more out of the box in some form, even if we have to do more thinking than we've done so far to represent them well.

Agree with the versioning, not sure how to represent all the concerns perhaps. Some issues are dealt with I think:

So you won't be prevented from using terms not in an rdflib namespace; you just won't be assisted until someone updates the DefinedNamespace. I did this for PROV and XSD recently: the copies we had missed some terms. In both cases it was because those namespaces are compounded from multiple sources, i.e. not just one ontology/vocabulary.

westurner commented 3 years ago

How should versions be represented? IMHO, now would be the best time to:

While we often hope that vocab changes are only additive, there are non-additive vocab modifications that will result in mismatch between the DefinedNamespace and the vocab. Which - as you point out - only really affects IDEs.

It may be infeasible to also do e.g. SHACL data validation with these DefinedNamespaces? Is there functional overlap where there will already be some maintenance burden?

If google/schemarama (mentioned in the linked gdoc) does not support dynamic forms-based validations, what would it take to support such an essential data quality use case?

With regard to interactive data validation as a primary use case, what are these for: https://github.com/google/schema-dts

westurner commented 3 years ago

I did this for PROV and XSD recently: the copies we had missed some terms. In both cases it was because those namespaces are compounded from multiple sources, i.e. not just one ontology/vocabulary.

Is there already a way to e.g. generate the DefinedNamespace of each version of a vocab and diff?
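A rough sketch of the kind of diff being asked about, assuming the term names of each vocabulary version have already been extracted into plain lists. The `diff_terms` helper and the sample term lists are hypothetical and not an existing rdflib feature.

```python
def diff_terms(old_terms, new_terms):
    """Return (added, removed) term names between two vocabulary versions."""
    old, new = set(old_terms), set(new_terms)
    return sorted(new - old), sorted(old - new)


# Hypothetical term lists for two releases of one vocabulary.
v1_terms = ["Agent", "Activity", "wasGeneratedBy"]
v2_terms = ["Agent", "Activity", "generatedAtTime"]

added, removed = diff_terms(v1_terms, v2_terms)
```

A non-empty `removed` list would flag exactly the non-additive changes mentioned above, where a shipped DefinedNamespace no longer matches the published vocabulary.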

white-gecko commented 3 years ago

As already pointed out on the list, I also have some doubts about including a set of defined namespaces directly in rdflib. Vocabularies and their terms evolve over time. This would cause problems when we include the vocabularies as they are in the rdflib releases.

We have used a script to auto-process ontologies into DefinedNamespace (@nicholascar )

Do you have this script or can you point me to it?

nicholascar commented 3 years ago

@edmondchuc can you please point to the script you used to make the recent DefinedNamespace instances?

ajnelson-nist commented 3 years ago

On how to represent vocabulary versions - would it be reasonable to require a DefinedNamespace import only owl:Ontology definitions that designate an owl:versionIRI?

westurner commented 3 years ago

How many of e.g. these do not (yet) depend upon OWL and thus do not yet specify the version within the {RDFS vocabulary, or OWL ontology}?

Shouldn't vocabs (with e.g. SemVer or CalVer versions) be signed with a DID and ld-signatures?

On Tue, Sep 28, 2021, 16:46 Alex Nelson @.***> wrote:

On how to represent vocabulary versions - would it be reasonable to require a DefinedNamespace import only owl:Ontology definitions that designate an owl:versionIRI?


nicholascar commented 3 years ago

On how to represent vocabulary versions - would it be reasonable to require a DefinedNamespace import only owl:Ontology definitions that designate an owl:versionIRI?

I don't think we could do that. We shouldn't force rdflib users to commit to an OWL property, given that rdflib is RDF general purpose. Users might have a namespace that would be a fine DefinedNamespace but doesn't use any OWL.

nicholascar commented 3 years ago

The DefinedNamespace generator script that was used for rdflib 6.0.0 is here: https://github.com/hsolbrig/definednamespace/
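To give a feel for what such a generator has to produce, here is a minimal sketch that renders DefinedNamespace source code from a list of term names. It is not the linked script. The `render_defined_namespace` function, the `EX` class name and the example IRI are all hypothetical, and the use of `_extras` for names that cannot be Python attributes reflects my understanding of how rdflib's bundled namespaces handle them.

```python
import keyword


def render_defined_namespace(class_name, namespace_iri, terms):
    """Render Python source for a DefinedNamespace subclass.

    Terms that cannot be used as Python attribute names (keywords or
    non-identifiers) are listed in ``_extras`` instead of being annotated.
    """
    unique = set(terms)
    attrs = sorted(
        t for t in unique if t.isidentifier() and not keyword.iskeyword(t)
    )
    extras = sorted(unique - set(attrs))
    lines = [
        "from rdflib.namespace import DefinedNamespace, Namespace",
        "from rdflib.term import URIRef",
        "",
        "",
        f"class {class_name}(DefinedNamespace):",
        f'    _NS = Namespace("{namespace_iri}")',
    ]
    if extras:
        lines.append(f"    _extras = {extras!r}")
    lines.extend(f"    {name}: URIRef" for name in attrs)
    return "\n".join(lines) + "\n"


# Hypothetical input: two ordinary terms and one that is a Python keyword.
source = render_defined_namespace(
    "EX", "https://example.org/vocab#", ["Widget", "hasPart", "class"]
)
```

In practice the term list would come from parsing the vocabulary files (or from a SPARQL query over them) rather than being typed by hand.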

We could develop a DefinedNamespace repository with GitHub Actions or similar to process inputs - vocabs and ontologies - into DefinedNamespaces.

Following on from my previous comment: I think we should impose essentially no requirements on users of a repo that automatically makes DefinedNamespaces; however, we could certainly encourage owl:versionIRI and other ontology annotation properties.

ajnelson-nist commented 3 years ago

Thanks for the discussion on owl:versionIRI. I agree with not imposing it as a requirement, and retract the suggestion.

However, looking at this from the other direction - if a namespace does provide an owl:versionIRI, should the guidance to anyone creating a DefinedNamespace be to use the version IRI instead of the ontology IRI?