RDA-DMP-Common / hackathon-2020

RDA hackathon on maDMPs
The Unlicense

New version of the DMP Common Standard Ontology #5

Open JoaoMFCardoso opened 4 years ago

JoaoMFCardoso commented 4 years ago

As part of the ongoing effort to have different formats to represent the DMP Common Standard, I'm looking for help in creating a new version of the DCSO.

There are four main points of action:

1. Use some means (SHACL, ShEx or some other option) to represent the constraints in the DMP Common Standard.

2. Integrate the DCAT and DublinCore ontologies into the existing DCSO, thus reusing classes (and properties) as opposed to the current practice of redefining classes.

3. Decide how to represent the custom controlled vocabularies required for some of the existing fields in a way that allows for validation (e.g., the usage of iso-3166-1-alpha2 in the geo_location property in the Host class).

4. Provide the DCSO with a PURL, thus solving the current namespace issue: the current namespace is unsuitable for long-term preservation and reuse.

Edit: I'm currently solving issue 4, following the advice of robertgiessmann. Thanks!

Edit2: Issue 4 solved. https://w3id.org/dcso Thanks.
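
As an illustration of point 3, here is a minimal sketch of what a validation check for geo_location could look like outside the ontology itself. It is not part of the DCSO: the function name `check_geo_location` and the sample code list are made up for this example, and the list is only a small subset of ISO 3166-1 alpha-2, so a real check would need to load the complete list from an authoritative source.

```python
import re

# Illustrative subset only (an assumption for this sketch); a real validator
# would load the complete ISO 3166-1 alpha-2 code list.
SAMPLE_ALPHA2_CODES = frozenset({"AT", "DE", "GB", "PT", "SE", "US"})

# Alpha-2 codes are exactly two uppercase ASCII letters.
_ALPHA2_PATTERN = re.compile(r"[A-Z]{2}")


def check_geo_location(value, known_codes=SAMPLE_ALPHA2_CODES):
    """Return a list of problems; an empty list means the value passed."""
    if not isinstance(value, str) or not _ALPHA2_PATTERN.fullmatch(value):
        return ["geo_location must be two uppercase ASCII letters"]
    if value not in known_codes:
        return ["geo_location is not in the known code list: " + value]
    return []


if __name__ == "__main__":
    for candidate in ("PT", "pt", "ZZ"):
        print(candidate, check_geo_location(candidate))
```

The syntactic check and the membership check are kept separate so that a malformed value and an unknown code produce different messages.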

ljgarcia commented 4 years ago

@JoaoMFCardoso is SHACL the only option? Have you considered ShEx?

fekaputra commented 4 years ago

Hi, I would be interested in working on the ontology and SHACL constraints.

JoaoMFCardoso commented 4 years ago

> Hi, I will be interested to work on the ontology and shacl constraints

Have you registered for the hackathon teams? Look for the "unaturals".

Any help is more than welcome.

JoaoMFCardoso commented 4 years ago

> @JoaoMFCardoso is SHACL the only option? Have you considered ShEx?

Hi. I don't have any experience with either of the two, so I'm open to suggestions and help.

I'll be looking at both of them and learning as much as I can before the event.

Anyway, if you're interested in helping out, the "unaturals" team is more than glad to welcome you.

https://docs.google.com/spreadsheets/d/12bwx0KbY8BAIh24sJQgl1Grhu_oxkMty6ZvsQQxSZ2k/edit#gid=0

ljgarcia commented 4 years ago

Hi, I just signed up on the spreadsheet. I have worked more with ShEx than SHACL. A useful tool for both of them, though maybe with somewhat better support for ShEx, would be http://rdfshape.weso.es/. Tools for ShEx only would be http://shex.io/webapps/shex.js/doc/shex-simple.html and https://tools.wmflabs.org/shex-simple/wikidata/packages/shex-webapp/doc/shex-simple.html (these two seem to be the same; the second one is the one integrated with Wikidata Shapes, see for instance the link "check entities against this Schema" on https://www.wikidata.org/wiki/EntitySchema:E37).

ghost commented 4 years ago

Hello everyone, from what I learned lately, SHACL is the more current solution and is favored when building new tools. I strongly support all of the items; item 4 can be solved in 10 minutes with purl.org or w3id.org, and I am happy to help there!

Regarding item 3: highly interesting, do you have any ideas on that already? I put forward the dct:conformsTo property (https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#http://purl.org/dc/terms/conformsTo), but I don't know of any validators. I would be the happiest person on earth if you knew of one, or if we started building one!

ghost commented 4 years ago

P.S.: Allow me a bit of pessimism here: The goals are great, but this is too much for this hackathon. Let us focus on implementing the most important fraction of what is needed, with the aim to fulfill these goals in future hackathons. What do you think?

JoaoMFCardoso commented 4 years ago

> P.S.: Allow me a bit of pessimism here: The goals are great, but this is too much for this hackathon. Let us focus on implementing the most important fraction of what is needed, with the aim to fulfill these goals in future hackathons. What do you think?

Well, let me just start off by saying that I agree with you. Some of these goals are in fact difficult to achieve, in particular in a hackathon. To be completely honest here, these goals are basically my wishlist for the DCSO :) So if I end up getting help to achieve some of them, I'll be over the moon.

Answering your first post: the dct:conformsTo property seems a great solution for some cases. But there are some vocabularies in use in the DMP CSM that do not comply with any established standard (as far as I know, I'm prone to be wrong a lot :)).

For example, in the ethical_issues_exist data property, the vocabulary to conform to is (yes, no, unknown). So we'd still have to describe the "possible values" somewhere else.

Anyway I'm just glad more than one person is actually worrying about this issue :) Thank you all
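
One way the "possible values" for ethical_issues_exist could be written down is a SHACL `sh:in` constraint. The sketch below only generates the Turtle text for such a shape and does not run a validator. The `ex:` namespace and the names `DMPShape`, `DMP` and `ethicalIssuesExist` are placeholders invented for this example, not the actual DCSO IRIs.

```python
SH = "http://www.w3.org/ns/shacl#"
# Hypothetical namespace for this sketch; not the real DCSO namespace.
EX = "http://example.org/dcso-sketch#"


def turtle_literal(value):
    """Quote a plain string as a Turtle literal, escaping backslash and quote."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def enumeration_shape(shape, target_class, path, allowed):
    """Render a SHACL node shape restricting `path` to the `allowed` literals."""
    values = " ".join(turtle_literal(v) for v in allowed)
    return "\n".join([
        "@prefix sh: <" + SH + "> .",
        "@prefix ex: <" + EX + "> .",
        "",
        "ex:" + shape + " a sh:NodeShape ;",
        "    sh:targetClass ex:" + target_class + " ;",
        "    sh:property [",
        "        sh:path ex:" + path + " ;",
        "        sh:in ( " + values + " )",
        "    ] .",
    ])


shape_ttl = enumeration_shape(
    "DMPShape", "DMP", "ethicalIssuesExist", ["yes", "no", "unknown"]
)

if __name__ == "__main__":
    print(shape_ttl)
```

A ShEx value set such as `[ "yes" "no" "unknown" ]` would express the same restriction, so this particular requirement does not by itself decide between the two languages.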

JoaoMFCardoso commented 4 years ago

> Hi, I just signed up on the spreadsheet. I have worked more with ShEx than SHACL. A useful tool for both of them, though maybe with somewhat better support for ShEx, would be http://rdfshape.weso.es/. Tools for ShEx only would be http://shex.io/webapps/shex.js/doc/shex-simple.html and https://tools.wmflabs.org/shex-simple/wikidata/packages/shex-webapp/doc/shex-simple.html (these two seem to be the same; the second one is the one integrated with Wikidata Shapes, see for instance the link "check entities against this Schema" on https://www.wikidata.org/wiki/EntitySchema:E37).

Thank you for registering :) Again, on the SHACL vs ShEx issue: I was advised to use SHACL. But as I stated above, I have no experience using either of the two. I was hoping to get some help from someone who is up to date on the state of the art in that department.

paulwalk commented 4 years ago

I would recommend ShEx over SHACL. Both are supported by W3C but interest in and support for ShEx is steadily growing. Certainly in the DCMI community, ShEx is the one that people are talking about and building support for.

fekaputra commented 4 years ago

Hi, I would suggest SHACL over ShEx for two pragmatic reasons: (i) the W3C Recommendation status of SHACL vs the W3C Community Group status of ShEx; and (ii) tool support in major RDF libraries, e.g., RDF4J and Apache Jena.

ljgarcia commented 4 years ago

Hi, I think both SHACL and ShEx have good community support behind them. And, from what I have seen, there does not seem to be a "winner". You can always find pros and cons. I would personally suggest ShEx: I tried both and found ShEx easier, and I also found communities in the Life Sciences domain and Wikidata working on ShEx, so I did not look at SHACL again.

I would suggest choosing based on what the people working on the subject during the hackathon know best. It could be both: RDFShape, http://rdfshape.weso.es, converts from one to the other.

Regards,

On Thu, May 7, 2020 at 7:05 AM Fajar J. Ekaputra wrote:

> Hi, I would suggest SHACL over ShEx due to two pragmatic reasons: (i) the W3C standard status of SHACL vs W3C community group of ShEx; and (ii) tool support on major RDF library, e.g., RDF4J and Apache Jena.


ghost commented 4 years ago

Hey there, maybe this meetup about SHACL is relevant for this team: https://www.meetup.com/The-Berlin-Semantic-Web-Meetup-Group/events/270741285/?rv=ea1_v2&_xtd=gatlbWFpbF9jbGlja9oAJGVjZjQzM2ExLTllNjAtNDc5MS04Yjk5LTZhZDBlOTdkYWRlMA

froggypaule commented 4 years ago

I am also joining this group out of pure interest, as an observer (I am too much of a newbie to contribute at this stage). Thanks!

jomtov commented 4 years ago

Just a thought: if you will be reusing an existing ontology for the contributor role in the maDMP schema, currently defined as:

    "role": {
        "id": "#/properties/dmp/properties/contributor/items/properties/role",
        "minItems": 1,
        "type": "array",
        "title": "The Role Schema",
        "description": "Type of contributor",
        "items": {
            "id": "#/properties/dmp/properties/contributor/items/properties/role/items",
            "type": "string",
            "title": "The Contributor Role(s) Items Schema",
            "examples": ["Data Steward"]
        },

would you consider using e.g. this one from CASRAI: CRediT?
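
To make the suggestion concrete, here is a rough sketch of checking a contributor's role array against the 14 CRediT role names. The `check_roles` helper is invented for this example and is not part of the maDMP schema or any library; it also applies the `minItems: 1` rule from the schema fragment above. Note that the schema's own example value, "Data Steward", is not a CRediT role, so adopting CRediT as-is would change which values are accepted.

```python
# The 14 contributor roles defined by the CRediT taxonomy.
# The two "Writing" roles are spelled with an en dash in the taxonomy.
CREDIT_ROLES = (
    "Conceptualization",
    "Data curation",
    "Formal analysis",
    "Funding acquisition",
    "Investigation",
    "Methodology",
    "Project administration",
    "Resources",
    "Software",
    "Supervision",
    "Validation",
    "Visualization",
    "Writing \u2013 original draft",
    "Writing \u2013 review & editing",
)


def check_roles(roles):
    """Return a list of problems for a contributor's role array (hypothetical helper)."""
    problems = []
    if len(roles) < 1:
        problems.append("role must contain at least one item (minItems: 1)")
    # Compare case-insensitively so "data curation" matches "Data curation".
    known = {name.casefold() for name in CREDIT_ROLES}
    for role in roles:
        if role.casefold() not in known:
            problems.append("not a CRediT role: " + role)
    return problems


if __name__ == "__main__":
    print(check_roles(["Data curation", "Data Steward"]))
```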