gbif / ipt

GBIF Integrated Publishing Toolkit (IPT)
https://www.gbif.org/ipt
Apache License 2.0

Document which DCAT profile and version is used #1817

Open frafra opened 1 year ago

frafra commented 1 year ago

I found that the DCAT endpoint was implemented in the IPT in 2015, targeting DCAT-AP (1.0?), but neither the IPT repository nor its documentation mentions which DCAT version or profile was targeted.

Reference: https://github.com/inbo/ipt-dcat

peterdesmet commented 1 year ago

👋 I coordinated the 2015 summer of code student project that added a DCAT feed to the IPT (with issues indeed documented in https://github.com/inbo/ipt-dcat). I don't remember what version of DCAT-AP we were implementing at the time, but given the release dates of the versions of DCAT-AP it is probably version 1.0.

The DCAT feed has indeed always been a somewhat hidden and undocumented (only mention is here) feature of the IPT and has to my knowledge not been updated since.

It would probably be good to know who makes use of the DCAT feed feature of the IPT to assess if it needs to be maintained, updated or deprecated.

🙋‍♂️ We make use of it at INBO, but for a fairly outdated workflow and we could probably do without it.

frafra commented 1 year ago

Hi! Thank you! I see that there have been some fixes, so I guess someone is using it :)

mike-podolskiy90 commented 1 year ago

Thank you @frafra for raising this issue. I would assume DCAT is not that popular, and I'd personally prefer to deprecate it. But of course it depends on how many users still use it.

marc-portier commented 1 year ago

My take:

DCAT is an open standard for sharing datasets, and was adopted into schema.org. It is also the EU-adopted way to share all datasets (not only ones about biodiversity). The upshot is that all major search engines know how to deal with it. (Try convincing them to harvest IPT and read DwC-A.)

Having working DCAT support in IPT nodes offers a way to 'publish' your datasets up into Google Datasets as well as to adhere to EU standards for sharing data with other government bodies. All of that should be considered useful.

So the DCAT support is opening up the ipt nodes to #openscience. Almost naturally it looks at use cases outside the strict science domain: cases we might be unfamiliar with, serving an audience we do not have direct contact with. But that is a good thing.

Because it is outside our focus, though, we under-document and neglect it: the DCAT feed still does not parse as valid text/turtle. So before questioning its popularity or actual use, shouldn't we make an effort to make it usable?

IMHO, rather than deprecating it, we should give it proper attention. Jokingly: if we are only serving our own community, we might as well plead to go back to writing all biodiversity papers in Latin :)

See also:

MattBlissett commented 1 year ago

DCAT isn't used by many institutions, but it is important for those that do use it.

The Content-Type on https://ipt.gbif-uat.org/dcat is incorrect. It should be text/turtle, or maybe text/turtle;charset=UTF-8.
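For anyone who wants to check this on their own IPT, here is a minimal sketch of the header check described above. It only inspects a Content-Type string; it does not fetch anything and it is not part of the IPT. The function names are made up for illustration, and accepting a missing charset as valid is an assumption based on the "text/turtle or maybe text/turtle;charset=UTF-8" wording.

```python
from email.message import Message
from typing import Optional, Tuple


def parse_content_type(header_value: str) -> Tuple[str, Optional[str]]:
    """Split a Content-Type header into (media type, charset).

    Both parts are lower-cased by the stdlib parser; charset is None
    when the header does not carry one.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_type(), msg.get_content_charset()


def is_turtle_content_type(header_value: str) -> bool:
    """Return True for text/turtle, with no charset or with UTF-8.

    Hypothetical helper: the accepted charset values are an assumption,
    not something the IPT documentation prescribes.
    """
    media_type, charset = parse_content_type(header_value)
    return media_type == "text/turtle" and charset in (None, "utf-8")


if __name__ == "__main__":
    for value in ("text/turtle;charset=UTF-8", "text/turtle", "text/plain"):
        print(value, "->", is_turtle_content_type(value))
```

The header value itself could come from any HTTP client, for example the `Content-Type` entry in the response headers returned by `urllib.request.urlopen`.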

More importantly, http://ttl.summerofcode.be/ reports syntax errors on https://ipt.gbif.org/dcat although not on https://ipt.gbif-uat.org/dcat. (Do we have unreleased changes?)

http://www.dcat.be:8080/validator/ reports errors on https://ipt.gbif-uat.org/dcat

If we are sure what version we implement, we can add it to https://ipt.gbif.org/manual/en/ipt/2.5/faq#how-can-i-export-a-list-of-resources-published-in-my-ipt

albenson-usgs commented 1 year ago

But aren't datasets already getting picked up by Google Datasets via the JSON-LD metadata provided by GBIF?

marc-portier commented 1 year ago

@albenson-usgs thx for pointing out - very valid remark.

Given their similar goal (and the fact that dcat can be serialized in json-ld as well) the better approach would be to have the implementations aligned.
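To make the "DCAT can be serialized in JSON-LD" point concrete, here is a rough sketch of a single dataset described with DCAT terms and written out as JSON-LD. It is not the output of the IPT feed, nor the JSON-LD that GBIF provides; the function name, the example URIs, and the choice of properties are all assumptions made for illustration. Only the `dcat` and `dct` namespace IRIs are the standard ones.

```python
import json


def dcat_dataset_jsonld(dataset_uri: str, title: str, description: str,
                        download_url: str) -> dict:
    """Build a minimal JSON-LD document for one dcat:Dataset.

    Hypothetical helper: the property selection is illustrative and
    does not claim to satisfy any particular DCAT-AP version.
    """
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
        },
        "@id": dataset_uri,
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dct:description": description,
        "dcat:distribution": {
            "@type": "dcat:Distribution",
            # An {"@id": ...} object marks the value as an IRI, not a string.
            "dcat:downloadURL": {"@id": download_url},
        },
    }


if __name__ == "__main__":
    # Placeholder URIs, not real IPT resources.
    doc = dcat_dataset_jsonld(
        "https://example.org/ipt/resource?r=demo",
        "Demo occurrence dataset",
        "A placeholder dataset used to illustrate the structure.",
        "https://example.org/ipt/archive.do?r=demo",
    )
    print(json.dumps(doc, indent=2))
```

The same description could equally be written as Turtle, which is why aligning the two implementations looks feasible in principle.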

mike-podolskiy90 commented 1 year ago

I've changed the content type to text/turtle;charset=UTF-8, and I also fixed the validation errors here: #1816

mike-podolskiy90 commented 1 year ago

I guess the version is indeed 1.0. I'll investigate.

frafra commented 1 year ago

Thank you @mike-podolskiy90 for the effort :)