stream-project / WP4R

Work Package 4 (Lead: Infai)

Extend the DCAT to CKAN mapping #31

Open TBoonX opened 3 years ago

TBoonX commented 3 years ago

We need to extend and correct the mapping: we need more metadata fields, and some of the existing mappings are wrong.

TBoonX commented 3 years ago

I started working on a custom profile, but I have not yet been able to enable it. I have no clue how to continue and will come back to this task when I have one.

TBoonX commented 3 years ago

I found the issue. I will continue with this task.

TBoonX commented 3 years ago

The DCAT extension uses the API differently from what we thought: only the /catalog route is used. Either the endpoints have to be changed or part of the extension has to be rewritten.

TBoonX commented 3 years ago

FHI adapted the NOMAD DCAT API and I adapted the code accordingly. The harvest is good enough for now. I will redo the harvest on the server once the next change to the NOMAD API is released.

TBoonX commented 2 years ago

@markus1978 is the DOI (see the description of this issue) included in the RDF response? If yes, how? If not, do you have time to work on it in the next month or so? Thanks.

markus1978 commented 2 years ago

No, currently I don't see any DOI in our mapping to DCAT

TBoonX commented 2 years ago

> No, currently I don't see any DOI in our mapping to DCAT

Thanks for the info. I have an issue about verifying that DOIs are also read from the DCAT interface: https://github.com/stream-project/WP4R/issues/36. It also contains a list of datasets which have DOIs. @markus1978 I had the impression that this is quite important, so please discuss it with Carsten or someone else on your side. Once the DOI is available in the DCAT interface, the change on my side will be quite fast.

@yoavnash is my impression correct that DSMS will not provide DOIs within the lifetime of this project? If so, you could consider at least supporting DOIs via your interface, so that this can be reported.

yoavnash commented 2 years ago

Matthias suggests that we do it via TIB. I will send an email to Tatyana asking her if it's possible.

markus1978 commented 2 years ago

I added "something" to the nomad dcat API.

I could not use identifier on our dcat:Dataset objects. Our dcat:Dataset objects are calculations, and calculations do not get DOIs on nomad. Our users curate many calculations into larger nomad datasets, and those have a DOI. Basically, a DOI for a nomad calculation (or dcat:Dataset) is not unique. Instead, I added a dcat:Distribution to those calculations that have a DOI (via a nomad dataset). The distribution represents the nomad dataset and uses the respective DOI as an identifier. Note that only a minority of nomad calculations are part of a nomad dataset with a DOI. Here is one example: https://nomad-lab.eu/prod/v1/dcat/datasets/zzZhsOkL-rbZHLOLUZswRPqEp-Uw?format=turtle

Also note the different API prefix. This and all future changes will only be available for nomad v1 (https://nomad-lab.eu/prod/v1). The old version (https://nomad-lab.eu/prod/rae) is still running, though, and will probably keep running for a few more months. At some point it will be replaced by a simple redirect (301) to the new URL.
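
For readers following along, the calculation-to-DOI mapping described above can be sketched roughly as follows. This is only an illustration, not the actual nomad or CKAN code: the class names `Calculation` and `NomadDataset`, the functions `to_dcat` and `dois_for`, the dictionary keys, and the DOI `10.1234/example` are all hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class NomadDataset:
    """A user-curated nomad dataset; only some of these carry a DOI."""
    name: str
    doi: Optional[str] = None


@dataclass
class Calculation:
    """A nomad calculation, exposed as one dcat:Dataset."""
    calc_id: str
    datasets: List[NomadDataset] = field(default_factory=list)


def to_dcat(calc: Calculation) -> dict:
    """Map a calculation to a dcat:Dataset-like dict (hypothetical layout).

    Every nomad dataset with a DOI becomes one dcat:Distribution that uses
    the DOI as its identifier. The calculation itself gets no DOI, because
    it may belong to several DOI-bearing nomad datasets.
    """
    distributions = [
        {"type": "dcat:Distribution", "identifier": ds.doi}
        for ds in calc.datasets
        if ds.doi is not None
    ]
    return {
        "type": "dcat:Dataset",
        "id": calc.calc_id,
        "distributions": distributions,
    }


def dois_for(dcat_dataset: dict) -> List[str]:
    """Harvester side: collect the DOIs found on the distributions."""
    return [d["identifier"] for d in dcat_dataset["distributions"]]


# One calculation in a DOI-bearing dataset and in one without a DOI.
calc = Calculation(
    "calc-1",
    [NomadDataset("curated set", "10.1234/example"), NomadDataset("scratch")],
)
print(dois_for(to_dcat(calc)))  # ['10.1234/example']
```

A harvester reading this structure would have to expect zero, one, or several DOIs per dcat:Dataset, since most calculations are not part of any nomad dataset with a DOI.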

TBoonX commented 2 years ago

@markus1978 Thank you for the information and the code change! I will now use the new API. The DOI is already shown in our CKAN instance. I have a change request for the distributions:

If you want to see what it looks like in CKAN: https://stream-dataspace.net/dataset/brtati2 (note: this still uses the old NOMAD DCAT API)