Closed karstenpeters closed 3 years ago
Karsten,
Thank you very much for your report; we'll certainly try to address these things. In the meantime, here are some 'answers' to the reasons some of these may be failing.
1) @id field seems like it should be valid as well.
2) isAccessibleForFree property, so this will work shortly. *
3) contact, but instead it should probably be looking for a ContactPoint. It seems clear that certain information attached to an Organization should also be suitable for this metric. However, contact information on a webpage is not something that can be automatically asserted; the automated assessments are currently restricted to machine-readable contracts. This webpage might make sense attached to a ContactPoint (i.e. { "@type": "ContactPoint", "url": ... }).
4) citation requirement in the presence of a DOI, and may investigate e.g. Google Dataset Search to see if the lack of citation has implications for Findability.
* I suspect the ones that should be working but aren't are failing because of the @context being a list, but this is valid json-ld syntax so this will certainly be addressed.
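To make the suspected failure mode concrete, here is a minimal Python sketch of a check that accepts every shape json-ld allows for @context: a single string, an object, or a list mixing both. This is not FAIRshake's actual code; the function name context_includes_schema_org is hypothetical, and the substring test on "schema.org" is a deliberate simplification.

```python
def context_includes_schema_org(doc):
    """Return True if the JSON-LD @context appears to reference schema.org.

    Hypothetical helper for illustration only. @context may be a string
    (an IRI), a dict (term definitions), or a list containing either.
    """
    context = doc.get("@context")
    # Normalise to a list so every shape goes through the same loop.
    entries = context if isinstance(context, list) else [context]
    for entry in entries:
        # Plain IRI, e.g. "https://schema.org/"
        if isinstance(entry, str) and "schema.org" in entry:
            return True
        # Term definitions, e.g. {"@vocab": "https://schema.org/"}
        if isinstance(entry, dict):
            if any(isinstance(v, str) and "schema.org" in v
                   for v in entry.values()):
                return True
    return False
```

A check that only compares @context against a single string would reject the list form, even though the document is valid.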
Update:
1) @id
2) @context being a list
3) ContactPoint.url to address
4) citation, further review is necessary before choosing to allow dois
5) @context being a list
With no further action items, I'll close this issue. Do let us know if you have any further questions.
Hi,
while doing a number of tests on datasets hosted in our repository (WDCC, World Data Center for Climate), I have come across some issues regarding the automatic parts of the evaluation. I am using the "FAIRshake dataset rubric" for assessment.
I will take the following dataset as an example:
https://cera-www.dkrz.de/WDCC/ui/cerasearch/entry?acronym=ssp585_r2i1p1f1-eh6_rcm_c6
The json-ld part is seen here:
https://cera-www.dkrz.de/WDCC/ui/cerasearch/entry?acronym=ssp585_r2i1p1f1-eh6_rcm_c6&exporttype=json-ld
The json-ld metadata comply with the schema.org standard.
1) Dataset identification: The json-ld contains ample information regarding the identification of the dataset. However, the first test fails with the message "json-ld WebSite.identifier DataCatalog.identifier Dataset.identifier not found"
2) Dataset access: Access to datasets hosted in WDCC is free of charge. We do however require authentication in order to access our datasets, which means that we do not provide URLs to the datasets in the json-ld for practical purposes (some of the datasets are >100TB in size). However, the json-ld metadata do contain the information "isAccessibleForFree": true. Nevertheless, the FAIRshake test for "The dataset can be downloaded for free from the repository" fails. Is it strictly required for FAIRshake to actually be able to download the data, or would the information we provide in the json-ld also suffice to automatically pass that test?
3) The json-ld contains a long list of dataset creators (albeit with no email attached), and the landing page also contains a "contacts" tab. However, the FAIRshake test "Contact information is provided for the creator(s) of the dataset." does not automatically recognise this information.
4) We clearly provide citation information on the landing page of the dataset, although the citation only refers to the parent project. So in this case, the dataset can only be cited as part of a collection - which is a viable approach. This information is also somewhat contained in the json-ld: "isPartOf": [ { "@type": "Dataset", "@id": "https://doi.org/10.26050/WDCC/RCM_CMIP6_SSP585-HR_r2i1p1f1", "name": "CMIP6 ScenarioMIP DWD MPI-ESM1-2-HR ssp585_r2i1p1f1 - RCM-forcing data" } ]. However, the FAIRshake test "Information is provided describing how to cite the dataset." does not recognise this information.
5) Licensing: We also provide licensing information on the landing page of the dataset. On the landing page, this is called "use constraints", but it refers to CC-BY 4.0. In the json-ld, the syntax is as follows: "license": "https://creativecommons.org/licenses/by/4.0/". However, the FAIRshake test "Licensing information is provided on the datasets landing page." does not detect any licensing information and yields "No" as result.
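For reference, the json-ld fields named in points 1), 2), 4) and 5) can be read with a few lines of Python. This is only a sketch of the kind of lookup that would find them, not a description of how FAIRshake works. The function name summarise_jsonld and the keys of the returned dict are made up for illustration, the sample document is abridged to the values quoted above, and its top-level "@id" is a placeholder rather than the real value.

```python
def summarise_jsonld(doc):
    """Collect the schema.org fields discussed in points 1), 2), 4) and 5)."""
    def as_list(value):
        # schema.org properties may hold a single value or a list of values.
        if value is None:
            return []
        return value if isinstance(value, list) else [value]

    parents = [p for p in as_list(doc.get("isPartOf")) if isinstance(p, dict)]
    return {
        # 1) fall back to @id when no explicit identifier is given
        "identifier": doc.get("identifier") or doc.get("@id"),
        # 2) a declared flag only, not proof that a download works
        "free_access": doc.get("isAccessibleForFree") is True,
        # 4) DOIs of parent collections that could be used for citation
        "parent_dois": [p["@id"] for p in parents
                        if str(p.get("@id", "")).startswith("https://doi.org/")],
        # 5) license URL (or object), if present
        "license": doc.get("license"),
    }


sample = {
    "@type": "Dataset",
    "@id": "https://example.org/placeholder-id",  # placeholder, not the real @id
    "isAccessibleForFree": True,
    "isPartOf": [{
        "@type": "Dataset",
        "@id": "https://doi.org/10.26050/WDCC/RCM_CMIP6_SSP585-HR_r2i1p1f1",
        "name": "CMIP6 ScenarioMIP DWD MPI-ESM1-2-HR ssp585_r2i1p1f1 - RCM-forcing data",
    }],
    "license": "https://creativecommons.org/licenses/by/4.0/",
}

print(summarise_jsonld(sample))
```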
The good thing about FAIRshake is that all the answers which were not properly filled in by the algorithm can still be amended manually. But it would of course be more attractive if FAIRshake recognised the information provided on both the landing page and the json-ld.
I hope this may help in improving FAIRshake.
Thanks very much, Karsten