soilwise-he / pycsw

an implementation of pycsw for soilwise
https://pycsw.org
MIT License

some records have no linkage, but id is a link #9

Open pvgenuchten opened 1 month ago

pvgenuchten commented 1 month ago

some records extracted from DOI and ESDAC have no linkage to a resource; however, the identifier itself is a link to the resource

in such cases, the UI should display the identifier as a link

however, the recent workaround stripped the http protocol from identifier links, so you can no longer detect a link by checking whether the identifier starts with http

see for example https://soilwise-he.containers.wur.nl/cat/collections/metadata:main/items/esdac.jrc.ec.europa.eu/content/Soil_erosion_by_wind
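One way the UI could still recognise link-like identifiers after the protocol has been stripped is sketched below. This is only a heuristic under stated assumptions: `identifier_to_url` and both patterns are hypothetical names, not part of pycsw, and the sketch assumes that a stripped protocol can be restored as `https://` and that bare DOIs resolve via `https://doi.org/`.

```python
import re

# Hypothetical helper, not part of pycsw: guess a clickable URL for an identifier.
# Bare DOI such as 10.1002/2016WR020175
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")
# Host name with at least one dot, optionally followed by a path
HOST_PATTERN = re.compile(r"^[a-z0-9-]+(\.[a-z0-9-]+)+(/\S*)?$", re.IGNORECASE)


def identifier_to_url(identifier):
    """Return a URL guessed from the identifier, or None if it does not look like a link."""
    ident = identifier.strip()
    # Identifier still carries its protocol: use it unchanged.
    if ident.lower().startswith(("http://", "https://")):
        return ident
    # Check DOIs first, because a DOI would also match the host pattern.
    if DOI_PATTERN.match(ident):
        return "https://doi.org/" + ident
    # Assumption: a stripped protocol can be restored as https.
    if HOST_PATTERN.match(ident):
        return "https://" + ident
    return None


print(identifier_to_url("esdac.jrc.ec.europa.eu/content/Soil_erosion_by_wind"))
print(identifier_to_url("10.1002/2016WR020175"))
```

The host pattern will also accept identifiers that merely contain a dot (a file name, for instance), so a check like this is a fallback for display rather than a replacement for real linkage in the record.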

pvgenuchten commented 1 week ago

this happens with records imported from cordis/openaire, the record has an identifier (typically a DOI), but no reference to the item itself

in Dublin Core, a typical approach is to add a link to the described item as a dcterms:references element (pycsw will pick it up if advertised in this way)

the suggestion is to also add the identifier as a dcterms:references element
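A minimal sketch of that suggestion, assuming the harvested record is a Dublin Core XML element and the identifier is a bare DOI. The function name `add_doi_reference` and the `<record>` wrapper are hypothetical and do not reflect the actual harvester code; the Dublin Core term used is `dcterms:references`.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
DCT_NS = "http://purl.org/dc/terms/"
ET.register_namespace("dc", DC_NS)
ET.register_namespace("dct", DCT_NS)


def add_doi_reference(record):
    """Append a dct:references element holding the DOI URL.

    Returns True if an element was added, False if the record already has
    a reference or its identifier does not look like a DOI.
    """
    if record.find(f"{{{DCT_NS}}}references") is not None:
        return False
    ident = record.findtext(f"{{{DC_NS}}}identifier", default="").strip()
    # Assumption: DOIs are stored without a resolver prefix, e.g. 10.1002/...
    if not ident.startswith("10."):
        return False
    ref = ET.SubElement(record, f"{{{DCT_NS}}}references")
    ref.text = "https://doi.org/" + ident
    return True


# Hypothetical record wrapper, for illustration only
record = ET.fromstring(
    '<record xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:identifier>10.1002/2016WR020175</dc:identifier>"
    "</record>"
)
add_doi_reference(record)
print(ET.tostring(record, encoding="unicode"))
```

Records that already carry a reference are left untouched, so the step can be re-run over previously harvested records without duplicating links.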

pvgenuchten commented 1 week ago

the DOI linkage has been added to the Cordis harvested records. @DajanaSnopkova, can you verify that this operates as expected?

DajanaSnopkova commented 1 week ago

I am not sure how to look only for Cordis records. But when I search by documents, for example, this record's link: https://soilwise-he.containers.wur.nl/cat/collections/metadata:main/items/10.1002/2016WR020175 shows "Exception item not found".

BerkvensNick commented 1 week ago

Hi @pvgenuchten, I tried to find documents from Cordis via the SPARQL query in the README of the Cordis folder in the harvester (https://github.com/soilwise-he/harvesters/tree/main/cordis):

PREFIX eurio: <http://data.europa.eu/s66#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dcterms: <http://purl.org/dc/terms/>
CONSTRUCT { ?result dcterms:title ?title }
WHERE {
  ?project a eurio:Project.
  ?project eurio:abstract ?abstract.
  ?project eurio:hasResult ?result.
  ?result rdf:type ?type.
  ?result eurio:doi ?doi.
  ?result eurio:title ?title.
  FILTER regex(?abstract, "Soil", "i")
  FILTER regex(?type, eurio:ProjectPublication)
}

I ran this at the endpoint: https://cordis.europa.eu/datalab/sparql-endpoint/en

when I try some of the articles from the retrieved records:

I can find a link for the records at the bottom of the detailed record:

several have a working doi, but a second link that doesn't seem to work:

and I found one without a working link:

BerkvensNick commented 1 week ago

randomly looking at other documents I find a link at the bottom of the detail of the record that doesn't seem to work. Does this mean they do not originate from Cordis and, if so, can we also add a link for these records?

when hitting the link I get the webpage: [image]

pvgenuchten commented 1 week ago

the problem you're pointing to is related to #10; we're going to solve it separately. But from your comments I understand that not all linkages from Cordis have been restored. Thanks for checking. I also noticed that quite a few ESDAC records have no linkage.