Closed albangaignard closed 6 years ago
This is a known issue (see https://github.com/bio-tools/biotoolsRegistry/issues/281) but perhaps you find other issues here, too. @hansioan will investigate.
@joncison @albangaignard
I've looked at the DOI links and, with the exception of the links that had whitespace added by mistake, e.g.:
https://dx.doi.org/ 10.1186/1471-2164-16-S6-S2
which I've now fixed, all the other links actually work and forward to a publication. I don't know what to say... I think it's an RDF issue... these DOI links are what they are...
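For reference, the whitespace fix described above can be sketched in a few lines of Python. This is only an illustration, not the script actually used on the registry; the helper name `strip_whitespace` is hypothetical.

```python
import re


def strip_whitespace(doi_link: str) -> str:
    """Remove every whitespace character from a DOI link (hypothetical helper)."""
    return re.sub(r"\s+", "", doi_link)


# The broken link quoted above, with a stray space after the resolver prefix.
broken = "https://dx.doi.org/ 10.1186/1471-2164-16-S6-S2"
print(strip_whitespace(broken))
# prints https://dx.doi.org/10.1186/1471-2164-16-S6-S2
```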
I second this. @albangaignard, can you pls. check your diagnostics / clarify what the issue is here, e.g. with https://dx.doi.org/10.1534/genetics.112.144204
which resolves just fine?
It seems to be an error produced by the Python RDF library. Still, a few of them seem to be problematic, e.g.:
I'm investigating this.
Thanks, once you're done pls. paste a short-list here of problem cases for @hansioan to fix.
@albangaignard You were right about the first one, but this one: https://dx.doi.org/10.1002/1097-0134(20001101)41:2<224::AID-PROT70>3.0.CO;2-Z actually works.
You're right, the second one is ok.
The RDF serializer is not happy with "<" and ">", since these characters are used in Turtle triples to delimit URIs, e.g. <http://node1> <http://hasName> "a name" .
A workaround could be to consider these URIs as RDF literals (string values). However, it would "break" the use of DOIs as RDF nodes.
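Another possible workaround, not tried in this thread, would be to percent-encode the offending characters before building the URI, so the DOI can stay an RDF node. A minimal sketch using only the standard library follows. The helper name `doi_to_uri` and the choice of characters left unencoded are assumptions, and it should be checked that the resolver accepts the encoded form before relying on it.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://dx.doi.org/"


def doi_to_uri(doi: str) -> str:
    """Build a resolver URI from a bare DOI (hypothetical helper).

    Characters listed in `safe` are kept as-is for readability;
    "<" and ">" are not in that list, so they become %3C and %3E.
    """
    return DOI_RESOLVER + quote(doi.strip(), safe="/:;()")


print(doi_to_uri("10.1002/1097-0134(20001101)41:2<224::AID-PROT70>3.0.CO;2-Z"))
# prints https://dx.doi.org/10.1002/1097-0134(20001101)41:2%3C224::AID-PROT70%3E3.0.CO;2-Z
```

With this approach, the encoded string no longer contains the Turtle URI delimiters, at the cost of the URI no longer being a character-for-character copy of the DOI.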
This is the only issue I've seen while processing the first 10k entries.
I found this thread interesting https://stackoverflow.com/questions/1547899/which-characters-make-a-url-invalid
From http://www.ietf.org/rfc/rfc1738.txt (URLs)
The characters "<" and ">" are unsafe because they are used as the delimiters around URLs in free text
"<" and ">" are excluded in http://www.ietf.org/rfc/rfc2396.txt (URIs)
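To produce a short-list of problem cases, one option is to scan the generated links for characters that cannot appear between "<" and ">" in Turtle. The sketch below is an assumption about how such a check could look, not the diagnostic actually used here. The character set follows the Turtle IRI reference grammar (control characters and space, plus < > " { } | ^ ` \), and the function name `problem_characters` is hypothetical.

```python
import re

# Characters the Turtle grammar forbids inside <...> IRI references:
# U+0000 to U+0020 (control characters and space) plus < > " { } | ^ ` \
TURTLE_FORBIDDEN = re.compile(r'[\x00-\x20<>"{}|^`\\]')


def problem_characters(uri):
    """Return the sorted, de-duplicated forbidden characters found in uri."""
    return sorted(set(TURTLE_FORBIDDEN.findall(uri)))


links = [
    "https://dx.doi.org/10.1534/genetics.112.144204",
    "https://dx.doi.org/ 10.1186/1471-2164-16-S6-S2",
    "https://dx.doi.org/10.1002/1097-0134(20001101)41:2<224::AID-PROT70>3.0.CO;2-Z",
]

# Print only the links that would upset a Turtle serializer.
for link in links:
    found = problem_characters(link)
    if found:
        print(link, found)
```

A link can resolve fine in a browser and still be reported here, which matches what was observed above for the DOI containing "<" and ">".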
I'm closing this now, @albangaignard, but reopen it if you find other problems.
When producing RDF from biotools content, I tried to prefix all the DOIs with https://dx.doi.org/ so that papers can be dereferenced.
Some of the DOIs can't be directly transformed into URIs: