OpenTreeOfLife / opentree

Opentree browsing and curation web site. For overarching or cross-repo concerns, please see the 'germinator' repo.
http://tree.opentreeoflife.org/
BSD 2-Clause "Simplified" License

Use https: for DOI hyperlinks #1118

Closed jar398 closed 7 years ago

jar398 commented 7 years ago

See http://blog.crossref.org/2017/01/linking-dois-using-https-the-background-to-crossrefs-new-guidelines.html

Now remember that what someone sees on the web page, the hyperlink, and what we store in phylesystem are three different things. Not that I would object, but changing phylesystem seems painful, unnecessary, and possibly not compatible with software that reads the nexson. But the strings from the nexson can be converted from http: to https: before the HTML (or DOM) gets created.

Also, it looks like crossref, somewhat annoyingly, wants us to get rid of the 'dx' in the domain name: http://blog.crossref.org/2016/09/new-crossref-doi-display-guidelines.html The 'new best practice' is e.g. https://doi.org/10.1629/22161 The two changes to the hyperlink URL can be made at the same time.
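A minimal sketch of making both changes at once, assuming the stored string is an old-style `http://dx.doi.org/...` URL (or any other DOI resolver form) and that only the hyperlink target should change. The name `doi_href` is hypothetical and not taken from the opentree code base:

```python
import re

# Matches the resolver prefix in any of its four spellings:
# http://doi.org/, https://doi.org/, http://dx.doi.org/, https://dx.doi.org/
_DOI_PREFIX = re.compile(r'^https?://(?:dx\.)?doi\.org/', re.IGNORECASE)

def doi_href(stored_url):
    """Return the CrossRef-preferred link for a DOI URL.

    Hypothetical helper: URLs that do not start with a DOI resolver
    prefix are returned unchanged.
    """
    return _DOI_PREFIX.sub('https://doi.org/', stored_url, count=1)

print(doi_href('http://dx.doi.org/10.1629/22161'))
```

Because the rewrite happens only when the href is generated, the value in the nexson is never touched.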

jimallman commented 7 years ago

If I knew these recommendations were going to change so frequently, I would have advocated harder for storing bare DOIs. 😕 Since our old URIs will apparently be supported going forward, I'll do as you suggest and modify just the live hyperlinks in the app.

Not that I would object, but changing phylesystem seems painful, unnecessary, and possibly not compatible with software that reads the nexson.

If we're concerned about backward compatibility with nexson consumers, I can keep building and storing old-style URIs in phylesystem. This is weird, but probably keeps everyone happy. I'll make sure our duplicate-study tests focus on the "bare" DOI, regardless of the surrounding URI.

jimallman commented 7 years ago

I'll make sure our duplicate-study tests focus on the "bare" DOI, regardless of the surrounding URI.

As it turns out, the duplicate-study test is for an exact match of the value stored in phylesystem for ot:studyPublication (or in some cases, ot:dataDeposit). This means that in most cases, we're testing for the old-style CrossRef URI, so storing old and new would require a smarter/fuzzy query to oti/otindex, or two separate queries for the old and new CrossRef URIs if either is found in the current study.
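The "two separate queries" option could be sketched roughly as follows. This only builds the two candidate strings to match exactly; how they are passed to oti/otindex is not shown, and the function names here are hypothetical:

```python
import re

# Optional resolver prefix (or "doi:"), followed by the bare DOI.
_DOI_VALUE = re.compile(
    r'^(?:https?://(?:dx\.)?doi\.org/|doi:\s*)?(10\.\d{4,9}/\S+)$',
    re.IGNORECASE,
)

def bare_doi(value):
    """Return the bare DOI (e.g. '10.1629/22161'), or None if value is not a DOI."""
    match = _DOI_VALUE.match(value.strip())
    return match.group(1) if match else None

def candidate_uris(value):
    """Return the old-style and new-style URIs to test for an exact match.

    Non-DOI values are returned as a single-item list, unchanged.
    """
    doi = bare_doi(value)
    if doi is None:
        return [value]
    return ['http://dx.doi.org/' + doi, 'https://doi.org/' + doi]
```

A study would then count as a possible duplicate if either candidate matches a stored ot:studyPublication (or ot:dataDeposit) value.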

jar398 commented 7 years ago

Maybe I wasn't clear? I was suggesting no change to the representation in phylesystem. The only change would be to the value of the href= parameter in the HTML.

jar398 commented 7 years ago

There would have to be a new normalization case, to change http://doi.org/ and https://doi.org/ to http://dx.doi.org/, before the URL gets stored in phylesystem or in a feedback comment.
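As a rough illustration of that normalization case, assuming it runs on the curator's input just before storage (the function name is hypothetical, and only the two prefixes named above are handled):

```python
import re

# New-style resolver prefix, with either scheme.
_NEW_STYLE_PREFIX = re.compile(r'^https?://doi\.org/', re.IGNORECASE)

def normalize_for_storage(url):
    """Rewrite a new-style DOI URL to the old-style form used in phylesystem.

    Anything that does not start with http://doi.org/ or https://doi.org/
    is returned as-is (apart from trimming surrounding whitespace).
    """
    return _NEW_STYLE_PREFIX.sub('http://dx.doi.org/', url.strip(), count=1)

print(normalize_for_storage('https://doi.org/10.1629/22161'))
```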

jimallman commented 7 years ago

I was suggesting no change to the representation in phylesystem.

I understand. It just makes me itch, since we force DOIs to URLs in the editor, where a curator will see and approve them, but will then either store or display them differently. Which would you like the curator to see in the input widget?

(Also, it makes the code rather ugly, since I'll need to distinguish between our stored CrossRef URI and the displayed version. But this is manageable.)

jar398 commented 7 years ago

On Mar 2, 2017, at 3:40 PM, Jim Allman notifications@github.com wrote:

I was suggesting no change to the representation in phylesystem.

I understand. It just makes me itch, since we force DOIs to URLs in the editor, where a curator will see and approve them, but will then either store or display them differently. Which would you like the curator to see in the input widget?

Hmm. This is a rather small thing and I don’t want to overthink it. You are persuading me that maybe the only change should be in the coercion code on input, with everything else, including the display, left the same.

Users will see https://doi.org/ (modern form) replaced by http://dx.doi.org/ (archaic but will always be supported by IDF) on entry, and this may disorient a few of them, but they’ll get used to it. The main disadvantage would be that we’d fail the https-everywhere movement, and some people may be put at risk (e.g. a skeptical teenager in a creationist household, where the parents, or other authorities, do packet sniffing).

We would have had to have such a DOI-to-URL feature anyhow, if the DOI had been stored in a different field, or if it used a different syntax such as doi:10….

(Also, it makes the code rather ugly, since I'll need to distinguish between our stored CrossRef URI and the displayed version. But this is manageable.)

I was thinking you’d just have a single subroutine to be used at the last minute in any spot that displays URLs, and the knowledge of what href-URL to use for any given DOI-URL would be isolated there.

Making a pass over phylesystem seems unwise and not worth the effort, and allowing both syntaxes in the stored form seems really confusing to me.


jimallman commented 7 years ago

I was thinking you’d just have a single subroutine to be used at the last minute in any spot that displays URLs, and the knowledge of what href-URL to use for any given DOI-URL would be isolated there.

Yes, this is exactly what I'm adding now. As you say, no point in over-thinking this. Changing only displayed URLs is pretty simple and goof-proof, except for (as noted) some possible curator confusion. I've found a few outliers, like DOIs in the taxonomy-version release notes, but I'm going to leave those for possible manual updates since the old form will work.

jimallman commented 7 years ago

Good news! In testing my changes, I realized that a curator who submits an old-style CrossRef URI won't be surprised right away, as the curation app accepts any valid URL without changes. It's only later (in read-only views) that it will be updated to the new format, which seems gentle enough.