Closed SebastinSanty closed 6 years ago
I don't think it is actually possible to get a download URL out of DataCite. I kind of knew that going in. This of course also means integration tests are not possible, since resolving the URL correctly is most of what we test with those.
Take a look at https://github.com/datacite/freya/issues/2 where @mfenner is talking about providing it via content negotiation for "application/zip", but it is not done yet. (I believe DataCite allows content negotiation via URL as well as via header, which is nice.)
Right now I think our best option is to provide 95% of the registration block, i.e. everything apart from the URL and checksum, then let the user go to the website (the DOI's landing page), find the link manually, and edit the generated code.
Editing the generated code is already part of our normal usage anyway, as users likely want to change the datadep name and probably edit the message.
Hmm, what is actually going on with Figshare? It looks like they do use DataCite-generated DOIs (see https://stats.datacite.org/?fq=allocator_facet%3A%22FIGSHARE+-+figshare%22&#tab-datacentres).
And the following works: https://api.datacite.org/works/10.6084/m9.figshare.5350216.v1
What does not is:
That one is associated with the DOI 10.1371/journal.pone.0047999.t004, which resolves to http://journals.plos.org/plosone/article/figure?id=10.1371/journal.pone.0047999.t004, the same table but on a different site.
So I am guessing Figshare rehosted that existing data with its existing DOI. It was therefore never issued a DataCite DOI, which means it does not work with the DataCite API.
CrossRef issued that DOI. Their API is not so great for this: https://api.crossref.org/v1/works/10.1371/journal.pone.0047999.t004
I don't think we can content negotiate anything better. See https://citation.crosscite.org/docs.html; I tried a few.
It might be nice to support DOIs in general via the content-negotiation method. But the things you can get out of any of the providers except DataCite seem less useful. (Surprising really, since we're only getting basic metadata. So maybe it is just this one entry (10.1371/journal.pone.0047999.t004) that has poor metadata.)
So, writing down what I understood and plan to implement based on your points. Please correct me if I am wrong:

- Use `application/vnd.citationstyles.csl+json` as the `Accept:` header for the GET request we make. This returns JSON, which we already have provision to parse. There is no single XML format common to all of them (CrossCite, DataCite, mEDRA). There is RDF/XML, but I would rather not use it because that would be another pain/overhead.
- Read the `source` attribute from the result. Accordingly, send a request to the source's API and get the final results for creating the register block.

I faced an issue though: I tried doing content negotiation as described above, but unfortunately the results that came back for DataCite didn't contain the `source` attribute. For CrossRef it works properly.
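The plan above can be sketched roughly as follows. This is a minimal Python illustration, not the package's Julia code: the helper names `build_doi_request` and `metadata_source` are hypothetical, and resolving through `https://doi.org/` is an assumption about where the content-negotiation request is sent. The sketch only builds the request and reads `source` from a CSL JSON body; it treats a missing `source` as `None`, which is the case reported above for DataCite results.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Media type named in the plan above.
CSL_JSON = "application/vnd.citationstyles.csl+json"


def build_doi_request(doi):
    """Build (but do not send) a content-negotiation GET request for a DOI.

    Assumption: the DOI is resolved through https://doi.org/.
    """
    url = "https://doi.org/" + quote(doi, safe="/")
    return Request(url, headers={"Accept": CSL_JSON})


def metadata_source(csl_text):
    """Return the `source` attribute of a CSL JSON body, or None if absent."""
    return json.loads(csl_text).get("source")
```

Sending the request would be `urllib.request.urlopen(build_doi_request(doi))`; the caller would then pick the follow-up API based on `metadata_source(...)`, and fall back to something else when it returns `None`.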
Good idea checking. I seem to have misled you.
What I had in mind was a `DOI <: DataRepo` generator, separate from the `DataCite <: DataRepo` you've made here.
Just as we have many ways to generate for DataDryad (`DataCite`, `DataDryad`, `DataDryadWeb`), a `DOI` generator would be an alternative.
If it is a good one (which I think it can be), it could mean that we delete the current `DataCite` generator just to save on maintenance.
The goal of this PR #28 is to add DataCite support, and it has done that successfully (well, no URL; I suspected that wasn't going to be possible). Once you fix up the few small things discussed in the review, this should be good to merge.
I'd like to see full support for Figshare and DataVerse. OAI-PMH is one path that might do it (though I suspect it also won't let us actually get download URLs).
BTW: cross-negotiate isn't a term that I am familiar with. I think you mean content negotiate.
For this PR, something I think I missed in the code review before:
it should display some kind of `info("DataCite based generation can only generate partial registration blocks, as DataCite metadata does not (currently) include the URL to the resource. You will have to edit in the URL after generation.")`.
And it should probably put something like `"PUT DOWNLOAD URL HERE"` in place of the URL.
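As a rough illustration of the two review points above, here is a Python sketch that emits a partial registration block with the placeholder in the URL position and keeps the notice text alongside it. It is not the generator's actual Julia implementation: `partial_registration_block`, `julia_string`, and the exact layout of the emitted `register(DataDep(...))` text are assumptions made for this example, and the checksum is simply left out because it cannot be computed without the URL.

```python
import json

# Placeholder suggested in the review comment above.
URL_PLACEHOLDER = "PUT DOWNLOAD URL HERE"

# Notice text taken from the review comment above.
NOTICE = (
    "DataCite based generation can only generate partial registration blocks, "
    "as DataCite metadata does not (currently) include the URL to the resource. "
    "You will have to edit in the URL after generation."
)


def julia_string(text):
    """Quote text as a Julia string literal (hypothetical helper).

    JSON escaping covers quotes and newlines; `$` is escaped as well so
    Julia does not treat it as string interpolation.
    """
    return json.dumps(text).replace("$", "\\$")


def partial_registration_block(name, message):
    """Return registration-block text with a placeholder instead of the URL."""
    return (
        "register(DataDep(\n"
        f"    {julia_string(name)},\n"
        f"    {julia_string(message)},\n"
        f"    {julia_string(URL_PLACEHOLDER)},\n"
        "))"
    )
```

A caller would show `NOTICE` to the user and then print the block, so the need for a manual edit is visible both in the log and in the generated code itself.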
Looks like the test failures are something to do with the GitHub generator breaking.
It's good that I asked before implementing it in the PR; it saved the work of removing it later.
Need to merge #31 before this.
@oxinabox Ready to be merged if you don't have any further review comments.
Merging #28 into master will increase coverage by 0.15%. The diff coverage is 95.65%.
@@ Coverage Diff @@
## master #28 +/- ##
==========================================
+ Coverage 93.93% 94.09% +0.15%
==========================================
Files 13 14 +1
Lines 231 254 +23
==========================================
+ Hits 217 239 +22
- Misses 14 15 +1
Impacted Files | Coverage Δ | |
---|---|---|
src/DataDepsGenerators.jl | 94.28% <100%> (+0.53%) | :arrow_up: |
src/DataCite.jl | 95% <95%> (ø) | |
Legend: `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
Last update a004321...b6ca9d8.
Integration tests to be added after your first review. Secondly, I am not able to get the URLs. Do you have an idea how to get them? There are some hints regarding `resource-type` etc.