plazi / Biodiversity-Literature-Repository

covers the creation, maintenance and upload to the BLR
Treatment with article DOI: https://zenodo.org/record/4469666 #96

Open myrmoteras opened 3 years ago

myrmoteras commented 3 years ago

@gsautter This treatment https://zenodo.org/record/4469666 was given the paper's DOI https://doi.org/10.11646/zootaxa.4919.1.1 . Because of that, one of the failed transits for the paper in the Gatekeeper https://tb.plazi.org/GgServer/dtpStats/stats?outputFields=doc.updateDate+transits.result+problems.transDest+problems.probDescription+docDoc.name+docDoc.uploadUser+docDoc.uploadDate+docBib.source&groupingFields=doc.updateDate+transits.result+problems.transDest+problems.probDescription+docDoc.name+docDoc.uploadUser+docDoc.uploadDate+docBib.source&FP-transits.result=Failure&FP-docDoc.uploadUser=plazi%20plazi_server&FP-docDoc.uploadDate=%222021-01-26%22&format=HTML is caused by the DOI duplication in Zenodo (link, check paper zootaxa.4919.1.1.pdf).

D94BFFF8813AFFDE7016D56B781FFFE2

Can you please help @gsautter ?

flsimoes commented 3 years ago

Just a bit of extra info:

Inspecting the treatments for this paper, I noticed that all of them "tried" to attach to the paper's DOI: [image] [image]

For comparison, this is from some other paper: [image]

gsautter commented 3 years ago

OK, this is really weird ... I never saw anything like this happen before, as the treatment exporter does strip the article DOI from the bibliographic metadata of the article (which is of course preserved in the treatment) before using parts of it (authors, year, journal name, etc.) for the treatment metadata ... hard to tell how this might have happened, unless the DOI was manually added to the treatment at some point.
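To illustrate the stripping step described above, here is a minimal sketch. It is not the actual exporter code: the function name, the dict keys and the placeholder author are assumptions made for the example; only the DOI is taken from this issue.

```python
from copy import deepcopy


def treatment_metadata_from_article(article_meta: dict) -> dict:
    """Copy the article's bibliographic metadata for a treatment, minus the article DOI.

    Hypothetical helper: the key name "doi" is an assumption, not the exporter's schema.
    """
    treatment_meta = deepcopy(article_meta)
    # Drop the article's own DOI so the treatment deposition cannot claim it.
    treatment_meta.pop("doi", None)
    return treatment_meta


# Placeholder article metadata; only the DOI comes from this issue.
article = {
    "authors": ["A. Author"],
    "year": 2021,
    "journal": "Zootaxa",
    "doi": "10.11646/zootaxa.4919.1.1",
}
treatment = treatment_metadata_from_article(article)
```

The copy keeps authors, year and journal name for the treatment, while the article's own record stays untouched and still carries its DOI.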

Either way, I think @slint will have to remove the DOI from the deposition (or delete the deposition altogether), as I cannot do that through the API in any way, and we have to remove the DOI from the treatment on our end as well.

slint commented 3 years ago

@gsautter I've removed the Zenodo record completely, since we cannot easily just remove the DOI (and e.g. replace it with the one that would be minted by Zenodo).

slint commented 3 years ago

It should now be possible to create a new Zenodo record with the DOI 10.11646/zootaxa.4919.1.1 and upload the paper to it.

flsimoes commented 3 years ago

Thank you very much @slint ! @gsautter @myrmoteras before reuploading the file, is there any chance we can use the IMF we had (excluding the ID links) which I saved locally before deletion?

This way we would avoid having to redo all the work we did with the QC.

gsautter commented 3 years ago

With a little digging through the version history of http://treatment.plazi.org/GgServer/html/257287808139FFDD7081D15D7C46FACA, I found that this is where the DOI was added to the treatment:

Looks like this was definitely done manually, and I don't think any of our gizmos would do something to this effect (I just double-checked the HTML previewer, which is the only gizmo that emulates treatment extraction).

flsimoes commented 3 years ago

That's really weird. It looks like the parameter was added to every annotation in the paper. Even things like taxonomicNames have the ID-DOI on them.

myrmoteras commented 3 years ago

@flsimoes can we reconstruct the history of this document?

@gsautter do you know, who did the changes you list?

flsimoes commented 3 years ago

@flsimoes can we reconstruct the history of this document?

  • @myrmoteras uploaded the doc to server yesterday morning
  • batch processing
  • @diegojalvares did the QC (?); if so, can we reconstruct what has been done in terms of DOIs, if anything?
  • did anybody else touch the file?

@gsautter do you know, who did the changes you list?

There's an extra detail there actually:

gsautter commented 3 years ago

@slint thanks for cleaning up the DOI, looks like the article was uploaded successfully now (I see a link scheduled for write-back in the database).

I cleaned up the ID-DOI attributes of the treatments where required (some did have a Zenodo-issued DOI, most likely the ones that were uploaded right after batch processing), and I see the treatments exporting in the server console.
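For anyone needing a similar cleanup, a rough sketch of the idea follows. It is not the code that was run on the server: the sample XML, the element names `document` and `treatment`, the placeholder Zenodo DOI and the function name are assumptions. Only the `ID-DOI` attribute name, `taxonomicName` and the article DOI come from this thread. It removes `ID-DOI` wherever it equals the article DOI and leaves any other value (e.g. a Zenodo-issued DOI) in place.

```python
import xml.etree.ElementTree as ET

ARTICLE_DOI = "10.11646/zootaxa.4919.1.1"


def strip_article_doi(root: ET.Element, article_doi: str = ARTICLE_DOI) -> int:
    """Remove ID-DOI attributes equal to the article DOI; return how many were removed."""
    removed = 0
    for elem in root.iter():
        if elem is root:
            # Assumption: the document element itself may keep the article DOI.
            continue
        if elem.attrib.get("ID-DOI") == article_doi:
            del elem.attrib["ID-DOI"]
            removed += 1
    return removed


# Hypothetical sample; real documents will look different.
sample = ET.fromstring(
    '<document ID-DOI="10.11646/zootaxa.4919.1.1">'
    '<treatment ID-DOI="10.11646/zootaxa.4919.1.1">'
    '<taxonomicName ID-DOI="10.11646/zootaxa.4919.1.1"/>'
    '</treatment>'
    '<treatment ID-DOI="10.5281/zenodo.0000000"/>'
    '</document>'
)
removed = strip_article_doi(sample)
```

Comparing against the exact article DOI, rather than deleting every `ID-DOI`, is what keeps treatments that already received their own Zenodo DOI intact.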

gsautter commented 3 years ago

There's an extra detail there actually:

* new update was shipped (it had the metadata bug)
* @myrmoteras uploaded the doc to server yesterday morning
* all documents uploaded on the day were missing DOIs, but were held by the gatekeeper because of that (among other errors)
* @diegojalvares started the QC (a lengthy one, by the way), and he had to save the file a couple of times because GGI was crashing on his computer.
* In the meantime, a new update, fixing the bugs, was shipped
* The problem, according to the log, happened between versions 5 and 6
* Diego continued and finished the QC.
* I believe no one else touched the file

@flsimoes thanks for the details ... I don't think the update had anything to do with the article DOI being added to the treatments, especially since the update only concerned unrelated parts of GGI, and also because there is no gizmo in GGI that would add a DOI attribute to anything but the document proper - it's the server side write-back service that adds the DOIs (and related attributes) to captions and treatments.

Regarding GGI crashing on @diegojalvares machine, might it be possible I get a glimpse at the logs? Would be good to know what is going wrong there.

flsimoes commented 3 years ago

@flsimoes thanks for the details ... I don't think the update had anything to do with the article DOI being added to the treatments, especially since the update only concerned unrelated parts of GGI, and also because there is no gizmo in GGI that would add a DOI attribute to anything but the document proper - it's the server side write-back service that adds the DOIs (and related attributes) to captions and treatments.

Sure! I didn't claim this was the cause. Just felt this was an important detail nonetheless.

Regarding GGI crashing on @diegojalvares machine, might it be possible I get a glimpse at the logs? Would be good to know what is going wrong there.

I'll see if we can recover anything, but I believe it has more to do with his PC's RAM than with GGI itself, especially when documents are as heavy as these monographs.

gsautter commented 3 years ago

I'll see if we can recover anything, but I believe it has more to do with his PC's RAM than with GGI itself, especially when documents are as heavy as these monographs.

In any case, you might want to clear his server interchange IMF cache to make sure to get rid of any compromised data ... located at <GGI>/Configurations/Default.image/Plugins/ImsDocumentIOData/cache ... just delete anything in there, it'll be recreated when starting up GGI afterwards.
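If doing this by hand is inconvenient, the cache could also be emptied with a small script along these lines. This is only a sketch: the function name is made up, and `ggi_root` stands for whatever `<GGI>` is on the machine in question. GGI should be closed while it runs.

```python
import shutil
from pathlib import Path

# Cache location relative to the GGI installation folder, as given above.
CACHE_SUBPATH = Path("Configurations/Default.image/Plugins/ImsDocumentIOData/cache")


def clear_imf_cache(ggi_root) -> int:
    """Delete everything inside the cache folder, keeping the folder itself.

    Returns the number of top-level entries removed (0 if the folder is absent).
    """
    cache_dir = Path(ggi_root) / CACHE_SUBPATH
    if not cache_dir.is_dir():
        return 0
    removed = 0
    for entry in cache_dir.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # remove a cached subfolder and its contents
        else:
            entry.unlink()  # remove a cached file or symlink
        removed += 1
    return removed
```

Call it as `clear_imf_cache("<GGI>")` with the real installation path substituted.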