plazi / Biodiversity-Literature-Repository

Covers the creation, maintenance, and upload of content to the BLR

missing figures in metadata #56

Closed myrmoteras closed 5 years ago

myrmoteras commented 5 years ago

Why does this EJT-557 record have no figures in its related items? (There are more than 80.)

https://zenodo.org/record/3467434#.XZNysEYzbAS

gsautter commented 5 years ago

That's the other side of the glitch mentioned in #55 ... since yesterday, the glitch (fixed now) caused the numbers of the figure depositions to be associated with the document (PDF) deposition rather than with the figure captions they really belong to.

This is fixed now, and I'm in the process of cleaning up the 32 affected documents (see #55).

gsautter commented 5 years ago

Got it ... the update that links the PDF deposition to the figure depositions gets blocked because it also attempts to add the isSourceOf relationship to the corresponding GBIF dataset. The latter is actually the intended behavior, and was the whole point of centralizing the write-through of external links to documents and then also to their corresponding Zenodo depositions. The problem is that both Zenodo systems (sandbox and live) reject isSourceOf as an invalid relationship (see also yesterday's mail exchange with Alex).

This one is for @slint to resolve; I will fire off the updates as soon as he gives me the green light.

myrmoteras commented 5 years ago

@gsautter so what should we do? All the new uploads are now missing the links to the figures.

myrmoteras commented 5 years ago

@slint might it be possible to look into this isSourceOf issue?

slint commented 5 years ago

We're deploying the fix for that today. I'll notify @gsautter via email when it goes onto the live system.

gsautter commented 5 years ago

If the fix doesn't come out in time, I can filter the GBIF dataset ID for now (temporarily exclude it from the deposition metadata) and add it once it works. Up to @slint to tell me whether to do that or to wait for the fix.
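
A minimal sketch of such a temporary filter, assuming the links live in a list of `related_identifiers` entries with `identifier` and `relation` keys. The function name, the `REJECTED_RELATIONS` set, and both identifiers are made up for illustration and are not taken from the actual uploader code.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Assumption: relations the server currently rejects. "isSourceOf" is the
# only one named in this thread.
REJECTED_RELATIONS: Set[str] = {"isSourceOf"}


def split_related_identifiers(
    related: Iterable[Dict[str, str]],
    rejected: Set[str] = REJECTED_RELATIONS,
) -> Tuple[List[Dict[str, str]], List[Dict[str, str]]]:
    """Split entries into (send now, hold back until the fix is live)."""
    accepted: List[Dict[str, str]] = []
    deferred: List[Dict[str, str]] = []
    for entry in related:
        if entry.get("relation") in rejected:
            deferred.append(entry)
        else:
            accepted.append(entry)
    return accepted, deferred


# Placeholder identifiers, for illustration only.
related = [
    {"identifier": "10.5281/zenodo.0000001", "relation": "hasPart"},
    {"identifier": "example-gbif-dataset-id", "relation": "isSourceOf"},
]
accepted, deferred = split_related_identifiers(related)
```

Keeping the held-back entries in `deferred` rather than dropping them means they can be added back in a later update once the relation is accepted.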

gsautter commented 5 years ago

The hazards of catching up in chronological order ... only found @slint's green light mail now. Updates are underway, should be done within the hour.

gsautter commented 5 years ago

@slint is there a way of updating metadata more quickly?

slint commented 5 years ago

@gsautter Unfortunately, updating a record requires 3 REST API calls at the moment (edit, update, publish), and they cannot be "batched" on a record or multi-record level.
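
For reference, the three calls can be sketched as a plan of requests. The endpoint paths below follow the public Zenodo deposit REST API as I understand it and should be checked against the current documentation. The function name and the deposition ID are placeholders, and authentication and the actual HTTP client are left out on purpose.

```python
from typing import Any, Dict, List, Optional, Tuple

# Assumption: base URL of the live deposit API; the sandbox uses a different host.
BASE_URL = "https://zenodo.org/api/deposit/depositions"

# One planned request: (HTTP method, URL, optional JSON body).
Step = Tuple[str, str, Optional[Dict[str, Any]]]


def plan_metadata_update(
    deposition_id: int,
    metadata: Dict[str, Any],
    base_url: str = BASE_URL,
) -> List[Step]:
    """List the edit, update, and publish requests for one published record."""
    record_url = f"{base_url}/{deposition_id}"
    return [
        ("POST", f"{record_url}/actions/edit", None),        # unlock the record
        ("PUT", record_url, {"metadata": metadata}),         # send new metadata
        ("POST", f"{record_url}/actions/publish", None),     # publish again
    ]


# Placeholder ID and metadata, for illustration only.
steps = plan_metadata_update(1234567, {"title": "Example article"})
for method, url, _body in steps:
    print(method, url)
```

Each record needs its own three requests in this order, which is why a bulk update over many depositions takes a while.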

gsautter commented 5 years ago

I get the update part, will try and speed it up on my end as much as I can (only ever re-open a deposition if there is an actual update, etc.).
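
A rough sketch of that check, assuming the current and the desired metadata are both available as plain dictionaries. `changed_fields` is a hypothetical helper, the sample values are placeholders, and lists are compared in order, so a reordered but otherwise identical list would still count as a change.

```python
from typing import Any, Dict


def changed_fields(current: Dict[str, Any], desired: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the desired fields whose values differ from the current ones."""
    return {key: value for key, value in desired.items() if current.get(key) != value}


# Placeholder metadata, for illustration only.
current = {"title": "Example article", "related_identifiers": []}
desired = {
    "title": "Example article",
    "related_identifiers": [
        {"identifier": "10.5281/zenodo.0000001", "relation": "hasPart"},
    ],
}

delta = changed_fields(current, desired)
if delta:
    print("re-open and update:", sorted(delta))
else:
    print("skip, nothing changed")
```

An empty result means the deposition can be left alone, which saves all three calls for that record.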

The question is, however, whether there is a way of overcoming the problems you explained in your mail ... increase the connection pool? Use a secondary database (and replication) as the basis for indexing? Defer indexing if database connections run low?

myrmoteras commented 5 years ago

@gsautter the figures are not yet included in the article metadata. https://zenodo.org/record/3474305#.XZpRgEYzZaQ Does this have an effect on the taxpub output @tcatapano will create from the GG XML?

Do you have an idea when this will be fixed? We should finish all of this year's EJT by Oct 9 and would therefore appreciate it if this fix could be made.

gsautter commented 5 years ago

The figures are there, as DOIs ... right below the treatment HTTP URIs: [image]