plazi / Biodiversity-Literature-Repository

Covers the creation, maintenance and upload to the BLR

query images of a particular journal #25

Open myrmoteras opened 6 years ago

myrmoteras commented 6 years ago

@gsautter here is a question that keeps coming up: can you provide a list of all the images from a particular journal in BLR?

Right now, this is not possible without a more complex search. The reason is that the source journal is not listed anywhere in the image deposit. Do you have an idea how we can make this happen?

For example, for EJT we should be able to provide a search that returns all the EJT images, ideally refined further, e.g. to a particular year.

myrmoteras commented 6 years ago

Such a query, which extends beyond one record (e.g. images that are related identifiers of a record which has "journal" specified), is probably not possible as a single query (at least I'm not aware of a way).

Do you want to have this result for the image visualization or for searching for "duplicate" images?

In the latter case, what you can do is get those journal records from the REST API programmatically:

https://zenodo.org/api/records/?page=1&size=200&q=(journal.title:%20%22Candollea%22)%20OR%20(journal.title:%20%22European%20Journal%20of%20Taxonomy%22)

And then go through records one-by-one and further get the image from related identifiers.

This is assuming that the related identifier link from paper -> image exists.

On our end when we have an issue like that, i.e. a lot of somewhat similar records that need to be fixed, we usually develop a custom script or CLI tool and execute it on the data internally.

In your case my first suggestion - fetch all journal papers and then traverse them to get the related image record URLs - might be the easiest.
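A minimal sketch of that traversal in Python, using only the standard library. It assumes the response JSON lists records under `hits.hits` and that each record carries `metadata.related_identifiers` entries with `identifier` and `relation` keys; the function names and the `max_pages` cap are made up for illustration, and the relation value that actually marks a paper -> image link would need to be checked against real records.

```python
import json
import urllib.parse
import urllib.request

ZENODO_RECORDS_URL = "https://zenodo.org/api/records/"


def journal_query(titles):
    """Build a query like the one in the URL above, OR-ing the journal titles."""
    return " OR ".join('(journal.title: "{}")'.format(t) for t in titles)


def related_identifiers(record, relation=None):
    """Return the related identifiers of one record dict.

    Assumes the layout metadata.related_identifiers = [{"identifier": ..., "relation": ...}].
    If `relation` is given, only entries with that relation are kept.
    """
    related = record.get("metadata", {}).get("related_identifiers", [])
    return [
        entry["identifier"]
        for entry in related
        if "identifier" in entry
        and (relation is None or entry.get("relation") == relation)
    ]


def fetch_page(titles, page=1, size=200):
    """Fetch one page of journal records and return the list of record dicts."""
    params = urllib.parse.urlencode(
        {"page": page, "size": size, "q": journal_query(titles)}
    )
    with urllib.request.urlopen(ZENODO_RECORDS_URL + "?" + params) as response:
        payload = json.load(response)
    return payload.get("hits", {}).get("hits", [])


def collect_related(titles, relation=None, max_pages=5):
    """Walk the journal records page by page and gather their related identifiers."""
    links = []
    for page in range(1, max_pages + 1):
        records = fetch_page(titles, page=page)
        if not records:
            break  # no more results
        for record in records:
            links.extend(related_identifiers(record, relation=relation))
    return links
```

Calling `collect_related(["European Journal of Taxonomy"])` would then return every related identifier of the matching papers; narrowing that list to figures still depends on the paper -> image links being present.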

gsautter commented 6 years ago

Listing all the figure depositions from a particular journal in a straightforward fashion might turn out a bit difficult, as image metadata (afaik) doesn't take a journal name ... Afraid there has to be a join to the host publication and a filter on the journal name in the latter. Not sure whether or not the Zenodo API supports such queries ... Lars and Tim should know more here, either way.

However, there might be a way via our own stats (http://tb.plazi.org/GgServer/dioStats) ... we could very well add a field set for figures, which would then facilitate the aforementioned join on our own database, rather than using the Zenodo API.
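To make the join concrete, here is a rough sketch with an in-memory SQLite database. The table and column names (`publications`, `figures`, `doc_id`, `fig_id`) and the sample rows are purely hypothetical and do not reflect the actual stats schema; the point is only the shape of the query: join each figure to its host publication, then filter on the journal name (and optionally the year).

```python
import sqlite3


def build_demo_db():
    """Create a throwaway in-memory database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE publications (
            doc_id  TEXT PRIMARY KEY,
            journal TEXT,
            year    INTEGER
        );
        CREATE TABLE figures (
            fig_id TEXT PRIMARY KEY,
            doc_id TEXT REFERENCES publications(doc_id)
        );
        """
    )
    # Placeholder rows, invented for the example.
    conn.executemany(
        "INSERT INTO publications VALUES (?, ?, ?)",
        [
            ("pub-1", "European Journal of Taxonomy", 2017),
            ("pub-2", "European Journal of Taxonomy", 2018),
            ("pub-3", "Candollea", 2017),
        ],
    )
    conn.executemany(
        "INSERT INTO figures VALUES (?, ?)",
        [("fig-1", "pub-1"), ("fig-2", "pub-1"), ("fig-3", "pub-2"), ("fig-4", "pub-3")],
    )
    return conn


def figures_for_journal(conn, journal, year=None):
    """Join figures to their host publication and filter on the journal name."""
    sql = (
        "SELECT f.fig_id FROM figures f "
        "JOIN publications p ON p.doc_id = f.doc_id "
        "WHERE p.journal = ?"
    )
    params = [journal]
    if year is not None:
        sql += " AND p.year = ?"
        params.append(year)
    sql += " ORDER BY f.fig_id"
    return [row[0] for row in conn.execute(sql, params)]


conn = build_demo_db()
print(figures_for_journal(conn, "European Journal of Taxonomy"))
print(figures_for_journal(conn, "European Journal of Taxonomy", year=2018))
```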

myrmoteras commented 6 years ago

We should consider this, since this is of course what a publisher would like to see, especially if he pays us to deposit his stuff. tbd in Bern

gsautter commented 6 years ago

I guess you're talking about the stats?

gsautter commented 6 years ago

Which fields would you like to have for figures? I'd at least record the following, please extend:

myrmoteras commented 6 years ago

Stats most likely. We haven't had feedback from Lars yet. Alternatively, could we add the journal name, issue, and page number to the journal section of the image deposit's metadata?

gsautter commented 6 years ago

Not sure an image deposition takes a journal name ...

myrmoteras commented 6 years ago

When you open the figure deposit for editing, you have the option to add journal data.

gsautter commented 6 years ago

OK, I can surely add these attributes to the figure depositions. Maybe the option wasn't available when we built the uploader ...

gsautter commented 6 years ago

The journal name (for books, the publisher) will be included in figure metadata after the next server restart. That is for all depositions that get created or updated thereafter.

gsautter commented 6 years ago

Still not sure, however, how to query this on the Zenodo API. Should I still create the stats table?
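One possibility, once figure depositions carry the journal name, would be to reuse the `journal.title` search from the URL earlier in this thread and restrict it to images. The sketch below only builds the request URL. The `resource_type.type` and `publication_date` field names are assumptions about the Zenodo search syntax and should be verified before relying on them, and `image_query_url` is a made-up helper name.

```python
import urllib.parse


def image_query_url(journal_title, year=None, page=1, size=200):
    """Build a records search URL for image deposits of one journal.

    Field names other than journal.title are assumptions, not confirmed API.
    """
    clauses = [
        'journal.title:"{}"'.format(journal_title),
        "resource_type.type:image",  # assumed field for the deposit type
    ]
    if year is not None:
        # Assumed range syntax for restricting results to one publication year.
        clauses.append("publication_date:[{0}-01-01 TO {0}-12-31]".format(year))
    query = " AND ".join(clauses)
    params = urllib.parse.urlencode({"q": query, "page": page, "size": size})
    return "https://zenodo.org/api/records/?" + params


print(image_query_url("European Journal of Taxonomy", year=2017))
```

This would only match figure depositions created or updated after the journal name was added to their metadata.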

myrmoteras commented 6 years ago

Yes, if this is possible. Different approaches