rdmpage / biostor

Open access articles extracted from the Biodiversity Heritage Library
http://biostor.org

Define articles in Memoirs of the Carnegie Museum, ISSN 0885-4645. RIS provided. #61

Open suwiding opened 7 years ago

suwiding commented 7 years ago

@rdmpage We miss you! Attached is a zip file containing metadata, in RIS format, for the undefined articles in this publication. (I realize that you already defined a bunch of articles in this publication, and I have avoided duplicating them.) The metadata contains the BHL starting pageid in the Notes field. (This seemed important in this publication because there were many duplicate page numbers, e.g. page 1, in the BHL item.) The metadata also contains the name of the Museum in the DP field in the RIS, since they gave us all of their metadata.

memoirs_carnegie.zip

rdmpage commented 7 years ago

@suwiding I've added these articles. Took a little fussing as most of the articles have plates so page ranges needed to be edited manually to include them.

suwiding commented 7 years ago

@rdmpage Thanks for adding the articles. Is there anything I could have done to make the metadata better and minimize the work for you? I have the ability to edit both page metadata and RIS contents. What would help?

rdmpage commented 7 years ago

@suwiding There’s nothing that can be easily done in an RIS file. The problem is that metadata typically says an article has a start and an end page, and the easiest assumption is that all pages (and only those pages) between the starting and ending pages belong in the article. If there are plates then this assumption often fails.
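To make the failure mode concrete, here is a minimal sketch in Python. Everything in it is assumed for illustration: the `Page` structure, the PageIDs, and the rule that plates carry no printed page number are made up, and this is not how BioStor itself selects pages.

```python
from typing import List, NamedTuple, Optional


class Page(NamedTuple):
    page_id: int           # made-up stand-in for a BHL PageID
    label: Optional[str]   # printed page number, None for an unnumbered plate


def pages_by_range(pages: List[Page], start_page_id: int, end_label: str) -> List[int]:
    """Naive rule: take every scan from the starting PageID up to and
    including the first page whose printed number equals end_label."""
    selected = []
    collecting = False
    for page in pages:
        if page.page_id == start_page_id:
            collecting = True
        if collecting:
            selected.append(page.page_id)
            if page.label == end_label:
                break
    return selected


# Hypothetical scanned item: an article on pages 1-3 whose two plates
# follow the last numbered page, then a new article restarting at page 1.
item = [
    Page(1001, "1"),
    Page(1002, "2"),
    Page(1003, "3"),
    Page(1004, None),  # plate belonging to the first article
    Page(1005, None),  # plate belonging to the first article
    Page(1006, "1"),   # next article, page numbering restarts
]

naive = pages_by_range(item, start_page_id=1001, end_label="3")
manual = [1001, 1002, 1003, 1004, 1005]  # what a person would select
missing = [p for p in manual if p not in naive]
print(missing)
```

In this toy item the range rule stops at printed page 3, so both plates are left out and have to be added by hand.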

I have a crude local tool that displays all the pages in a scanned item and lets me select the ones that belong in an article. I guess I could polish that up and make it more usable. If, for example, it could output a list of all BHL PageIDs in an article, and that list could be added to the RIS file (in a “notes” field, say), then that might be a way forward. It would take a little work to do. Given that @mlichtenberg is looking at adding article finding to BHL, maybe this is something to think about in the context of that project. In other words, I’m not sure how much energy to invest in improving this versus relying on small-scale manual hacks as at present.
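One way the PageID-list idea could look is sketched below in Python. The `N1` tag is the standard RIS notes field, but the `BHL PageIDs:` prefix, the comma-separated layout, the function names, the article title, and the PageIDs are all assumptions made up for this example. No such convention has been agreed for BioStor or BHL.

```python
from typing import List

# Hypothetical convention for carrying the page list in the RIS notes field.
NOTE_PREFIX = "N1  - BHL PageIDs: "


def ris_record(title: str, journal: str, start_page: str, end_page: str,
               page_ids: List[int]) -> str:
    """Build one RIS journal-article record with the explicit page list in N1."""
    lines = [
        "TY  - JOUR",
        f"TI  - {title}",
        f"JO  - {journal}",
        f"SP  - {start_page}",
        f"EP  - {end_page}",
        NOTE_PREFIX + ",".join(str(p) for p in page_ids),
        "ER  - ",
    ]
    return "\n".join(lines)


def page_ids_from_ris(record: str) -> List[int]:
    """Read the page list back out, returning [] if the note is absent."""
    for line in record.splitlines():
        if line.startswith(NOTE_PREFIX):
            return [int(p) for p in line[len(NOTE_PREFIX):].split(",")]
    return []


record = ris_record(
    title="Example article",                   # placeholder title
    journal="Memoirs of the Carnegie Museum",
    start_page="1",
    end_page="3",
    page_ids=[1001, 1002, 1003, 1004, 1005],   # made-up PageIDs, plates included
)
print(record)
print(page_ids_from_ris(record))
```

With an explicit list like this, an importer would not need to infer anything from the start and end pages, so plates outside the numbered range would come along automatically.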
