Closed rdmpage closed 7 years ago
@rdmpage The Natural History Museum of Los Angeles County is a BHL member. I or another member of the EABL project team will contact them directly to see if they have complete article metadata to give us. Where did you get the metadata for the articles that you already have? Just curious and trying to learn...
Hi Susan, The metadata for the article comes from several sources:
From: suwiding notifications@github.com Sent: Wednesday, January 4, 2017 11:59 pm Subject: Re: [rdmpage/biostor] Add article for Contributions in Science ISSN 0459-8113 (#45) To: rdmpage/biostor biostor@noreply.github.com Cc: Roderic Page rdmpage@gmail.com, Mention mention@noreply.github.com
@suwiding To see what progress I've made so far, here's a Google Docs spreadsheet I'm using to extract articles: https://docs.google.com/spreadsheets/d/1d2FOxKiNnGDf2rt36-x-iIuCtj-tJZln96hZjrkgGiY/edit?usp=sharing See also http://direct.biostor.org/issn/0459-8113
Hi @rdmpage, here are the citations for Contributions in Science articles not already in BHL. We used the Notes field to list the starting page ID number and the Database Provider field to indicate that NHMLAC is the contributor. Please let me know if there are any issues with the file. Thanks!
@marissakings Many thanks for this. A couple of minor things. There doesn't seem to be a field for the journal, something like
JO - Contributions...
Also, ideally the first and last pages would be in separate fields, so that the first page is
SP - first page
EP - last page
I can tweak my code to handle both pages in the SP field, and I know that some programs put both pages in the SP field.
None of these issues is a show stopper, so I'll look at adding these articles as soon as I can. Many thanks for putting together this list.
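As an illustration of the fields discussed above, here is a minimal Python sketch of reading a RIS record and accepting the page range either as separate SP/EP fields or as a combined `SP - 1-8`. It is not the BioStor code; the function names `parse_ris_record` and `page_range` are hypothetical, and the parser is deliberately simple (repeated tags such as multiple authors keep only the last value).

```python
import re

# A RIS line: two-character tag, whitespace, hyphen, then the value.
RIS_LINE = re.compile(r"^([A-Z][A-Z0-9])\s+-\s?(.*)$")


def parse_ris_record(text):
    """Collect the tagged lines of one RIS record into a dict (last value wins)."""
    record = {}
    for line in text.splitlines():
        match = RIS_LINE.match(line.strip())
        if match:
            record[match.group(1)] = match.group(2).strip()
    return record


def page_range(record):
    """Return (first, last) page strings from SP/EP, or from a combined SP like '1-8'."""
    start = record.get("SP", "")
    end = record.get("EP", "")
    if not end and "-" in start:
        # Both pages were put in the SP field, so split them apart.
        start, end = (part.strip() for part in start.split("-", 1))
    return start, end
```

For example, `page_range(parse_ris_record("SP - 1-8"))` and `page_range(parse_ris_record("SP - 1\nEP - 8"))` both give `("1", "8")`.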
From: marissakings notifications@github.com Sent: Saturday, July 15, 2017 12:41:59 AM To: rdmpage/biostor Cc: Roderic Page; Mention Subject: Re: [rdmpage/biostor] Add articles for Contributions in Science ISSN 0459-8113 (#45)
Corrected_Contributions in Science Citations.txt: https://github.com/rdmpage/biostor/files/1149790/Corrected_Contributions.in.Science.Citations.txt
@marissakings I've started to add some of these articles, see http://biostor.org/issn/0459-8113/year/1963 I've discovered another "gotcha", namely where the article starts. Some articles have the cover as page 1, while others start numbering the article a couple of pages in. It looks like the file always has the BHL PageID for the cover, which may or may not be the start of the article. BioStor relies on the page numbering matching the actual pages we need to extract, so this can lead to some problems. I think I can work around this by offsetting the PageID (the N1 field) where needed. I'll let you know when I've processed all the articles in the file.
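The PageID offset workaround could look roughly like the sketch below. It rests on two assumptions that the thread does not confirm: that the N1 field holds only the numeric PageID, and that BHL PageIDs within a scanned item are consecutive. The function name `offset_page_id` is hypothetical.

```python
def offset_page_id(record, offset):
    """Return a copy of a RIS record (a dict) with the PageID in N1 shifted by `offset` pages.

    Assumes N1 holds a bare numeric PageID and that PageIDs are consecutive.
    """
    adjusted = dict(record)
    adjusted["N1"] = str(int(record["N1"]) + offset)
    return adjusted
```

If the article began two pages after the cover, `offset_page_id(record, 2)` would move the PageID forward by two and leave the original record untouched.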
Oh dear. It sounds like the citations aren't as consistent as I thought they were. I've also found after reviewing what is already in BHL that I left two articles off of the file I sent you (volumes 507 and 508). Since there are gaps with the Journal and Start/End Pages fields, would it be best if I just re-uploaded a corrected version of the text file with all of the citations?
@marissakings No need to redo everything, I can add those two manually.
@rdmpage I feel the need to confess that most of the problems are due to 'help' that I provided to @marissakings . I apologize for the problems.
@suwiding No worries, the journal seems unable to make up its mind where to start its pagination, so it makes things "interesting". Having the file @marissakings sent is a big help (having the BHL PageIDs saves lots of time), and I'm slowly working through it to add all the articles.
@suwiding I wouldn't have known where to start if not for all of your help, especially with Python! @rdmpage, I can upload a text file with the two missing citations if that would help?
@marissakings Yes, having the two missing citations would be great.
CIS Citations 507_508.txt @rdmpage Here are the citations for 507-508.
@marissakings and @suwiding No doubt a closer look will uncover some issues, but I think we're pretty much done. Thanks for all your help, it made a huge difference having the metadata available.
Thanks for making this happen @rdmpage!
For future additions (NHMLAC has a few more publications that may get added, and we'd be creating records from scratch again), I just wanted to confirm all of the fields that we would need to have entered - I've attached a sample file for the CIS article Notes on a Brazilian mouse, Blarinomys breviceps (Winge) by Abravaya and Matson. We're currently using the free web version of EndNote to generate the text files in RIS format, and there isn't an End Page field, so I put both the start and end page in the Start Page field. I'm also guessing that if both the start pages and end pages are in the SP field, we wouldn't need to include the start page in the Notes field? Sample RIS.txt
@marissakings Just to be clear, the SP field has the start and end pages for the printed version
SP - 1-8
and the SE field (which I've not seen before) has the start and end BHL PageIDs of the corresponding pages.
SE - 52335475-52335482
if so, this would be fine for BioStor.
Depending on how detailed you want to be, adding the month and day to the PY field would be handy; taxonomists in particular like precise dates.
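One way to sanity-check a record that carries both ranges is to confirm that the printed page range in SP and the PageID range in SE cover the same number of pages. This is only a sketch: it assumes Arabic page numbers and consecutive PageIDs with no unnumbered pages in between, and the function names `parse_range` and `ranges_agree` are hypothetical.

```python
def parse_range(value):
    """Split 'a-b' into two ints; a single number is treated as a one-page range."""
    first, _, last = value.partition("-")
    first = int(first)
    last = int(last) if last.strip() else first
    return first, last


def ranges_agree(sp, se):
    """True when the printed page range and the PageID range span the same page count."""
    sp_first, sp_last = parse_range(sp)
    se_first, se_last = parse_range(se)
    return (sp_last - sp_first) == (se_last - se_first)
```

With the values quoted above, `ranges_agree("1-8", "52335475-52335482")` is true, since both ranges span eight pages.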
@rdmpage - Ah, ok - EndNote has different fields for pages and start pages - attached is a screenshot of the core fields we used when creating a record. If using both the SP and SE fields is fine for BioStor, I'll make a note to do that in the future. I'll make a note to add both day and month as well if that information is handy.
Hi @rdmpage, it looks like BHL harvested the remaining articles overnight. I spotted a few that didn't get uploaded - should I create a new text file with those citations to re-upload?
There was also one strange sorting issue - volume 41 is being displayed between volumes 46 and 48 in BHL (47 is one of the missing volumes). Do you or @suwiding know if this is a problem with the citation or something on BHL's end?
@marissakings I fixed the ordering problem with volume 41 in BHL using the Admin Dashboard. I don't know why it happened.
Lots of articles already in BioStor, but there doesn't seem to be an easily accessible list of articles from this journal.