ssu-readinglists / readinglists

Get isbn data from primo to pull in additional authors #5

Closed ssu-readinglists closed 11 years ago

ssu-readinglists commented 11 years ago

ISBN pull in only brings in first author rather than other named authors - we would like all to appear in the author field. Lauren/Owen need to investigate records/primo tags etc and fields

ostephens commented 11 years ago

No problem doing this if the data is available via Primo. At the moment the author details are retrieved from the following paths in the Primo XML:

/sear:SEGMENTS/sear:JAGROOT/sear:RESULT/sear:DOCSET/sear:DOC/PrimoNMBib/record/addata/aulast
/sear:SEGMENTS/sear:JAGROOT/sear:RESULT/sear:DOCSET/sear:DOC/PrimoNMBib/record/addata/aufirst

These are then joined to give the author string for the reference.
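As a rough illustration of the current behaviour, here is a minimal Python sketch that reads `aulast` and `aufirst` from the `addata` section and joins them. The sample XML, the `urn:example:sear` namespace URI, the helper names and the "Last, First" join order are all assumptions for illustration; the actual plugin code and the real Primo namespaces may differ. Elements are matched by local name so the sketch does not depend on the exact namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down Primo response; the namespace URI is a placeholder.
SAMPLE = """<sear:SEGMENTS xmlns:sear="urn:example:sear">
  <sear:JAGROOT><sear:RESULT><sear:DOCSET><sear:DOC>
    <PrimoNMBib><record>
      <addata><aulast>Enders</aulast><aufirst>Walter</aufirst></addata>
    </record></PrimoNMBib>
  </sear:DOC></sear:DOCSET></sear:RESULT></sear:JAGROOT>
</sear:SEGMENTS>"""


def local_name(tag):
    """Return the element name without any '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def first_text(root, section, field):
    """Return the text of the first non-empty <field> directly inside a <section>."""
    for sec in root.iter():
        if local_name(sec.tag) != section:
            continue
        for child in sec:
            if local_name(child.tag) == field and child.text:
                return child.text.strip()
    return None


def addata_author(xml_text):
    """Join aulast and aufirst as 'Last, First' (the join order is an assumption)."""
    root = ET.fromstring(xml_text)
    last = first_text(root, "addata", "aulast")
    first = first_text(root, "addata", "aufirst")
    return ", ".join(part for part in (last, first) if part)


print(addata_author(SAMPLE))
```

Because only one `aulast`/`aufirst` pair is read, a record with several named authors yields a single name, which matches the behaviour reported in this issue.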

It may be that there is a better place in the Primo XML from which to retrieve all of the author details.

One issue that could cause problems is that library cataloguing practice often lists people related to the item without specifying their role. So if we take all the names we may find that we are unable to differentiate between authors, editors, illustrators, etc.

hhy05 commented 11 years ago

Yes, I see the issue here - here are some examples, concentrating on newer records, where only one author is pulled in:

Media and terrorism : global perspectives / edited by Des Freedman and Daya Kishan Thussu - only Freedman makes it in from Add.Entry rather than both

The political economy of terrorism / Walter Enders, Todd Sandler. - only Enders gets pulled in, Sandler is in Add.Entry

Add.Entry seems to be the string that is needed, I will check with cataloguing if there could be any issues.

ostephens commented 11 years ago

These values can be pulled from a few places in the MARC record, and Primo then transforms them - so what we get in the Primo XML is a bit different from what the cataloguers will be entering.

I've had a look at these records and if others follow the same pattern it looks like we might be able to use two fields from the Primo XML to get all the relevant values in a reasonably convenient fashion:

/sear:SEGMENTS/sear:JAGROOT/sear:RESULT/sear:DOCSET/sear:DOC/PrimoNMBib/record/display/creator
/sear:SEGMENTS/sear:JAGROOT/sear:RESULT/sear:DOCSET/sear:DOC/PrimoNMBib/record/display/contributor

I think this will give us the same outcome as the Primo display (list of contributors underneath the title) - e.g.: The political economy of terrorism Enders, Walter ; Sandler, Todd
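A minimal Python sketch of that idea follows: collect the `creator` and `contributor` values from the `display` section and join them with " ; " as in the Primo display. The sample XML, the placeholder namespace URI and the function names are hypothetical. Splitting a single field on ";" is also an assumption, in case Primo packs several names into one element.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down Primo response; the namespace URI is a placeholder.
SAMPLE = """<sear:SEGMENTS xmlns:sear="urn:example:sear">
  <sear:JAGROOT><sear:RESULT><sear:DOCSET><sear:DOC>
    <PrimoNMBib><record>
      <display>
        <creator>Enders, Walter</creator>
        <contributor>Sandler, Todd</contributor>
      </display>
    </record></PrimoNMBib>
  </sear:DOC></sear:DOCSET></sear:RESULT></sear:JAGROOT>
</sear:SEGMENTS>"""


def local_name(tag):
    """Return the element name without any '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def display_names(xml_text):
    """Return names from display/creator first, then display/contributor."""
    root = ET.fromstring(xml_text)
    names = []
    for display in root.iter():
        if local_name(display.tag) != "display":
            continue
        for wanted in ("creator", "contributor"):
            for child in display:
                if local_name(child.tag) == wanted and child.text:
                    # Assumption: one element may hold several ';'-separated names.
                    parts = child.text.split(";")
                    names.extend(p.strip() for p in parts if p.strip())
    return names


def author_string(xml_text):
    """Join all names the way the Primo display does."""
    return " ; ".join(display_names(xml_text))


print(author_string(SAMPLE))
```

Note that this keeps every name Primo supplies, so editors, illustrators and translators would appear alongside authors with nothing to distinguish them.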

ssu-readinglists commented 11 years ago

Email from cataloguing department indicating potential issues and move from AACR.

So we could either pull in from the 700 field, although the likelihood is that, going forward, we will end up with duplicated author names. Or could we somehow pull from the 245 c field in addition to the current 100 field?

Asked Cataloguing to comment: Does this always have all authors in it and is that field likely to continue to have all authors/editors under the new scheme?

In reality, think we are going to have to decide what is most useful going forward for new records:

- just one author and we can add in additional if required manually (so only pull from 100 as currently)
- pull from 245 c so we get all authors, provided all records will have this field? And therefore remove 100 pull in completely, or have as extra?
- pull from 100 and 700 but be aware that will get duplicated authors and no indication if translators/editors etc

Thoughts?

ostephens commented 11 years ago

I just want to highlight what I said above - Primo already does some transformation - so it isn't possible to just pull things from the MARC fields - we don't have access to '100' and '700' - only the Primo interpretation of the data in those fields.

Since Primo is essentially already trying to deal with this issue, I think we have to focus on the Primo data we want, not the underlying MARC data - we don't want to do the same job twice

ssu-readinglists commented 11 years ago

Sophie believes that we would be 'more correct' in the majority of cases if we pulled from the 100 AND 700 fields - which we believe relate to the creator/contributor in primo as Owen mentioned above [SD - Yes, the creator/contributor fields from Primo would give the 'Primo translated' versions of the 100/700 fields]

245 c details are not considered a good option by either SD or Owen.

I think we currently just pull from creator (add author) which is why generally we only get one author. In the minority of records, we might see additional entries - so illustrators/translators or duplicated author names if we go down this route. However, I think it is easier to delete in myreferences than manually enter so think we would be best to pull in as many authors as we can and then delete what is not needed, rather than only having one author per book going forward.

AF to confirm above re: deletion/addition

Sophie does not think the move to RDA will affect this change #51

ssu-readinglists commented 11 years ago

AF confirms: certainly deleting a name is much easier and less prone to typos but may need to check in the case of translators etc.

I think we should try and then test out.

hhy05 commented 11 years ago

We have tested this and it is going well at the moment - points noted above - we have already found one record with the extra authors bit in:

The Oxford handbook of happiness David, Susan A. editor of compilation.; Boniwell, Ilona editor of compilation.; Ayers, Amanda Conley editor of compilation. Oxford : Oxford University Press 2013

The extra authors are now being pulled from the 710 field, which currently shows in full on live Primo.

In new RDA catalogue records (we have about 100 of these in the catalogue at the moment and it will continue to grow) all authors/710 entries have something afterwards, like author or contributing editor etc. At the moment these are being pulled into myReferences.

However, on test Primo we have already set it so that the extra author/editor part does not show. So once these changes go onto live later in the summer, the pull-in should work fine and just show author names.

This should mean that you only get extra bits after names on newer records, and only on a few at the moment. This will change once the changes on Primo go ahead, so it is not a code error.

Once changes on Primo are live, we can test again and see if 'editor of compilation' gets left behind - TO DO under #51
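For that re-test, a small check along these lines could flag any pulled-in names that still carry a role term. This is only a sketch: the `ROLE_TERMS` list is an assumption built from the examples mentioned in this thread, not a complete list of RDA relator terms, and the function names are hypothetical.

```python
# Assumed role terms, taken from the examples in this thread; not exhaustive.
ROLE_TERMS = ("editor of compilation", "contributing editor", "author")


def split_names(author_field):
    """Split a ';'-separated author string into individual names."""
    return [name.strip() for name in author_field.split(";") if name.strip()]


def names_with_role_terms(author_field):
    """Return the names that still end with one of the assumed role terms."""
    flagged = []
    for name in split_names(author_field):
        # Ignore trailing full stops and spaces, and compare case-insensitively.
        bare = name.rstrip(". ").lower()
        if any(bare.endswith(" " + role) for role in ROLE_TERMS):
            flagged.append(name)
    return flagged


RECORD = (
    "David, Susan A. editor of compilation.; "
    "Boniwell, Ilona editor of compilation.; "
    "Ayers, Amanda Conley editor of compilation."
)

for name in names_with_role_terms(RECORD):
    print("role term still present:", name)
```

An empty result for a record that previously showed 'editor of compilation' would suggest the Primo display change has taken effect for that record.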

hhy05 commented 11 years ago

Should note that for a "section of a book" reference, this new modification pulls in all names as authors rather than editors. This is correct, as on lots of occasions it will be a single-authored book with one chapter by the same author. If it actually is an edited book with a different chapter author, it is simple to paste the names across into the editors field and add in the chapter author instead - there is no way to differentiate between these types.