ssu-readinglists / readinglists

GNU General Public License v3.0

for ebooks, additional information could be pulled in...? #59

Closed hhy05 closed 11 years ago

hhy05 commented 11 years ago

Is it possible, for anything pulled in from Primo that is an ebook, for 'online' to be added to the availability field, and for DawsonEra, My-i-Library or alternatives to be pulled into the database field? (The database name is indicated in the links - click here for more information about ....)

ostephens commented 11 years ago

Can you point me to an example of a DawsonEra and a MyiLibrary record?

hhy05 commented 11 years ago

http://catalogue.solent.ac.uk/primo_library/libweb/action/display.do?fn=display&displayMode=full&vid=SSUVU01&tabs=viewOnlineTab&doc=SSU01009606956 - dawsonera

http://catalogue.solent.ac.uk/primo_library/libweb/action/display.do?fn=display&displayMode=full&vid=SSUVU01&tabs=viewOnlineTab&doc=SSU01009603522 - credo reference

http://catalogue.solent.ac.uk/primo_library/libweb/action/display.do?fn=display&displayMode=full&vid=SSUVU01&tabs=viewOnlineTab&doc=SSU01009659393 - myilibrary

I think there are others too....

http://catalogue.solent.ac.uk/primo_library/libweb/action/display.do?fn=display&displayMode=full&vid=SSUVU01&tabs=viewOnlineTab&doc=SSU01009639706 - ebrary

http://catalogue.solent.ac.uk/primo_library/libweb/action/display.do?fn=display&displayMode=full&vid=SSUVU01&tabs=viewOnlineTab&doc=SSU01009595346 - safari

I will find out from cataloguing whether there are any more, but the database name displays for all of them in the same field, in the 'find out more' link (the 2nd link) under the View Online tab.

We already have a slightly different primo pull-in link for ebooks so could this be used to auto-add online at the same time?

One additional supplier too: Referex: http://catalogue.solent.ac.uk/primo_library/libweb/action/display.do?fn=display&displayMode=full&vid=SSUVU01&tabs=viewOnlineTab&doc=SSU01007108656

hhy05 commented 11 years ago

A few more suppliers from cataloguing:

Berg Fashion - 74 titles, e.g. http://catalogue.solent.ac.uk/primo_library/libweb/action/display.do?fn=display&displayMode=full&vid=SSUVU01&tabs=viewOnlineTab&doc=SSU01009633593

CAB Direct - 105 titles http://catalogue.solent.ac.uk/primo_library/libweb/action/display.do?fn=display&displayMode=full&vid=SSUVU01&tabs=viewOnlineTab&doc=SSU01009650827

Hospitality and Tourism Complete - 65 titles http://catalogue.solent.ac.uk/primo_library/libweb/action/display.do?fn=display&displayMode=full&vid=SSUVU01&tabs=viewOnlineTab&doc=SSU01007553609

ostephens commented 11 years ago

The issue here is scaling up an approach. I can add in a database field for each of these, but if you wanted another one added in the future it would then be additional code. I could attempt to scrape the database name from the text that reads something like "Click here for more information about XXX" but it is hard to know how consistent this is - the Referex example you give above doesn't use this phrasing. Also, the phrasing doesn't always use the actual database name (in the Berg example it says 'Berg Ebooks' rather than 'Berg Fashion').

The other approach I can think of would be to see if there was a Metalib database link in the record and then use the Metalib API to look up the database name - think that might be a bit more reliable than scraping text, but it is hard to know and is more work.

So options:

hhy05 commented 11 years ago

I'd be inclined to go for the scrape text I think - I presume numbers of ebook suppliers will only increase so code each individually not really viable - Berg Ebooks would be ok rather than Berg Fashion, and then if we do change ebook records in the future, would only mean one change here? Metalib might be more problematic than it's worth I think. Agree that referex record is slightly different.

Will check with acq/cat what they think to this and best way forward....

hhy05 commented 11 years ago

Have checked with cataloguing, they agree scrape text is best - we can potentially change the referex records so they match as a bulk change - these are older and function in a slightly different way which is why they are different.

Question - are you planning on scraping using the following "Click here for more information about"... so should the database name always follow about... or should it always be at the end of the field? Either way is fine but just so we rephrase any records that need it appropriately.

hhy05 commented 11 years ago

Question from cataloguing:

will change the wording for the 856z field so that there is more consistency with our other databases for Referex

What I wondered though is how much consistency is needed, i.e. how much of the string of text Owen would need - would it have to be 'Click here for more information about XXX' or, if I changed the wording to, e.g. 'Off campus: Search for ebook and find more information about Referex' would the 'more information about' be sufficient?

I think we want to keep the 'search for ebook' in the text for now as well as pointing people to use the metalib link off campus.

ostephens commented 11 years ago

Basically the more consistency the better it is likely to work. I can check for any set of words within the text - but if there is more variation the more likely it becomes that I'll match something that wasn't intended.

The other thing to be wary of is including information after the database/service name - this is likely to cause more difficult problems.

So: "Off campus: Search for ebook and find more information about Referex" is fine but "Off campus: Search for ebook and find more information about Referex and other services" is not

This is because it means I don't know how many words after "more information about" make up the name of the service.

Based on the comments so far I'm happy to write code that looks for: "more information about" and assumes that all words following this make up the name of the service/database. Does that sound OK?
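A minimal sketch of the rule proposed above, assuming the link text is available as a plain string. The function name `extract_database_name` and the regular expression are illustrative assumptions, not the code in the repository. It also shows why trailing words after the service name are a problem: everything after the marker phrase is treated as the name.

```python
import re

# Marker phrase agreed in this thread; matching it case-insensitively is an
# assumption made here so that "More information about" would also match.
MARKER = re.compile(r"more information about\s+(.+)$", re.IGNORECASE)


def extract_database_name(link_text):
    """Return all words after 'more information about', or None if absent.

    Hypothetical helper for illustration only.
    """
    match = MARKER.search(link_text.strip())
    if match is None:
        return None
    # Everything after the marker is assumed to be the service name,
    # minus surrounding whitespace and a trailing full stop.
    return match.group(1).strip().rstrip(".")


examples = [
    "Click here for more information about Berg Ebooks",
    "Off campus: Search for ebook and find more information about Referex",
    "Off campus: Search for ebook and find more information about Referex and other services",
    "On campus: Click here to access this ebook",
]

for text in examples:
    print(repr(extract_database_name(text)))
```

The third example returns "Referex and other services", which is the failure case described above: the code cannot tell where the name ends, so the records need to finish with the name itself.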

hhy05 commented 11 years ago

The referex ones have been amended so I hope are now suitable:

http://catalogue.solent.ac.uk/primo_library/libweb/action/search.do?dscnt=0&frbg=&tab=default_tab&dstmp=1368710067243&srt=rank&mode=Basic&dum=true&ct=search&indx=1&search_field=s&vl%28freeText0%29=The+pursuit+of+new+product+development+%3A+the+business+development+process+%2F+Marc+A.+Annacchino.&fn=search&vid=SSUVU01

The pursuit of new product development : the business development process
Annacchino, Marc A Amsterdam : Butterworth-Heinemann c2007

Available Resources

On campus: Click here to access this ebook
Off campus: Search for ebook and find more information about Referex

hhy05 commented 11 years ago

New supplier - EBSCO will need to be added at some point - wait for the first catalogue records and then test. Cataloguing have been asked to let us know when a record arrives....

hhy05 commented 11 years ago

Owen - the Referex examples don't seem to be working, even though we have changed the catalogue records to match the requirements above, so they now say 'find more information about Referex' at the end of the string. We can't get these ebooks to pull in the database, but 'online' etc. is OK - Berg/Dawson/My-i-Library are OK too, so we can't work out what is wrong with Referex. Some example ISBNs: 9780123695161, 9780750679862

hhy05 commented 11 years ago

oh and sorry, we need online without the capital O please - as in the first message - thanks

ostephens commented 11 years ago

Both of these should be fixed in latest code on github

ssu-readinglists commented 11 years ago

Working on live for providers including Referex, and 'online' is now lowercase. 26/07/13