k-int / gokb-phase1

Original GOKb repo - Moving to https://github.com/openlibraryenvironment/gokb
http://www.gokb.org

Duplicate ISSN #527

Closed jhsolomon closed 8 years ago

jhsolomon commented 8 years ago

When I searched for 1545-7230 in the Global Search, there were two JournalInstances listed.

jhsolomon commented 8 years ago

I searched using the Title search for 1545-7230 and there are two identical title records, with title histories.

image

image

These are the two links, which are also identical: https://gokb-test.openlibraryfoundation.org/gokb/resource/show/org.gokb.cred.JournalInstance%3A10701875

https://gokb-test.openlibraryfoundation.org/gokb/resource/show/org.gokb.cred.JournalInstance%3A10701875

jhsolomon commented 8 years ago

I noted the same issue with 1545-5815. In both identical title records there are 4 identifiers, with an extra ISSN that matches the eISSN.

image

ianibo commented 8 years ago

OK - so these are not duplicates; the search is returning a match for each of the ISSN identifiers. If you look at the URLs for the records, you will note that they are the same URL, which means there is only one title. The search just lists the title twice, because each identifier matched. We can add a "Distinct" clause to the search, but this can greatly increase the time taken to execute a search which returns many rows (it's not noticeable on a search returning 3 items, but on a search returning 20000 items, it takes a very long time to find all the unique items). Does this make sense as an explanation?
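To illustrate the behaviour described above, here is a minimal sketch using an in-memory SQLite database. The table layout (`title`, `identifier`), the column names, and the title name "Example Journal" are assumptions made up for the illustration and are not GOKb's actual schema; only the record ID and the identifier values come from this thread. It shows one title joined to its identifiers coming back once per matching identifier, and the same query collapsing to a single row with DISTINCT.

```python
import sqlite3

# Hypothetical two-table layout: one title, several identifiers attached to it.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE title (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE identifier (
        id INTEGER PRIMARY KEY,
        title_id INTEGER REFERENCES title(id),
        namespace TEXT,
        value TEXT
    );
    INSERT INTO title VALUES (10701875, 'Example Journal');
    INSERT INTO identifier (title_id, namespace, value) VALUES
        (10701875, 'issnl', '1042-9670'),
        (10701875, 'issn',  '1042-9670'),
        (10701875, 'issn',  '1545-7230'),
        (10701875, 'eissn', '1545-7230');
""")

JOIN_QUERY = """
    SELECT {distinct} t.id, t.name
    FROM title t JOIN identifier i ON i.title_id = t.id
    WHERE i.value = ?
"""

def search(value, distinct=False):
    """Return (title id, title name) rows for titles with a matching identifier."""
    sql = JOIN_QUERY.format(distinct="DISTINCT" if distinct else "")
    return conn.execute(sql, (value,)).fetchall()

plain = search("1545-7230")                  # one row per matching identifier
unique = search("1545-7230", distinct=True)  # one row per title

print(plain)   # two identical rows, because both the issn and the eissn matched
print(unique)  # a single row for the single title
```

With only a handful of rows the DISTINCT version costs nothing noticeable; the concern raised above is about result sets with tens of thousands of rows, where the database has to compare every row to find the unique ones.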

ianibo commented 8 years ago

(I should say - this is different to the situation on Live - where the 2 rows did have different identifiers - and hence were different items)

jhsolomon commented 8 years ago

But what about the ISSN that still matches the eISSN?

issnl 1042-9670 https://gokb-test.openlibraryfoundation.org/gokb/resource/show/org.gokb.cred.Identifier%3A10701869
issn 1042-9670 https://gokb-test.openlibraryfoundation.org/gokb/resource/show/org.gokb.cred.Identifier%3A10701871
issn 1545-7230 https://gokb-test.openlibraryfoundation.org/gokb/resource/show/org.gokb.cred.Identifier%3A10701873
eissn 1545-7230 https://gokb-test.openlibraryfoundation.org/gokb/resource/show/org.gokb.cred.Identifier%3A29661629

On Wed, Sep 14, 2016 at 12:48 PM, Ian Ibbotson notifications@github.com wrote:

(I should say - this is different to the situation on Live - where the 2 rows did have different identifiers - and hence were different items)

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/k-int/gokb-phase1/issues/527#issuecomment-247077994, or mute the thread https://github.com/notifications/unsubscribe-auth/AMwAJXhs482F-yRhnOtNUwdQICVcNhHwks5qqCVigaJpZM4J8816 .

Jennifer Solomon GOKb Editor, Acquisitions and Discovery North Carolina State University Libraries 919-515-2743 jhsolomo@ncsu.edu

jhsolomon commented 8 years ago

I don't want to slow down the search, but searching by Identifier makes sense, so I think it's important that it display accurately. Are there any other options?

ianibo commented 8 years ago

Yeah - under the admin menu there is now a housekeeping option. Selecting that will cause the database to run a cleanup that should remove the ISSN/eISSN pairs.
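The thread does not say how the housekeeping job works internally, so the following is only a sketch of the kind of rule that would clear up the identifier list quoted earlier. It assumes that when an `issn` entry carries the same value as an `eissn` on the same title, the `issn` entry is the one dropped. The `Identifier` type and the function name are hypothetical.

```python
from typing import List, NamedTuple

class Identifier(NamedTuple):
    namespace: str  # e.g. "issn", "eissn", "issnl"
    value: str

def drop_issn_matching_eissn(identifiers: List[Identifier]) -> List[Identifier]:
    """Remove issn entries whose value is already present as an eissn.

    Assumed rule for illustration only; not taken from the GOKb code base.
    """
    eissn_values = {i.value for i in identifiers if i.namespace == "eissn"}
    return [
        i for i in identifiers
        if not (i.namespace == "issn" and i.value in eissn_values)
    ]

# The four identifiers listed for the title in this thread.
ids = [
    Identifier("issnl", "1042-9670"),
    Identifier("issn", "1042-9670"),
    Identifier("issn", "1545-7230"),
    Identifier("eissn", "1545-7230"),
]

cleaned = drop_issn_matching_eissn(ids)
for ident in cleaned:
    print(ident.namespace, ident.value)
```

After a cleanup along these lines, a search for 1545-7230 would match only one identifier on the title, so the title would be listed once.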

jhsolomon commented 8 years ago

I clicked on it and it gave me a Log Viewer. Is that correct/expected?

ianibo commented 8 years ago

I've not found a good alternative. One way to do it might be to have a "Return unique records" checkbox on the screen. An alternative might be to do the count query and, if the result is less than a set number (say 100), re-run the query with the DISTINCT keyword and a message saying "Search only returning unique items". Generally, I'm nervous about quietly doing something without the user being in control of it... Any thoughts?
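The second option (count first, then de-duplicate only small result sets) could look roughly like the sketch below. It uses an in-memory SQLite database with a made-up schema; the table names, column names, the title name, and the `search_titles` function are all assumptions for illustration, and the threshold of 100 is simply the example figure from the comment.

```python
import sqlite3

DISTINCT_THRESHOLD = 100  # the "say 100" figure suggested in the comment

def build_demo_db():
    """Create a tiny hypothetical schema holding one title and its identifiers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE title (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE identifier (
            id INTEGER PRIMARY KEY, title_id INTEGER, namespace TEXT, value TEXT
        );
        INSERT INTO title VALUES (10701875, 'Example Journal');
        INSERT INTO identifier (title_id, namespace, value) VALUES
            (10701875, 'issn',  '1545-7230'),
            (10701875, 'eissn', '1545-7230');
    """)
    return conn

def search_titles(conn, value, threshold=DISTINCT_THRESHOLD):
    """Run the count query first; use DISTINCT only when the result is small.

    Returns (rows, message). The message is set when rows were de-duplicated,
    so the user is told what happened rather than it being done quietly.
    """
    from_where = (
        "FROM title t JOIN identifier i ON i.title_id = t.id WHERE i.value = ?"
    )
    (count,) = conn.execute("SELECT COUNT(*) " + from_where, (value,)).fetchone()
    if count < threshold:
        rows = conn.execute(
            "SELECT DISTINCT t.id, t.name " + from_where, (value,)
        ).fetchall()
        return rows, "Search only returning unique items"
    rows = conn.execute("SELECT t.id, t.name " + from_where, (value,)).fetchall()
    return rows, None

conn = build_demo_db()
small_rows, small_msg = search_titles(conn, "1545-7230")
# Force the "large result" branch by lowering the threshold below the row count.
large_rows, large_msg = search_titles(conn, "1545-7230", threshold=1)
print(small_rows, small_msg)
print(large_rows, large_msg)
```

The checkbox option would be the same query with the DISTINCT choice driven by a user-supplied flag instead of the row count, which keeps the user in control of the trade-off.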

ianibbo commented 8 years ago

Yeah - I don't like that, but Steve thought it was good in testing as it lets you see if it blows up. You can safely navigate away from that screen and it will run in the background.

Ian Ibbotson Director Knowledge Integration Ltd 35 Paradise Street, Sheffield. S3 8PZ T: 0114 273 8271 M: 07968 794 630 W: http://www.k-int.com Doodle: http://doodle.com/ianibbo

jhsolomon commented 8 years ago

I like the idea of the checkbox on the screen. That way a user can search by identifier and if the results seem like duplicates, they can then check the box and run it again.

jhsolomon commented 8 years ago

How long does it take to run through the whole KB? Is this something one of us should do on a daily/weekly basis?

ianibbo commented 8 years ago

It's actually pretty quick - minutes at most. No harm in running it on a weekly basis.

jhsolomon commented 8 years ago

This is very useful! Since I ran it, I have not had the issue with searching by identifier and getting back multiple results.

ianibbo commented 8 years ago

Awesome :)

sosguthorpe commented 8 years ago

Is there anything left to do on this one @jhsolomon? If you are happy, please can you close it?

jhsolomon commented 8 years ago

Fix complete