DobyTang / LazyLibrarian

This project isn't finished yet. Goal is to create a SickBeard, CouchPotato, Headphones-like application for ebooks. Headphones is used as a base, so there are still a lot of references to it.

grsync line 575 #1543

Closed WillowMist closed 5 years ago

WillowMist commented 6 years ago

Every hour my logs fill up with warning entries from grsync.py line 575:

Not removing XXXXX, book is marked Open

This appears to check the local database in order to change Have and Wanted books to a Skipped status, but it logs a lot of lines for books that are already there.

WillowMist commented 6 years ago

Sorry, running the current philborman build

WillowMist commented 6 years ago

I'm also a little unclear on what it's trying to do there. Is it removing books from the wanted list in LL at that point? And at what prompting, exactly?

Thanks

WillowMist commented 6 years ago

Ok, now I see what it's doing, but I'm unclear why it's processing so many books. Those shelves haven't changed in months.

philborman commented 6 years ago

That message is logged when a bookid that was in your read/to-read list at goodreads isn't there any more. Just a guess, but sometimes goodreads issues new bookids for existing books, e.g. they might have multiple entries for a book and merge them together, or there was an error in the author name or title and correcting that changed the bookid. If you had that bookid in a wanted or to-read list they will either remove it from your list as it's not a valid bookid any more, or possibly change the bookid in your list to the new bookid. There have been a lot of bookid changes recently.

We have some code to check and correct bookids when each author is refreshed in lazylibrarian, but no check in the goodreads sync routine.

An easy way to check is to look at the bookid you have for one of the books logged, then search goodreads for the book title and see if their bookid is different. You could also try refreshing the author details for one of the authors whose books are being logged. If you have debug logging on it will print how many books had their bookid updated at goodreads.

WillowMist commented 6 years ago

That makes sense. But shouldn't it only error on them once, if it's replacing the sync list? This is generating hundreds of entries, the same ones, every hour.

philborman commented 6 years ago

I think the bookids remain wrong until the author(s) get refreshed. We don't want to delete the books because we have them (marked as Open), and we don't yet know the new bookids as we haven't rescanned the authors. It could be quite time-consuming to rescan each author before doing a goodreads sync. If that is what the problem is we can work on how to optimize it, but it would be useful to know if that is actually the case, as it's only a guess at this stage.

philborman commented 6 years ago

Had another look at that section of code. The message is saying "this book used to be on the goodreads "read" shelf, but isn't any more, and we think it's still in the lazylibrarian library". It will continue to throw a warning message on every sync until we do something about it. The problem is we don't know what to do automatically...

  1. If goodreads changed the bookid, causing a mismatch, we should update to the new bookid. This will happen eventually on an author scan, or should we try to do it sooner?
  2. If you deleted the book manually from goodreads:
     a) If it's no longer in the lazylibrarian library we should change "Open" to "Skipped" or maybe "Ignored".
     b) If the book is still in the lazylibrarian library, should we put it back onto the goodreads shelf?

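As a rough illustration of the check described above, the comparison can be sketched with sets. This is not the actual grsync.py code; the function name, the argument names and the bookids are all hypothetical, and it only assumes that the sync keeps the list of bookids seen on the shelf last time.

```python
def classify_shelf_changes(last_sync, on_shelf, in_library):
    """Split bookids that vanished from the goodreads shelf into two groups.

    last_sync:  bookids stored after the previous sync (assumed)
    on_shelf:   bookids on the goodreads shelf right now
    in_library: bookids lazylibrarian still has marked Open
    """
    removed_from_shelf = last_sync - on_shelf
    # Still in the library: keep the book and warn on every sync.
    keep_and_warn = removed_from_shelf & in_library
    # Gone from both places: candidates for Skipped or Ignored.
    safe_to_skip = removed_from_shelf - in_library
    return sorted(keep_and_warn), sorted(safe_to_skip)


# Hypothetical bookids, for illustration only.
last_sync = {"101", "102", "103"}
on_shelf = {"101"}
in_library = {"101", "102"}

warn, skip = classify_shelf_changes(last_sync, on_shelf, in_library)
for bookid in warn:
    print("Not removing %s, book is marked Open" % bookid)
```

Because the stored list is not changed for books in the first group, the same warnings come back on every sync until the stored list, the shelf, or the library changes.
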
WillowMist commented 6 years ago

Ok, so I picked the last book to throw an error. I've verified that it's NOT on the goodreads shelf, but I've not removed anything from that shelf. It's in the LL library, and the file does exist. So why didn't it make it onto the goodreads shelf? Oh, and I verified that the book id hasn't changed on goodreads. My "have" shelf on goodreads is a good deal smaller than my collection. Have I missed a step in getting the LL content synced with goodreads?

philborman commented 6 years ago

Doesn't sound like you're missing a step, and at some point the book was on the goodreads shelf as LL is complaining it's not there any more.

Might be worth trying a sync with debug level 1024, which turns on verbose debugging for the sync process. Once the sync is complete you can filter the log on the book number and see what happened. Maybe there is a page read error or a parsing error; if so it should show in the log. You will also get messages showing how many books were on each shelf, how many pages of books loaded etc., which might show something odd in the numbers you are expecting, e.g.

    There are 1872 Open/Have books, 1872 books on goodreads lazylibrarian_have shelf
    Found 72 books on page 19 (total = 1872)
    There are 13 Wanted books, 13 books on goodreads lazylibrarian_wanted shelf
    Shelf lazylibrarian_wanted : 13: Exclusive false
    Shelf lazylibrarian_have : 1872: Exclusive false
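
If the log is too long to filter by eye, a few lines of Python can do it. This is only a sketch: the helper names are made up, the sample lines reuse the message wording quoted in this thread, and the bookids in them are placeholders rather than real log output.

```python
import re

# Message format as quoted at the top of this issue.
NOT_REMOVING = re.compile(r"Not removing (\S+), book is marked Open")


def warned_bookids(lines):
    """Return the bookids named in "Not removing" warnings, in log order."""
    ids = []
    for line in lines:
        match = NOT_REMOVING.search(line)
        if match:
            ids.append(match.group(1))
    return ids


def lines_mentioning(lines, bookid):
    """Return every log line that contains the given bookid."""
    return [line for line in lines if bookid in line]


# Placeholder lines; in practice read them from the lazylibrarian log file.
sample = [
    "Not removing 11111, book is marked Open",
    "Found 72 books on page 19 (total = 1872)",
    "Not removing 22222, book is marked Open",
]

ids = warned_bookids(sample)
hits = lines_mentioning(sample, "22222")
print(len(ids), "warnings;", len(hits), "line(s) mention 22222")
```
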

WillowMist commented 6 years ago

Will do. Is there a way to trigger the sync manually?

philborman commented 6 years ago

yes, there is a button on the "Manage" page

WillowMist commented 6 years ago

22-Aug-2018 11:32:43 - DEBUG :: GRSync : grsync.py:get_gr_shelf_contents:319 : Found 2623
22-Aug-2018 11:32:43 - INFO :: GRSync : grsync.py:grsync:512 : There are 2623 Wanted books, 2623 books on goodreads digital-wanted shelf
22-Aug-2018 11:32:43 - INFO :: GRSync : grsync.py:grsync:541 : 0 missing from lazylibrarian digital-wanted
22-Aug-2018 11:32:43 - INFO :: GRSync : grsync.py:grsync:556 : 0 missing from goodreads digital-wanted
22-Aug-2018 11:32:43 - INFO :: GRSync : grsync.py:grsync:578 : 0 new in lazylibrarian digital-wanted
22-Aug-2018 11:32:43 - INFO :: GRSync : grsync.py:grsync:593 : 0 new in goodreads digital-wanted
22-Aug-2018 11:32:44 - DEBUG :: GRSync : grsync.py:grsync:649 : Sync Wanted to digital-wanted shelf complete
22-Aug-2018 11:32:44 - INFO :: GRSync : grsync.py:grsync:480 : Syncing Open to digital-other shelf
22-Aug-2018 11:32:44 - DEBUG :: GRSync : common.py:gr_api_sleep:167 : GoodReads sleep 0.068, total 795.360
22-Aug-2018 11:32:44 - DEBUG :: GRSync : common.py:gr_api_sleep:167 : GoodReads sleep 0.549, total 795.909
22-Aug-2018 11:32:45 - DEBUG :: GRSync : common.py:gr_api_sleep:167 : GoodReads sleep 0.270, total 796.179
22-Aug-2018 11:32:45 - DEBUG :: GRSync : grsync.py:get_shelf_list:192 : Found 5 shelves on 1 page
22-Aug-2018 11:32:45 - DEBUG :: GRSync : common.py:gr_api_sleep:167 : GoodReads sleep 0.575, total 796.753
22-Aug-2018 11:32:46 - DEBUG :: GRSync : grsync.py:get_gr_shelf_contents:288 : User id is: xxxxxxxx
22-Aug-2018 11:32:46 - DEBUG :: GRSync : common.py:gr_api_sleep:167 : GoodReads sleep 0.140, total 796.893
22-Aug-2018 11:33:22 - DEBUG :: GRSync : grsync.py:get_gr_shelf_contents:319 : Found 577
22-Aug-2018 11:33:22 - INFO :: GRSync : grsync.py:grsync:512 : There are 3965 Open/Have books, 577 books on goodreads digital-other shelf
22-Aug-2018 11:33:22 - INFO :: GRSync : grsync.py:grsync:541 : 0 missing from lazylibrarian digital-other
22-Aug-2018 11:33:22 - INFO :: GRSync : grsync.py:grsync:556 : 3388 missing from goodreads digital-other

And then it's a mile of "Not removing" entries :) And finally...

22-Aug-2018 11:33:37 - INFO :: GRSync : grsync.py:grsync:578 : 0 new in lazylibrarian digital-other
22-Aug-2018 11:33:37 - INFO :: GRSync : grsync.py:grsync:593 : 0 new in goodreads digital-other
22-Aug-2018 11:33:37 - DEBUG :: GRSync : grsync.py:grsync:649 : Sync Open to digital-other shelf complete
22-Aug-2018 11:33:37 - INFO :: GRSync : grsync.py:sync_to_gr:442 : 0 changes to digital-wanted shelf, 0 changes to Wanted from GoodReads, 0 changes to digital-other shelf, 0 changes to Owned from GoodReads

philborman commented 6 years ago

ok thanks, need to think about this. Couple of easy ways to fix it, but puzzled how it came about.

philborman commented 6 years ago

Oops, forgot to give the fixes...

  1. Use sqlite3 to run "delete from sync where label='digital-other'" so lazylibrarian thinks the shelf was empty last time it synced, or
  2. Use a different name for the digital-other shelf so lazylibrarian syncs to a new empty shelf, then delete the old shelf from goodreads.

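For anyone who prefers a script to the sqlite3 shell, the first option could look something like the sketch below. The table name `sync` and the `label` column come from the statement above; the `synclist` column, the `forget_shelf` helper and the in-memory database are assumptions made so the example runs on its own. Against a real install, stop lazylibrarian and back up the database file first.

```python
import sqlite3


def forget_shelf(conn, label):
    """Delete the stored sync record for one shelf; return rows removed."""
    cur = conn.execute("DELETE FROM sync WHERE label = ?", (label,))
    conn.commit()
    return cur.rowcount


# Stand-in database with an assumed, simplified schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sync (label TEXT, synclist TEXT)")
conn.executemany(
    "INSERT INTO sync VALUES (?, ?)",
    [("digital-other", "101,102"), ("digital-wanted", "201")],
)

deleted = forget_shelf(conn, "digital-other")
remaining = [row[0] for row in conn.execute("SELECT label FROM sync")]
print("deleted", deleted, "row(s); remaining labels:", remaining)
```

Passing the label as a bound parameter rather than pasting it into the SQL string avoids quoting problems if a shelf name contains an apostrophe.
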
philborman commented 5 years ago

Added code to remove stored details on new shelf