Closed Code-Slave closed 5 years ago
Could be done as a popup in config like the job_status button?
yes, unless you want to link in some way, say to a list of all books with no description etc
Basic stats popup added. Most of the lists are available from the api already, I think.
Thanks, need to check those numbers to find out the optimal configs.
Cache stats and sleep times are only relevant on a libraryscan, and show how efficient the caching is. The interesting bits are...
99 authors overdue = their cached data is older than the config setting. If you run lazylibrarian 24/7 this should reduce to zero over time.
No ISBN and No Lang maybe don't matter, not all books have that info, but often if that data is missing we are missing other data too, probably incomplete info from goodreads.
No Description is more of a puzzle. Seems goodreads don't always include that info in the author's list of books any more. I think they used to almost always have the info, now about 1 in 3 of my library is missing a description. Need to look into what we can do about it.
7 blank series = 7 series where we have a series title but no list of books in the series.
5 blank authors = same, we have the author details but no list of books for them.
Magazines: you seem to have 1 issue, but no magazine titles and no empty titles, so it looks like there is a magazine issue left in the database where the parent was deleted?
Seems the missing book data is a terms-of-service issue, they can serve the data over html but not api, depends on where they sourced the info. It's described here... https://www.goodreads.com/topic/show/400976-book-blurb
I ticked "Increase delay for previously failed searches" in the config page a few weeks ago, which is most probably why it shows overdue! I went back to Manage eBooks and changed the status of "Wanted" to "No Delay". Do you think this will reset the delay and look up those overdue in the next search job?
The "increase delay" setting is for books we searched for but couldn't find, ie books marked "Wanted" that the search tasks didn't find for download. If they couldn't find them today they are unlikely to find them tomorrow, so we increase the delay and only search for that book every Nth time. Details in Manage page, eg delay 3/5 means we only search every 5th time and we have skipped the last 3. If the book still isn't found when we next search, the 5 increases to 6 so the delay gets longer each time.
Resetting to "No Delay" will mean the book is searched for on the next run but is only temporary, so if the book isn't found you will try again with a small delay. If you turn "increase delay" off we don't check the counters and search every time.
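The counter behaviour described above can be sketched roughly as follows. This is only an illustration, not LazyLibrarian's actual code: `SearchDelay`, `is_due` and `after_run` are hypothetical names, and the exact point at which the counters reset is an assumption.

```python
from dataclasses import dataclass


@dataclass
class SearchDelay:
    skipped: int = 0   # runs skipped since the last real search (the "3" in 3/5)
    interval: int = 0  # search on every Nth run (the "5" in 3/5); 0 means "No Delay"


def is_due(delay: SearchDelay, increase_delay: bool = True) -> bool:
    """Decide whether a Wanted book should be searched for on this run."""
    if not increase_delay:
        return True  # setting turned off: counters are ignored, search every time
    return delay.skipped + 1 >= delay.interval


def after_run(delay: SearchDelay, searched: bool, found: bool) -> SearchDelay:
    """Return the updated counters once a search run has finished."""
    if not searched:
        # This run was skipped, so count it.
        return SearchDelay(delay.skipped + 1, delay.interval)
    if found:
        return SearchDelay(0, 0)
    # Searched but still not found: start counting again with a longer interval.
    return SearchDelay(0, delay.interval + 1)
```

With these assumptions a book at 3/5 is skipped once more, searched on the following run, and moves to 0/6 if it is still not found. Resetting to "No Delay" corresponds to 0/0, which is due on the next run and picks up a small delay again if that search fails.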
"Authors overdue" is where your cache expiry is, say, 30 days, and you have 99 authors whose book lists are older than this. We refresh one author every so often, depending on how many authors you have and the cache expiry. Details of who is next and when are in the config "Job Status" button.
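As a rough illustration of the "overdue" count, here is a minimal sketch. The function names and the data layout are hypothetical, and spreading the refreshes evenly over the expiry period is an assumption about how "every so often" might be worked out, not the real scheduler.

```python
from datetime import datetime, timedelta


def overdue_authors(last_updated, cache_days, now):
    """Return authors whose cached book list is older than the expiry, oldest first."""
    cutoff = now - timedelta(days=cache_days)
    stale = [name for name, stamp in last_updated.items() if stamp < cutoff]
    return sorted(stale, key=lambda name: last_updated[name])


def refresh_interval(total_authors, cache_days):
    """Assumed spacing: refresh every author once per expiry period, one at a time."""
    if total_authors == 0:
        return timedelta(days=cache_days)
    return timedelta(days=cache_days) / total_authors
```

The first entry returned by `overdue_authors` would be the next one to refresh, and the list shrinks to empty if the program runs continuously.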
I tried the API through the browser with some of the books I have in LL without a description, but got an empty description tag. I guess, as you said, it's to do with TOS from 3rd parties. https://www.goodreads.com/book/show/20446068-quicklet-on-geoffrey-a-moore-s-crossing-the-chasm?key=XXXXXXXX&format=xml
https://www.goodreads.com/book/show/24964462.how-to-govern-anything?key=XXXXXXXXX&format=xml
Returned::::
<description/>
Is it possible to include the ability to modify the description similar to the cover when I click on the "manual" http://abcd:5299/editBook?bookid=12345?
Possible, I tried that a while ago but html formatting in the description was causing issues. We now have a popup that can display though, so might take another look at it.
Another option would be an api call to update description
Thanks for the continuous support, really appreciate it. The popup is only for display, right? What about the API call to update the description, what command should I use? I tried help but I can't narrow it down to the exact command.
Yes the popup is only for display, but I can copy the way they edit the raw text, I think. There is no api call for this yet, but it would be easier to write an api call than an editor/popup, what do you think, would an api call be enough?
The API call will be ideal only if the description is available from Goodreads, I presume. Can the API use alternative sources such as LibraryThing, using the provided API key, to look up the description?
I see a popup with pagination for selected library items with missing details as more suitable for fixing tens, but not hundreds, of books. Perhaps something similar to how you currently provide an option to change the cover using GR, Google ISBN etc.
Needs thinking about. Have not found a good way of getting the missing info yet. We could page-scrape the html from goodreads but it's against their tos. Could maybe use librarything and google instead.
Page scraping is going to be painful to maintain considering that GUI changes are unavoidable. I did some web page scraping about 20 years ago and I understand the amount of effort required to keep it functional. You can add the description fetched initially to the editbook popup http://abcd:5299/editBook?bookid=xxxxx and give the user an option to select the available description by clicking either Goodreads, Google ISBN and so on, similar to the way covers are manually updated. Ultimately, the end user can copy the description from somewhere else and paste it into the description section, similar to how Calibre-Web enables users to modify the book info as part of the upload feature (https://github.com/janeczku/calibre-web/blob/master/cps/uploader.py)
I checked the Google API call. There's one book for which neither Google nor GR returns a description using the URL below. Most of the books I checked return a JSON response including the description. https://www.googleapis.com/books/v1/volumes?q=isbn:1614641420
That's very useful, thanks. I think I might just use the googleapis link for now, only if there is no description from goodreads. Maybe add an editor at a later date, but any description is better than none!
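A minimal sketch of that fallback, assuming the googleapis response has already been fetched as a JSON string. The helper names are hypothetical; the only response fields relied on are `items`, `volumeInfo` and `description`.

```python
import json


def description_from_google(payload):
    """Pull the first non-empty description out of a volumes?q=isbn:... response."""
    data = json.loads(payload)
    for item in data.get("items", []):
        text = item.get("volumeInfo", {}).get("description", "")
        if text and text.strip():
            return text.strip()
    return None  # no items, or none of them carried a description


def pick_description(goodreads_desc, google_payload):
    """Prefer the goodreads description; only fall back to google when it is empty."""
    if goodreads_desc and goodreads_desc.strip():
        return goodreads_desc.strip()
    if google_payload:
        return description_from_google(google_payload)
    return None
```

Returning `None` when neither source has a description leaves room for a manual edit later.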
I agree about using different APIs. Manual is a perfect place to edit. This is where I was going with stats: show me all books with no desc etc.
Couple of problems. Googlebooks api keeps giving me 403: forbidden errors. Seem to get useful results for a while then they say "dailyLimitExceeded", limit is 1000 hits per day, so we will have to drip-feed the updates. Maybe only do it on a scheduled author refresh, and keep track of the failed state so we stop until the counter resets.
Other problem is we would need an editor widget for the manual edit page if we want to edit the book description. We only have a line editor at the moment and it doesn't like html which many page descriptions use.
We should be tracking the 403 errors now and not trying googlebooks until the daily quota is reset. It's a bit complicated as they reset them all at midnight pacific time so we have to calculate the time difference and I'm not sure if that's pacific time with/without dst ?
I have also added a basic editor widget for the book descriptions. You can cut/paste from other pages and include html
Thanks for the update, I can see the description editor and looks great.
In regard to Google API, the example I shared doesn't use a key so it might be using IP address filter to limit the daily requests.
As for the time/dst, you can use https://www.programmableweb.com/api/worldtime "API can also return information on whether a time zone is currently in Daylight Savings Time (DST), when DST starts and ends, and the UTC offset". You can store the requests count per date and check if date_last = date_current AND API_Requests <1000
Hope this may help...
If you don't give an api key google uses a much lower limit (ip based, maybe 100 per day, not sure). The error message says "Daily Limit Exceeded. The quota will be reset at midnight Pacific Time (PT)" but I don't know if that's PDT or PST, assume PST?
The current code just looks for the 403 error and blocks until next PST midnight. No point in counting requests as we might not be the only program calling google
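The blocking logic could look something like the sketch below. It is not the actual implementation: `GoogleQuotaGate` and `next_pst_midnight` are hypothetical names, and it uses a fixed UTC-8 offset because of the PST assumption above. If the real reset follows PDT in summer, this waits about an hour longer than necessary, which errs on the safe side.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset: assumes "midnight Pacific Time" means PST (UTC-8) all year round.
PST = timezone(timedelta(hours=-8), "PST")


def next_pst_midnight(now_utc):
    """Return the next midnight PST after now_utc, expressed in UTC."""
    local = now_utc.astimezone(PST)
    tomorrow = (local + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return tomorrow.astimezone(timezone.utc)


class GoogleQuotaGate:
    """Stop calling googlebooks after a 403 until the quota should have reset."""

    def __init__(self):
        self.blocked_until = None

    def record_status(self, status, now_utc):
        if status == 403:
            self.blocked_until = next_pst_midnight(now_utc)

    def allowed(self, now_utc):
        return self.blocked_until is None or now_utc >= self.blocked_until
```

Keying off the 403 rather than a local request counter means the gate still works when other programs share the same quota.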
Seems google also blocks you if you are behind a vpn and it can't determine your geolocation, but they let you specify country=US or whatever, so added a config option for that.
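For illustration, the request URL with the optional key and country might be built like this. `volumes_url` is a hypothetical helper and the way the options are passed in is an assumption; the base URL and the `isbn:` query are the ones used earlier in this thread.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.googleapis.com/books/v1/volumes"


def volumes_url(isbn, api_key=None, country=None):
    """Build an ISBN lookup URL, adding country and key only when configured."""
    params = {"q": "isbn:" + isbn}
    if country:
        params["country"] = country  # e.g. "US", for when geolocation fails behind a vpn
    if api_key:
        params["key"] = api_key  # without a key google applies a much lower limit
    # safe=":" keeps the colon in "isbn:..." readable instead of encoding it as %3A
    return BASE_URL + "?" + urlencode(params, safe=":")
```

Leaving both options unset reproduces the keyless URL shared earlier.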
Have a stats page off the main menu or in config with counts of various things:
Books: books wanted, books ignored, books with no desc
Authors: authors ignored, authors with no metadata, authors with no books
Same for audio
Mags (maybe not much else as there isn't a lot of info for a mag)