DobyTang / LazyLibrarian

This project isn't finished yet. Goal is to create a SickBeard, CouchPotato, Headphones-like application for ebooks. Headphones is used as a base, so there are still a lot of references to it.

Series display suggestion #1530

Closed alagave closed 5 years ago

alagave commented 6 years ago

I admit I'm new to this software and am still learning. Really impressed but I have a few questions which could lead to suggestions.

Why do books show up in Series display with 1/1 showing? I suppose it is possible for this to happen when an author says this is the first in a series but I see far too many for that to be the case.

This screen capture shows that every book in my collection plus those discovered by LL processing show up in the series display. Clearly something is amiss. (https://user-images.githubusercontent.com/41161338/43659718-74609c36-9722-11e8-96d6-54f2a9f9336a.jpg)

This screen capture shows how the sort order works. I would expect it to either sort based on how many I have or the number of books in the series - but it doesn't. (https://user-images.githubusercontent.com/41161338/43660030-6b6652f0-9723-11e8-8f9a-e361788415ac.jpg) (https://user-images.githubusercontent.com/41161338/43659957-32ec9786-9723-11e8-8887-6ebd074a21e4.jpg)

Now, if having 17K+ items to display is normal, there needs to be a better display mechanism. I paged it at 100/page and there are still 175 pages! The page selection allows first, last and a few in one direction. I suggest first, middle, last-selected and last-page buttons, along with the common previous and next buttons. You may notice the suggested buttons allow for a binary search - dividing the mass in half until the desired page is found. A field to enter the desired page number would allow this type of search as well. [I know it is not a real search but what else can you call it]
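To illustrate the "binary search" paging idea, here is a minimal Python sketch. It is not LazyLibrarian or DataTables code. The function `locate_page` and its `goes_before` callback are hypothetical names, and the callback stands in for the user glancing at a page and deciding whether the wanted entry sorts earlier.

```python
def locate_page(page_count, goes_before):
    """Find the page holding a target by repeatedly jumping to the middle page.

    goes_before(page) is a hypothetical callback returning True when the
    target sorts before the first entry shown on `page`.
    Returns (page, number_of_jumps).
    """
    low, high = 1, page_count
    jumps = 0
    while low < high:
        mid = (low + high + 1) // 2  # upper-middle page of the remaining range
        jumps += 1
        if goes_before(mid):
            high = mid - 1  # target is on an earlier page
        else:
            low = mid       # target is on this page or a later one
    return low, jumps


# Example: 175 pages, with the wanted entry sitting on page 120.
page, jumps = locate_page(175, lambda p: 120 < p)
print(page, jumps)
```

With 175 pages the halving takes at most 8 jumps, compared with stepping through pages a few at a time.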

And all this investigation leads me to want to categorize my series into genres and priorities. Those priorities may be my desire for completion or LL suggesting that a series collection is almost complete. I think it has already been noted elsewhere that LL knows, and thus can display differently, authors and hopefully series, that are still active and those that are finished. I know not as easy with series as dead authors but book reviews normally lament the end of a series.

I also notice that travel authors show up in their own series instead of the travel book series they are writing for. For example Lonely Planet, Rough Guides, Frommer's and Fodor's all use a multitude of authors for their books but they all proudly have their name on the cover while the author is buried on the catalog page. Not sure how to handle this issue except a special condition for travel guides (and whatever other books have this paradigm). I guess this is a metaindexing issue.

Running on QNAP 873E, installed QNAP pkg but now running GIT versions. Goodbooks under default interface. Repo: https://github.com/dobytang/lazylibrarian : Branch: master : Updated: Fri Jul 20 12:14:14 2018 Current Version: c09de92d55e2421325c003e0c79606e7c7fbcf4b : Latest Version: feb1d96ba33fe8c49a7ba99dba3276d53625d44f --- and yes, I know I'm behind. It's currently importing, on number 17,458 or who knows how many.

philborman commented 6 years ago

Wow that's some collection! Haven't tried it with so many books, so haven't come across some of these issues.

Your first screenshot querying the books in series display shows 3 series, Nancy Drew, Hardy Boys, The Shadow and the numbers look about right, they are all series with huge numbers of books listed at goodreads. You seem to have one book in the "Nancy Drew" series and one in "The Shadow". I wouldn't expect to see many 1/1 series, maybe something amiss as you say, but don't see it here.

The sort order for series is how near to completion you are (sort of) as a percentage so that you can see series nearing completion, but there is an extra sort for zeros, so having none of a 100 book series is seen as being further from completion than none of a 3 book series.
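As a rough illustration of that ordering (a sketch under assumptions, not the actual LazyLibrarian code), the sort could be expressed as a key function in Python. The name `completion_sort_key` and the sample series are made up for the example.

```python
def completion_sort_key(have, total):
    """Key for ranking series; a larger key means nearer to completion.

    Assumption: series of unknown size (total <= 0) sort last.
    """
    if total <= 0:
        return (-1.0, 0)
    if have == 0:
        # Extra sort for zeros: none of 3 ranks above none of 100
        return (0.0, -total)
    return (100.0 * have / total, 0)


# Hypothetical data: (series name, books owned, books in series)
series = [("A", 0, 100), ("B", 0, 3), ("C", 2, 4), ("D", 9, 10)]
ranked = sorted(series, key=lambda s: completion_sort_key(s[1], s[2]), reverse=True)
print([name for name, _, _ in ranked])
```

Here D (90%) comes first, then C (50%), then the two empty series with the 3 book series ahead of the 100 book one.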

The page buttons are generated by the table drawing library (datatables) not sure how much control we have over this (if any), will have a look.

Genres are difficult as goodreads doesn't include them in their api, no plans to add anything there at the moment.

The "Active" status in authors, series etc. does not mean that the series is active, just that lazylibrarian is actively looking for them. This means we will periodically scan for new books written by the author or added to the series. If we find any we will add them to the database. "Wanted" status means we will also try to download them, "Skipped" means we don't care, won't look. Useful for authors who only have one series you are interested in, just look for that series and skip everything else they write.

Not a great fan of special cases for genres, eg travel books, as we generally don't know the genres anyway. And it's particularly tricky with travel books like Lonely Planet, Rough Guides etc where the first guide was written by author A, then author B wrote the 2010 update, and the 2015 one was by authors C, D and E. Effectively they are different books, so should probably be listed that way, but we are reliant on the metadata we get from goodreads or googlebooks.

Series with dead authors are a different issue. Do you keep the series going, or do you say the later books were by a different author so should be a new series, or marked as "based on characters by"? Goodreads uses different methods for different series, no consistency it seems.

alagave commented 6 years ago

Thanks for the response. I see the logic in "nearness to completion" but will need to see what my dataset looks like after the scan is completed. Perhaps consider a way to limit display choices of "partially completed", "fully completed", and "not started". Or maybe I'm a special case with so many entries.

I'll let you know about the 1/1 cases after I update.

I see the logic concerning special cases of genre but perhaps this is a special case of a collection or a meta-collection. I just realized the Star Trek and Star Wars books would fall into that category - what appears to be a series written by multiple authors. I have not looked closely at Goodreads or Google API but I'm guessing the concept of a collection rather than a series is lacking?

As for dead authors, "Writing As", and series where the author's name in the title is not the same as the actual author (Tom Clancy), the real issue seems to be how much to override information provided by an API. We deal with such alias and legal name change issues in the financial and legal world by creating our own translation tables. And, of course, no two companies or agencies do it the same. Tough issue but it will return. Seems like books and CDs have more in common than CDs and video as far as cataloging goes. Lidarr faces similar issues with bands that were renamed or even artists that changed names.
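A translation table of this kind could be sketched in Python roughly as follows. This is only an illustration of the idea, not anything LazyLibrarian provides. The table name `AUTHOR_ALIASES` and the function `canonical_author` are hypothetical, and the two entries are well-known pen names used as examples.

```python
# Hypothetical user-maintained table: normalised alias -> preferred author name
AUTHOR_ALIASES = {
    "richard bachman": "Stephen King",
    "robert galbraith": "J.K. Rowling",
}


def canonical_author(name, aliases=AUTHOR_ALIASES):
    """Return the preferred name for an author, overriding the API value
    only when the user has listed an alias for it."""
    cleaned = " ".join(name.split())  # collapse stray whitespace
    return aliases.get(cleaned.casefold(), cleaned)


print(canonical_author("  richard   BACHMAN "))
print(canonical_author("Tom Clancy"))
```

Names without an entry pass through unchanged, so the API metadata is only overridden where the user has explicitly asked for it.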

After building and using some user fields in Calibre to assist in collection definition, I am probably trying too hard to leverage those hours spent bringing order to chaos.

Thanks again

philborman commented 6 years ago

Current version has "empty" (not started) and "non-empty" (have at least one, so partially completed) as series filters, could easily add "full" to complete the set, but this does rely in part on the user noticing sets/collections that aren't obviously sets. We try to recognise things like "books 1 to 3" and ignore them, but it's not perfect, and we might think the set is not full because you don't have an omnibus edition though you have all of the individual books.
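A minimal Python sketch of how such filters and the "books 1 to 3" check might fit together is below. It rests on assumptions and is not the project's real implementation: the regular expression, the function names and the sample titles are all invented for illustration.

```python
import re

# Assumed pattern for set/omnibus titles such as "Books 1 to 3" or "books 1-3"
SET_PATTERN = re.compile(r"\bbooks?\s+\d+\s*(?:to|-)\s*\d+\b", re.IGNORECASE)


def is_set(title):
    """Guess whether a title is a multi-book set rather than a single book."""
    return bool(SET_PATTERN.search(title))


def series_counts(books):
    """books is a list of (title, have) pairs; sets are ignored.
    Returns (books owned, books in series)."""
    singles = [(title, have) for title, have in books if not is_set(title)]
    owned = sum(1 for _, have in singles if have)
    return owned, len(singles)


def series_status(owned, total):
    """'empty', 'partial' or 'full'. The existing "non-empty" filter
    would cover both 'partial' and 'full'."""
    if owned == 0:
        return "empty"
    if total > 0 and owned >= total:
        return "full"
    return "partial"


# Hypothetical series: all three books owned, omnibus edition not owned
books = [
    ("Example Saga 1", True),
    ("Example Saga 2", True),
    ("Example Saga 3", True),
    ("Example Saga: Books 1 to 3", False),
]
owned, total = series_counts(books)
print(owned, total, series_status(owned, total))
```

In this example the missing omnibus is skipped, so the series counts as full. A set whose title does not match the pattern would still make the series look incomplete, which is the imperfection described above.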

You are right that goodreads api is quite limited in genre/collection concepts, googlebooks is even worse. There is some series info there, but only if you know in advance exactly what they called the series, no search available, only "list members of series xyz"

Dead authors, writing as, Tom Clancy, the Bourne series, Star Wars etc are a different can of worms, and although translation tables are a possible solution they would need heavy user intervention, and I think are beyond my programming skills, I'm just a hobbyist and fairly new to python. Would be neat if one of the providers out there did it for us but it doesn't look likely. Librarything WhatWork api was quite useful for some of this until they shut it down because of overuse/abuse (hope it wasn't us!)

philborman commented 6 years ago

Re the large number of pages to display, are you aware of the results filter box, top right, that will let you rapidly zoom in on an author or series rather than paging through - as long as you know roughly what you're looking for.

alagave commented 6 years ago

Yes. I was just browsing to get an idea of the completeness of the many series.

On August 4, 2018 11:41:56 AM philborman notifications@github.com wrote:

Re the large number of pages to display, are you aware of the results filter box, top right, that will let you rapidly zoom in on an author or series rather than paging through - as long as you know roughly what you're looking for.