DobyTang / LazyLibrarian

This project isn't finished yet. Goal is to create a SickBeard, CouchPotato, Headphones-like application for ebooks. Headphones is used as a base, so there are still a lot of references to it.

Grabs wrong book when searching (correct author though) #265

Closed ayefly closed 8 years ago

ayefly commented 8 years ago

When searching for an ebook, it may or may not get the correct one. For example, when searching for Eric Flint 1633 novels, it usually finds just one and uses that for EVERY book in the series. In the error log below, you can see that it searches for 'Eric Flint 1635 A Parcel of Rogues' but gives a 100% match to '1633 - Eric Flint', then renames the downloaded files to 1633. I know the real book is online, because I can download it manually from nzbgeek.

2016-01-25 15:28:19 INFO Status set to "Wanted" for "1635 A Parcel of Rogues"

2016-01-25 15:28:19 INFO NZB Searching for one book

2016-01-25 15:28:19 DEBUG [IterateOverNewzNabSites] - Newznab0

2016-01-25 15:28:19 DEBUG [NewzNabPlus] searchType [book] with Host [https://api.nzbgeek.info] mode [nzb] using api [API REDACTED] for item [{'searchterm': 'Eric Flint 1635 A Parcel of Rogues', 'bookName': u'1635 A Parcel of Rogues', 'authorName': u'Eric Flint', 'bookid': u'25110899'}]

2016-01-25 15:28:19 DEBUG [NewzNabPlus] - nzb Search parameters set to {'author': u'Eric Flint', 'apikey': 'REDACTED', 't': 'book', 'cat': 7020, 'title': u'1635 A Parcel of Rogues'}

2016-01-25 15:28:21 DEBUG Parsing results from https://api.nzbgeek.info

2016-01-25 15:28:21 DEBUG [NewzNabPlus] - result fields from NZB are {'bookid': u'25110899', 'nzbmode': 'nzb', 'nzbdate': 'Tue, 20 Oct 2015 13:50:49 +0000', 'nzbtitle': 'Eric Flint - [Trails of Glory 01] - 1812- The Rivers of War (retail) (epub)', 'nzbsize': None, 'nzburl': 'REDACTED URL', 'nzbprov': 'https://api.nzbgeek.info'}

2016-01-25 15:28:21 DEBUG [NewzNabPlus] - result fields from NZB are {'bookid': u'25110899', 'nzbmode': 'nzb', 'nzbdate': 'Tue, 20 Oct 2015 13:50:50 +0000', 'nzbtitle': 'Eric Flint - [Trails of Glory 02] - 1824- The Arkansas War (retail) (epub)', 'nzbsize': None, 'nzburl': 'REDACTED URL', 'nzbprov': 'https://api.nzbgeek.info'}

2016-01-25 15:28:21 DEBUG [NewzNabPlus] - result fields from NZB are {'bookid': u'25110899', 'nzbmode': 'nzb', 'nzbdate': 'Tue, 08 Sep 2015 00:38:24 +0000', 'nzbtitle': 'Eric Flint - [Assiti Shards - Ring of Fire Press] - Second Chance Bird - Garrett W Vance (epub)', 'nzbsize': None, 'nzburl': 'REDACTED URL', 'nzbprov': 'https://api.nzbgeek.info'}

2016-01-25 15:28:21 DEBUG [NewzNabPlus] - result fields from NZB are {'bookid': u'25110899', 'nzbmode': 'nzb', 'nzbdate': 'Sun, 31 May 2015 23:38:02 +0000', 'nzbtitle': 'Flint, Eric - Ring of Fire 07 - 1635 - The Cannon Law (v5.0)', 'nzbsize': None, 'nzburl': 'REDACTED URL', 'nzbprov': 'https://api.nzbgeek.info'}

2016-01-25 15:28:21 DEBUG [NewzNabPlus] - result fields from NZB are {'bookid': u'25110899', 'nzbmode': 'nzb', 'nzbdate': 'Sun, 31 May 2015 23:38:02 +0000', 'nzbtitle': 'Flint, Eric - Ring of Fire 06 - 1634 - The Ram Rebellion [& Virginia DeMarce]', 'nzbsize': None, 'nzburl': 'REDACTED URL', 'nzbprov': 'https://api.nzbgeek.info'}

2016-01-25 15:28:21 DEBUG [NewzNabPlus] - result fields from NZB are {'bookid': u'25110899', 'nzbmode': 'nzb', 'nzbdate': 'Sun, 31 May 2015 23:38:02 +0000', 'nzbtitle': 'Flint, Eric - Ring of Fire 04 - 1634 - The Galileo Affair [& Dennis, Andrew]', 'nzbsize': None, 'nzburl': 'REDACTED URL', 'nzbprov': 'https://api.nzbgeek.info'}

2016-01-25 15:28:21 DEBUG [NewzNabPlus] - result fields from NZB are {'bookid': u'25110899', 'nzbmode': 'nzb', 'nzbdate': 'Sun, 09 Nov 2014 19:04:29 +0000', 'nzbtitle': 'Flint, Eric & Spoor, Ryk E - Boundary 02 - Threshold [epub]', 'nzbsize': None, 'nzburl': 'REDACTED URL', 'nzbprov': 'https://api.nzbgeek.info'}

2016-01-25 15:28:21 DEBUG [NewzNabPlus] - result fields from NZB are {'bookid': u'25110899', 'nzbmode': 'nzb', 'nzbdate': 'Sun, 09 Nov 2014 19:04:29 +0000', 'nzbtitle': 'Flint, Eric & Spoor, Ryk E - Boundary 01 - Boundary [epub]', 'nzbsize': None, 'nzburl': 'REDACTED URL', 'nzbprov': 'https://api.nzbgeek.info'}

2016-01-25 15:28:21 DEBUG [NewzNabPlus] - result fields from NZB are {'bookid': u'25110899', 'nzbmode': 'nzb', 'nzbdate': 'Thu, 20 Jun 2013 22:41:45 +0000', 'nzbtitle': '1632, Second Edition - Eric Flint', 'nzbsize': None, 'nzburl': 'REDACTED URL', 'nzbprov': 'https://api.nzbgeek.info'}

2016-01-25 15:28:21 DEBUG [NewzNabPlus] - result fields from NZB are {'bookid': u'25110899', 'nzbmode': 'nzb', 'nzbdate': 'Thu, 20 Jun 2013 22:41:45 +0000', 'nzbtitle': 'Ring of Fire II - Eric Flint', 'nzbsize': None, 'nzburl': 'REDACTED URL', 'nzbprov': 'https://api.nzbgeek.info'}

2016-01-25 15:28:21 DEBUG [NewzNabPlus] - result fields from NZB are {'bookid': u'25110899', 'nzbmode': 'nzb', 'nzbdate': 'Thu, 20 Jun 2013 22:41:45 +0000', 'nzbtitle': 'Ring of Fire - Eric Flint', 'nzbsize': None, 'nzburl': 'REDACTED URL', 'nzbprov': 'https://api.nzbgeek.info'}

2016-01-25 15:28:21 DEBUG [NewzNabPlus] - result fields from NZB are {'bookid': u'25110899', 'nzbmode': 'nzb', 'nzbdate': 'Thu, 20 Jun 2013 22:41:45 +0000', 'nzbtitle': '1633 - Eric Flint', 'nzbsize': None, 'nzburl': 'REDACTED URL', 'nzbprov': 'https://api.nzbgeek.info'}

2016-01-25 15:28:21 DEBUG Found 12 nzb at https://api.nzbgeek.info for: Eric Flint 1635 A Parcel of Rogues

2016-01-25 15:28:21 DEBUG NZB token set Match %: 45 for Eric Flint Trails Glory The Rivers War retail epub

2016-01-25 15:28:21 DEBUG NZB token set Match %: 45 for Eric Flint Trails Glory The Arkansas War retail epub

2016-01-25 15:28:21 DEBUG NZB token set Match %: 45 for Eric Flint Assiti Shards Ring Fire Press Second Chance Bird Garrett W Vance epub

2016-01-25 15:28:21 DEBUG NZB token set Match %: 48 for Flint Eric Ring Fire The Cannon Law v

2016-01-25 15:28:21 DEBUG NZB token set Match %: 45 for Flint Eric Ring Fire The Ram Rebellion & Virginia DeMarce

2016-01-25 15:28:21 DEBUG NZB token set Match %: 45 for Flint Eric Ring Fire The Galileo Affair & Dennis Andrew

2016-01-25 15:28:21 DEBUG NZB token set Match %: 50 for Flint Eric & Spoor Ryk E Boundary Threshold epub

2016-01-25 15:28:21 DEBUG NZB token set Match %: 51 for Flint Eric & Spoor Ryk E Boundary Boundary epub

2016-01-25 15:28:21 DEBUG NZB token set Match %: 57 for Second Edition Eric Flint

2016-01-25 15:28:21 DEBUG NZB token set Match %: 61 for Ring Fire II Eric Flint

2016-01-25 15:28:21 DEBUG NZB token set Match %: 67 for Ring Fire Eric Flint

2016-01-25 15:28:21 DEBUG NZB token set Match %: 100 for Eric Flint

2016-01-25 15:28:21 DEBUG Found NZB: 1633 - Eric Flint using book search

2016-01-25 15:28:21 DEBUG Request url for SABnzbd

2016-01-25 15:28:22 DEBUG Sending Nzbfile to SAB URL

2016-01-25 15:28:22 DEBUG Sending Nzbfile to SAB

2016-01-25 15:28:22 DEBUG Result text from SAB: ok

2016-01-25 15:28:22 INFO Eric Flint - 1635 A Parcel of Rogues LL.(25110899) sent to SAB successfully.

2016-01-25 15:28:22 DEBUG Nzbfile has been downloaded from REDACTED NZBGEEK LINK

2016-01-25 15:28:22 DEBUG Start processDir job, already scheduled

2016-01-25 15:28:22 INFO NZBSearch for Wanted items complete, found 1 book

2016-01-25 15:28:23 DEBUG NZB search requested for no books

philborman commented 8 years ago

Interesting, need to rethink the name matching code. It looks for Eric Flint 1635 A Parcel of Rogues and none of the early matches work as they have words in the titles that we don't want. Then it gets an nzb just called "Eric Flint" and thinks - hey, 100% of the words match! I will think about it...
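A minimal sketch of why a title reduced to just "Eric Flint" can score 100%. The log's "token set Match" wording suggests a token-set style comparison, but the actual matching code is not shown in this thread, so the function below is a simplified stand-in built on the standard library's `difflib`. The names `token_set_match` and `_ratio` are hypothetical.

```python
from difflib import SequenceMatcher


def _ratio(a, b):
    """Similarity of two strings as an integer percentage."""
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def token_set_match(wanted, candidate):
    """Simplified token-set comparison (an assumption, not the project's code).

    The shared words are compared against "shared words + leftovers" from
    each side, and the best of the three scores wins.
    """
    t1 = set(wanted.lower().split())
    t2 = set(candidate.lower().split())
    if not t1 or not t2:
        return 0
    common = " ".join(sorted(t1 & t2))
    combined1 = (common + " " + " ".join(sorted(t1 - t2))).strip()
    combined2 = (common + " " + " ".join(sorted(t2 - t1))).strip()
    return max(
        _ratio(common, combined1),
        _ratio(common, combined2),
        _ratio(combined1, combined2),
    )


wanted = "Eric Flint 1635 A Parcel of Rogues"

# The cleaned candidate has no leftover words, so "shared words" and
# "shared words + candidate leftovers" are the same string: a perfect score.
print(token_set_match(wanted, "Eric Flint"))  # 100

# Extra unwanted words in the candidate pull the score down.
print(token_set_match(wanted, "Ring Fire Eric Flint") < 100)  # True
```

With this kind of comparison, any candidate whose words are all contained in the search term scores 100, no matter how many wanted words it is missing.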

philborman commented 8 years ago

I've added a new section of checks to the name matching; it now checks the author and the book title separately rather than both as a set. I've also removed the code that stripped short words out of the check. It used to strip out "the a and to of for my in at with".

Not sure what the thinking was by the original author, might be a good reason and we'll fall foul of it later... I've only commented it out so can easily put it back, but it seems to work ok here.
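The separate author and title check described above could look roughly like this. It is a sketch only, not the code that was committed: the helper names, the word-coverage scoring, and the threshold of 90 are all assumptions made for illustration. Short words such as "a" and "of" are kept in the comparison, in line with the change described.

```python
import re


def words(text):
    """Lower-case alphanumeric words; punctuation is dropped, short words kept."""
    return re.findall(r"[a-z0-9]+", text.lower())


def coverage(wanted, candidate):
    """Percentage of the wanted words that appear in the candidate."""
    wanted_words = words(wanted)
    if not wanted_words:
        return 0
    candidate_words = set(words(candidate))
    hits = sum(1 for w in wanted_words if w in candidate_words)
    return 100 * hits // len(wanted_words)


def is_match(author, title, nzb_title, threshold=90):
    """Accept a result only if BOTH author and title are covered.

    The threshold value is a hypothetical default.
    """
    return (coverage(author, nzb_title) >= threshold
            and coverage(title, nzb_title) >= threshold)


author = "Eric Flint"
title = "1635 A Parcel of Rogues"

# The author matches, but none of the title words do, so it is rejected.
print(is_match(author, title, "1633 - Eric Flint"))  # False

# Both parts are present, so it is accepted.
print(is_match(author, title, "Eric Flint - 1635 A Parcel of Rogues (epub)"))  # True
```

Scoring the two parts independently means a result can no longer pass on the author name alone.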

ayefly commented 8 years ago

It found several of his books now, and downloaded the correct one. However, I am curious whether it's possible to have a set of keywords to reject, for example mp3 or audiobook. For one author, Michael J Sullivan, it only downloads audiobooks, and the files have audiobook in the title. Sometimes it's attached to the title, such as riyeria revelationsaudiobook.

It is also pulling ARC ebooks (advanced reader copies) for some books that have them available. A filter list would let you add ARC as well, to sort those out too.

philborman commented 8 years ago

Yes, could be useful. I will look into it.

philborman commented 8 years ago

OK, added a "reject list" so we can ignore titles with certain words in them. Just a bit more testing, and I should release later today if it's OK.
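A reject list along these lines might be sketched as follows. This is not the released implementation; the function name and both parameters are hypothetical. It separates two kinds of entry because the examples in this thread pull in opposite directions: "audiobook" can be fused onto another word, so it needs substring matching, while a short entry like "ARC" matched as a substring would also hit the "arc" inside "Parcel".

```python
import re


def is_rejected(nzb_title, substring_words=(), whole_words=()):
    """Return True if the title contains any rejected word.

    substring_words: matched anywhere in the title, even inside another word.
    whole_words: matched only as complete words.
    Both parameter names are assumptions made for this sketch.
    """
    lowered = nzb_title.lower()
    if any(word.lower() in lowered for word in substring_words):
        return True
    tokens = set(re.findall(r"[a-z0-9]+", lowered))
    return any(word.lower() in tokens for word in whole_words)


fused = ["audiobook", "mp3"]
exact = ["arc"]

# "audiobook" is attached to the previous word but is still caught.
print(is_rejected("Michael J Sullivan - riyeria revelationsaudiobook", fused, exact))  # True

# "Parcel" contains "arc", but only as part of a longer word, so it passes.
print(is_rejected("Eric Flint - 1635 A Parcel of Rogues (epub)", fused, exact))  # False

# A standalone ARC marker is rejected.
print(is_rejected("Some Author - Some Title (ARC) (epub)", fused, exact))  # True
```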