theotherp / nzbhydra2

Usenet meta search

[BUG] generated season search query -> no results from nzbfinder #835

Open pproba opened 1 year ago

pproba commented 1 year ago

Greetings!

I hope I can omit attaching the zip file with logs because I've already identified the root cause. If it matters, I'm using version 5.1.1 on Windows 11.

I was not getting any results from nzbfinder when searching for season 1 of a show, even though I was able to find individual episodes.

Then I had a look at the generated queries in the logs:

2023-01-28 10:53:31.464  INFO --- [http-nio-0.0.0.0-5] org.nzbhydra.searching.SearchWeb         : [ID: 29573, Host: <ip>] New search request: SearchRequest{source=INTERNAL, indexers=[NZB Finder], searchType=TVSEARCH, category=TV, offset=0, limit=100, minsize=50, maxsize=5000, identifiers={TVMAZE=<tvmazeid>, TVIMDB=<tvimdb>, TVDB=<tvdb>}, title=<show_name>, season=1}
2023-01-28 10:53:31.464 DEBUG --- [http-nio-0.0.0.0-5] o.n.searching.IndexerForSearchSelector   : [ID: 29573, Host: <ip>] Picking indexers out of 5
2023-01-28 10:53:31.464  INFO --- [http-nio-0.0.0.0-5] o.n.searching.IndexerForSearchSelector   : [ID: 29573, Host: <ip>] Not using newz-complex because it's not in selection [NZB Finder]
2023-01-28 10:53:31.465  INFO --- [http-nio-0.0.0.0-5] o.n.searching.IndexerForSearchSelector   : [ID: 29573, Host: <ip>] Not using NZBGeek because it's not in selection [NZB Finder]
2023-01-28 10:53:31.465  INFO --- [http-nio-0.0.0.0-5] o.n.searching.IndexerForSearchSelector   : [ID: 29573, Host: <ip>] Not using nzbplanet because it's not in selection [NZB Finder]
2023-01-28 10:53:31.465  INFO --- [http-nio-0.0.0.0-5] o.n.searching.IndexerForSearchSelector   : [ID: 29573, Host: <ip>] Not using Drunken Slug because it's not in selection [NZB Finder]
2023-01-28 10:53:31.465  INFO --- [http-nio-0.0.0.0-5] o.n.searching.IndexerForSearchSelector   : [ID: 29573, Host: <ip>] Selected 1 out of 5 indexers: NZB Finder
2023-01-28 10:53:31.465 DEBUG --- [http-nio-0.0.0.0-5] org.nzbhydra.searching.Searcher          : [ID: 29573, Host: <ip>] Going to call NZB Finder because their cache is exhausted
2023-01-28 10:53:31.466 DEBUG --- [  pool-38-thread-1] org.nzbhydra.indexers.QueryGenerator     : [ID: 29573, Host: <ip>] No query generation needed for NZB Finder. indexerDoesntSupportRequiredSearchType: false. indexerDoesntSupportAnyOfTheProvidedIds: false. queryGenerationPossible: true. queryGenerationEnabled: true. fallbackRequested: false
2023-01-28 10:53:31.466  INFO --- [  pool-38-thread-1] org.nzbhydra.indexers.Newznab            : [ID: 29573, Host: <ip>] NZB Finder: Calling https://nzbfinder.ws/api?apikey=<apikey>&t=tvsearch&extended=1&tvmazeid=<tvmazeid>&imdbid=<imdbid>&tvdbid=<tvdbid>&season=1&minsize=50&password=1&cat=5000&limit=1000&offset=0
2023-01-28 10:53:31.523 DEBUG --- [  pool-38-thread-1] org.nzbhydra.indexers.Newznab            : [ID: 29573, Host: <ip>] NZB Finder: Found 0 results which were already in the database and 0 new ones
2023-01-28 10:53:31.524  INFO --- [  pool-38-thread-1] org.nzbhydra.indexers.Newznab            : [ID: 29573, Host: <ip>] NZB Finder: Successfully executed search call in 56ms with 0 total results
2023-01-28 10:53:31.524 DEBUG --- [  pool-38-thread-1] org.nzbhydra.indexers.Newznab            : [ID: 29573, Host: <ip>] NZB Finder: Returning results 0-0 of 0 available (0 already rejected)
2023-01-28 10:53:31.524  INFO --- [  pool-38-thread-1] org.nzbhydra.indexers.Newznab            : [ID: 29573, Host: <ip>] NZB Finder: No results found for ID based search. Will do a fallback search using a generated query
2023-01-28 10:53:31.524 DEBUG --- [  pool-38-thread-1] org.nzbhydra.indexers.QueryGenerator     : [ID: 29573, Host: <ip>] Search request provided title <show_name>. Using that as query base.
2023-01-28 10:53:31.524 DEBUG --- [  pool-38-thread-1] org.nzbhydra.indexers.QueryGenerator     : [ID: 29573, Host: <ip>] Indexer does not support any of the supplied IDs or the requested search type. The following query was generated: <show_name>
2023-01-28 10:53:31.524  INFO --- [  pool-38-thread-1] org.nzbhydra.indexers.Newznab            : [ID: 29573, Host: <ip>] NZB Finder: Calling https://nzbfinder.ws/api?apikey=<apikey>&t=tvsearch&extended=1&q=<show_name>&season=1&minsize=50&password=1&cat=5000&limit=1000&offset=0
2023-01-28 10:53:32.016 DEBUG --- [  pool-38-thread-1] org.nzbhydra.indexers.Newznab            : [ID: 29573, Host: <ip>] NZB Finder: Found 0 results which were already in the database and 0 new ones
2023-01-28 10:53:32.016  INFO --- [  pool-38-thread-1] org.nzbhydra.indexers.Newznab            : [ID: 29573, Host: <ip>] NZB Finder: Successfully executed search call in 491ms with 0 total results
2023-01-28 10:53:32.017 DEBUG --- [  pool-38-thread-1] org.nzbhydra.indexers.Newznab            : [ID: 29573, Host: <ip>] NZB Finder: Returning results 0-0 of 0 available (0 already rejected)
2023-01-28 10:53:32.017 DEBUG --- [http-nio-0.0.0.0-5] org.nzbhydra.searching.Searcher          : [ID: 29573, Host: <ip>] All indexer caches exhausted
2023-01-28 10:53:32.017 DEBUG --- [http-nio-0.0.0.0-5] o.nzbhydra.searching.DuplicateDetector   : [ID: 29573, Host: <ip>] Duplicate detection for 0 search results found 0 duplicates
2023-01-28 10:53:32.017 DEBUG --- [http-nio-0.0.0.0-5] org.nzbhydra.searching.Searcher          : [ID: 29573, Host: <ip>] Will load all cached results
2023-01-28 10:53:32.018  INFO --- [http-nio-0.0.0.0-5] org.nzbhydra.searching.SearchWeb         : [ID: 29573, Host: <ip>] Web search took 553ms

This is the first (id based) query: https://nzbfinder.ws/api?apikey=<apikey>&t=tvsearch&extended=1&tvmazeid=<tvmazeid>&imdbid=<imdbid>&tvdbid=<tvdbid>&season=1&minsize=50&password=1&cat=5000&limit=1000&offset=0

It returned zero results because the episodes of the show I'm searching have been miscataloged (wrong show id).

And this is the fallback (generated) query: https://nzbfinder.ws/api?apikey=<apikey>&t=tvsearch&extended=1&q=<show_name>&season=1&minsize=50&password=1&cat=5000&limit=1000&offset=0

It also returned zero results, but it should've been able to find the miscataloged episodes.

While nzbfinder has no issues with a missing episode number in the ID-based query (I verified this with another show), the fallback query only returns results if the season identifier (e.g. 's01') is delimited (by spaces, periods, etc.), as would be the case for a season pack. For this show, there were no season packs.

If an episode wildcard '&ep=?' is added to the query, all matching episodes are returned. https://nzbfinder.ws/api?apikey=<apikey>&t=tvsearch&extended=1&q=<show_name>&season=1&ep=?&minsize=50&password=1&cat=5000&limit=1000&offset=0

In this case, I'm getting 719 results from nzbfinder.

So in my opinion, the generated query for nzbfinder should include '&ep=?' whenever no episode number is specified in the nzbhydra search.
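A minimal sketch of this proposal, not NZBHydra's actual query generator: the helper `build_fallback_url`, its `add_wildcard` flag and the `indexer.invalid` host are hypothetical names used only for illustration. It appends `ep=?` to a title-based season query when no episode number is given.

```python
from urllib.parse import urlencode


def build_fallback_url(base_url, api_key, title, season=None, episode=None,
                       add_wildcard=True):
    """Build a title-based tvsearch URL (hypothetical helper, illustration only)."""
    params = {"apikey": api_key, "t": "tvsearch", "extended": 1, "q": title}
    if season is not None:
        params["season"] = season
        if episode is not None:
            # A concrete episode was requested: pass it through unchanged.
            params["ep"] = episode
        elif add_wildcard:
            # Season search without an episode: add the wildcard from this thread.
            params["ep"] = "?"
    # safe="?" keeps the wildcard literal instead of encoding it as %3F.
    return f"{base_url}?{urlencode(params, safe='?')}"


print(build_fallback_url("https://indexer.invalid/api", "KEY", "Some Show", season=1))
```

The wildcard is an opt-in flag here because an indexer may not accept it; a caller would set it per indexer.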

I'll also check the behavior of my other indexers to see if nzbfinder is special in this regard or if this would be a potential improvement for all indexers. I'll add comments or edit the ticket then.

pproba commented 1 year ago

Update:

So in summary, only nzbfinder and drunkenslug would benefit from the addition of an episode wildcard. newz-complex, nzbgeek and nzbplanet neither like nor need the wildcard.

theotherp commented 1 year ago

Thanks for the detailed analysis. That's a pretty shitty implementation on the indexer's side but I'm kinda used to working around their idiosyncrasies.

theotherp commented 1 year ago

I cannot reproduce that. https://drunkenslug.com/api?apikey=APIKEY&t=tvsearch&q=lost&season=1 returns results.

pproba commented 1 year ago

It's not that obvious. All returned results have delimiters around the search term 's01'. So something like 'Lost.S01E04.1080p.BluRay.x265-RARBG' is not found. If you add '&ep=?' to the query, it's included in the results.

I'm getting 5454 results with ep=? and 320 results without it.
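One way to picture this behaviour, assuming the indexer splits release names on delimiters and only matches whole tokens (an assumption; the indexer's search configuration is not shown in this thread). The season pack name below is made up for contrast.

```python
import re


def tokens(release_name):
    # Split on anything that is not a letter or digit and lowercase the parts.
    return [t for t in re.split(r"[^0-9a-z]+", release_name.lower()) if t]


def matches_whole_token(release_name, term):
    # True only if the term appears as a complete, delimited token.
    return term.lower() in tokens(release_name)


episode = "Lost.S01E04.1080p.BluRay.x265-RARBG"
season_pack = "Lost.S01.1080p.BluRay.x265-RARBG"  # hypothetical release name

print(matches_whole_token(episode, "s01"))      # 's01e04' is one token, so no match
print(matches_whole_token(season_pack, "s01"))  # 's01' stands alone, so it matches
```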

theotherp commented 1 year ago

Right you are. You didn't by any chance ask one of the indexer's admins about this?

pproba commented 1 year ago

No, sorry. I've only tried to look for detailed (!) API documentation, but couldn't find anything.

theotherp commented 1 year ago

The thing is, the two result sets seem to be mutually exclusive: no result from one page turns up in the other. So I wonder which is more right? You might say the one with more results, but it looks a bit iffy.

I'll contact the DS admin.

pproba commented 1 year ago

Oh wow, you're right. I'd assume the combination of both would be the preferred result.
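A rough sketch of what combining both result sets could look like, assuming each result is a dict carrying a unique `guid` (a hypothetical shape; this is not how NZBHydra represents search results).

```python
def merge_results(*result_lists, key="guid"):
    """Merge result lists, keeping the first occurrence of each key."""
    merged = {}
    for results in result_lists:
        for item in results:
            # setdefault leaves an already seen entry untouched.
            merged.setdefault(item[key], item)
    return list(merged.values())


without_ep = [{"guid": "a", "title": "Show.S01.1080p"}]
with_wildcard = [
    {"guid": "b", "title": "Show.S01E01.1080p"},
    {"guid": "a", "title": "Show.S01.1080p"},
]

for item in merge_results(without_ep, with_wildcard):
    print(item["guid"], item["title"])
```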

theotherp commented 1 year ago

Admin response:

Did have some time to look at it. Without ep= it just shows 1 season pack for S01. Not many season packs on Usenet so most queries like that will show 0 results. That query is meant for looking at season packs IIRC.

I have no experience with NZBHydra, but AFAIK Sonarr works like this:

Adding the ep= is what Sonarr usually uses to look for a specific episode if it can't find season packs (which it almost never does). So if Sonarr can't find any season packs, it just queries each episode separately. It's been a while since I looked at it, though, so it might be different now. There's a chance that if I change the way this works, Sonarr might treat single episodes as season packs, and that's something nobody needs :-)

Adding that wildcard * or ? will apparently look for all episodes for that season.

Still doesn't make much sense to me...

pproba commented 1 year ago

So I just tried to verify what the admin said about the Sonarr behavior. I searched for an unaired season and disabled query generation in hydra. Here are the Sonarr queries:

Received external newznab API call: NewznabParameters{t=TVSEARCH, cat=[5000], imdbId=tt8289930, tvdbId=359913, tvmazeId=41074, season=5, offset=0, limit=100, raw=false, o=XML, attrs=[], extended=true, indexers=[]}

Received external newznab API call: NewznabParameters{t=TVSEARCH, q=Formel 1 Drive to Survive, cat=[5000], season=5, offset=0, limit=100, raw=false, o=XML, attrs=[], extended=true, indexers=[]}

Received external newznab API call: NewznabParameters{t=TVSEARCH, q=Formula 1 La Emocion de un Grand Prix, cat=[5000], season=5, offset=0, limit=100, raw=false, o=XML, attrs=[], extended=true, indexers=[]}

Received external newznab API call: NewznabParameters{t=TVSEARCH, q=Formula 1 Drive to Survive, cat=[5000], season=5, offset=0, limit=100, raw=false, o=XML, attrs=[], extended=true, indexers=[]}

Not a single result was returned to Sonarr.

So all it did was fall back to generated queries with the show name in different languages, but always including the season number without any episode attribute. It didn't start including wildcards. It also didn't start searching for individual episodes after receiving 0 results.

theotherp commented 1 year ago

I suggest you talk to them directly (and DS).

pproba commented 1 year ago

I've opened a sonarr reddit thread for now. How did you find the DS Admin's contact information in case I want to contact them? All I can see is a "Contact us" form.

theotherp commented 1 year ago

I used that.


DariusIII commented 1 year ago

This behavior is not an indexer's fault, but the way the underlying code works, both nZEDb (DS) and NNTmux (Nzbfinder, Tabula Rasa). I will change the behaviour in NNTmux but, just for note, nZEDb and NNTmux have worked this way for years, so I am puzzled how this popped up now. Have there been any changes that could trigger the bug in hydra and *arrs?

theotherp commented 1 year ago

No changes on my side. Might just be that @pproba is the first one to find it. Thanks for looking into it.

DariusIII commented 1 year ago

> It's not that obvious. All returned results have delimiters around the search term 's01'. So something like 'Lost.S01E04.1080p.BluRay.x265-RARBG' is not found. If you add '&ep=?' to the query, it's included in the results.
>
> I'm getting 5454 results with ep=? and 320 results without it.

If you have S04E01 in the search title, there's no need for season and ep, right? I know hydra and *arrs follow the newznab API, but that API has its own flaws.

pproba commented 1 year ago

> This behavior is not an indexer's fault, but the way the underlying code works, both nZEDb (DS) and NNTmux (Nzbfinder, Tabula Rasa). I will change the behaviour in NNTmux but, just for note, nZEDb and NNTmux have worked this way for years, so I am puzzled how this popped up now. Have there been any changes that could trigger the bug in hydra and *arrs?

Thanks for looking into this. Will you create a ticket for NNTmux? Then we could potentially use it (as 'leverage') to convince nZEDb to implement the same change?

Regarding recent changes: I'm not sure, but there was an idea floating around that Sonarr would start searching for single episodes if a season search was unsuccessful. This is not the case, at least not today. It might have been in previous versions.

Apart from that, what made me look into this behavior is probably a pretty rare scenario: The only indexer which had the releases I was looking for had the wrong show IDs attached to the episodes.

On top of that, without the logs from hydra I would've never been able to figure out what was going on. There are lots of threads on reddit where this behavior could've been the root cause, but it was never found due to insufficient knowledge/logs.

> If you have S04E01 in the search title, there's no need for season and ep, right? I know hydra and *arrs follow the newznab API, but that API has its own flaws.

I don't get the background of this question, but I'd agree. This only works for episode searches, though. I've experimented with manually generated season search queries that included both the show title and the season number, but I failed. It seemed like all search terms needed to be delimited in the results. It would require a regex like `s04(e[0-9]+)?` to get both season packs and individual episodes.
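To illustrate, here is a small check of that regex against a few made-up release names. The surrounding lookarounds are an added assumption so that the match has to sit between delimiters; the indexers themselves do not expose such a regex search.

```python
import re

# s04 optionally followed by an episode number, delimited on both sides.
SEASON_OR_EPISODE = re.compile(
    r"(?<![0-9a-z])s04(e[0-9]+)?(?![0-9a-z])", re.IGNORECASE
)

# Hypothetical release names for illustration.
names = [
    "Show.S04.1080p.WEB-GRP",      # season pack
    "Show.S04E01.720p.WEB-GRP",    # single episode of season 4
    "Show.S05E01.720p.WEB-GRP",    # different season
]

for name in names:
    print(name, bool(SEASON_OR_EPISODE.search(name)))
```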

DariusIII commented 1 year ago

I am the dev of NNTmux as well as one of the remaining devs of nZEDb. Unfortunately, nZEDb has had no updates in a looong time and is behind in so many things that setting up a dev environment is a PITA on its own. I will fix it in NNTmux and then will try to do it in nZEDb.

> I don't get the background of this question, but I'd agree. This only works for episode searches, though. I've experimented with manually generated season search queries that included both the show title and the season number, but I failed. It seemed like all search terms needed to be delimited in the results. It would require a regex like `s04(e[0-9]+)?` to get both season packs and individual episodes.

There is a regex that strips S0 from seasons, so if someone searches for season S01, S02, S03 it will be stripped to 1,2,3. That is why you cannot search for S01E04 together with season query as it will effectively become 1E04 which is meaningless. Same goes for E0, it gets stripped if ep= is used.
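The effect can be reproduced with a stand-in pattern. The expression below is an assumption that removes a leading 'S' (or 'E') plus zero padding; the actual regex used in nZEDb and NNTmux is not quoted in this thread.

```python
import re


def strip_season_prefix(value):
    # Hypothetical stand-in: drop a leading 'S' and any zero padding.
    return re.sub(r"^s0*", "", value, flags=re.IGNORECASE)


def strip_episode_prefix(value):
    # Same idea for the ep= parameter.
    return re.sub(r"^e0*", "", value, flags=re.IGNORECASE)


print(strip_season_prefix("S01"))     # plain season number survives as '1'
print(strip_season_prefix("S01E04"))  # combined form degrades to '1E04'
print(strip_episode_prefix("E04"))    # episode number survives as '4'
```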

ghost commented 1 year ago

So I checked some logs on the indexer's side, and Sonarr actually does this when clicking the magnifying glass for a season:

GET /api?t=tvsearch&cat=5010,5030,5040,5045,5060,5080,5090&extended=1&apikey=REMOVED&offset=0&limit=100&tvdbid=311809&imdbid=tt5541338&tvmazeid=11311&season=1
GET /api?t=tvsearch&cat=5010,5030,5040,5045,5060,5080,5090&extended=1&apikey=REMOVED&offset=100&limit=100&tvdbid=311809&imdbid=tt5541338&tvmazeid=11311&season=1
GET /api?t=tvsearch&cat=5010,5030,5040,5045,5060,5080,5090&extended=1&apikey=REMOVED&offset=200&limit=100&tvdbid=311809&imdbid=tt5541338&tvmazeid=11311&season=1

So you're right in saying it doesn't actually query the specific episodes separately, it just picks them separately from the same search result.

When you click a specific episode however it does the following:

GET /api?t=tvsearch&cat=5010,5030,5040,5045,5060,5080,5090&extended=1&apikey=REMOVED&offset=0&limit=100&q=SHOWNAME&season=2&ep=1

So I guess that this might have been one of those weird bugs that've been there forever as it only searches by show name directly if there are no results on the TVMaze, TMDB IDs etc.

pproba commented 1 year ago

Hi @DariusIII, any update on this? Do you track this issue in a github ticket? I couldn't find one in NNTmux/newznab-tmux.

DariusIII commented 1 year ago

@pproba I still haven't found the exact cause of this issue in my dev environment. I have an idea and was tracking it on the live server, but there were no queries like the ones you used, so I could not pinpoint it.

Edit: Opened an issue on NNTmux/newznab-tmux

DariusIII commented 1 year ago

I have pushed a possible fix for this issue: https://github.com/NNTmux/newznab-tmux/commit/9ef5ec8311fd938743e77ae5f9fa269917f2e256

DariusIII commented 1 year ago

Issue has been fixed by updating underlying search engine config (namely ManticoreSearch). It should return proper values now.

theotherp commented 1 year ago

Great to hear, thanks for the work.
