Jackett / Jackett

API Support for your favorite torrent trackers
GNU General Public License v2.0

[desitorrents] Error while parsing row #9849

Closed. theworldhealer1 closed this issue 3 years ago

theworldhealer1 commented 3 years ago

I am getting the following error:

System.Exception: Error while parsing field=grabs, selector=th:nth-child(11), value=242020-10-12 17:03:25: Input string was not in a correct format.
   at Jackett.Common.Indexers.CardigannIndexer.PerformQuery(TorznabQuery query) in /home/vsts/work/1/s/src/Jackett.Common/Indexers/CardigannIndexer.cs:line 1610 
2020-10-15 11:36:17.7745 Error CardigannIndexer (desitorrents): Error while parsing row '
<tr>
        <th>
                <img src="/media/images/categories/Telugu Movies.png">
        </th>
        <th style="text-align: left; padding-left: 1em;">
                <a style="color:#e19804" href="/torrents.php?id=197592">blahblah</a>
        </th>
        <th style="white-space: nowrap; color: grey; font-size: 8pt">
                <a id="sort_by_user_hyper" style="cursor: pointer" title="Show only from this user">Chittitambi
                </a>
        </th>
        <th>
                <a id="bookmark_torrent_index" class="bookmark_torrent_index bookmark_197592">
                        <img src="/media/images/table/white-star.png">
                </a>
        </th>
        <th>
                <a href="/torrents.php?action=download&amp;id=197592&amp;authkey=">
                        <img src="/media/images/table/orange-download.png">
                </a>
        </th>
        <th>2</th>
        <th>7.88 GB</th>
        <th>25</th>
        <th>11</th>
        <th>0</th>
        <th>222020-10-12 16:44:00</th>
</tr>':

System.Exception: Error while parsing field=grabs, selector=th:nth-child(11), value=222020-10-12 16:44:00: Input string was not in a correct format.

theworldhealer1 commented 3 years ago

The spam filter is somehow flagging this as offensive, so I have put the log on Pastebin here. Please ignore the Pastebin warning message.

garfield69 commented 3 years ago

Hmmm. Looks like the web site is producing invalid HTML:

        <th>2</th>
        <th>7.88 GB</th>
        <th>25</th>
        <th>11</th>
        <th>0</th>
        <th>222020-10-12 16:44:00</th>

The columns are meant to be: comments, size, files, seeds, leech, completed, date. As you can see above, the completed and date columns have merged into one (22 run together with 2020-10-12 16:44:00), which is why the indexer is rejecting the row. However, the HTML you provided

<th>2</th><th>11.72 GB</th><th>1</th><th>8</th><th>0</th><th>9</th><th>2020-10-14 19:39:35<th>

suggests that someone at the site has already made a change that partially fixes it, although the date field now lacks a proper closing </th>.

Are you still getting this error when using the indexer?

I will add the new categories and the date column, then wait for you to tell me whether grabs (completed) is still generating an error for your indexer. Thanks.
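
For illustration only, here is a minimal Python sketch of how a merged completed/date cell such as 222020-10-12 16:44:00 could be pulled apart. It is not how Jackett or the Cardigann definition handles it; the helper name `split_grabs_and_date` is hypothetical, and it assumes the date always occupies the last 19 characters in YYYY-MM-DD HH:MM:SS form and that a missing count means 0.

```python
import re
from datetime import datetime

# Hypothetical helper, not part of Jackett. The date has a fixed shape, so
# whatever digits precede it are taken to be the "completed" (grabs) count.
FUSED_CELL = re.compile(r"^(\d*?)(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})$")


def split_grabs_and_date(cell: str):
    """Split a cell like '222020-10-12 16:44:00' into (22, datetime)."""
    match = FUSED_CELL.match(cell.strip())
    if match is None:
        raise ValueError(f"unexpected cell format: {cell!r}")
    grabs_text, date_text = match.groups()
    # Assumption: an empty prefix means the count was absent, treated as 0.
    grabs = int(grabs_text) if grabs_text else 0
    date = datetime.strptime(date_text, "%Y-%m-%d %H:%M:%S")
    return grabs, date


if __name__ == "__main__":
    for value in ("222020-10-12 16:44:00", "2020-10-14 19:39:35"):
        print(value, "->", split_grabs_and_date(value))
```

The lazy `\d*?` prefix lets the date pattern claim its four-digit year first, so the count is whatever digits remain in front of it.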

theworldhealer1 commented 3 years ago

OK, let me know once you have updated the yml file and I will test it.

garfield69 commented 3 years ago

I need to know first whether the existing indexer is still giving you errors ;-)

theworldhealer1 commented 3 years ago

I did a 'Test' with Jackett Version 0.16.1724.0 and it failed.

CardigannIndexer (desitorrents): Error while parsing row '
<tr>
    <th>
        <img src="/media/images/categories/Marathi Movies.png">
    </th>
    <th style="text-align: left; padding-left: 1em;">
        <a style="color:#e19804" href="/torrents.php?id=197585">blahblah</a>
    </th>
    <th style="white-space: nowrap; color: grey; font-size: 8pt">
        <a id="sort_by_user_hyper" style="cursor: pointer" title="Show only from this user">LOKIi
        </a>
    </th>
    <th>
        <a id="bookmark_torrent_index" class="bookmark_torrent_index bookmark_197585">
            <img src="/media/images/table/white-star.png">
        </a>
    </th>
    <th>
        <a href="/torrents.php?action=download&amp;id=197585&amp;authkey=">
            <img src="/media/images/table/orange-download.png">
        </a>
    </th>
    <th></th>
    <th>3.37 GB</th>
    <th>1</th>
    <th>4</th>
    <th>0</th>
    <th>112020-10-11 18:00:39</th>
</tr>':

System.Exception: Error while parsing field=grabs, selector=th:nth-child(11), value=112020-10-11 18:00:39: Input string was not in a correct format.
   at Jackett.Common.Indexers.CardigannIndexer.PerformQuery(TorznabQuery query) in /home/vsts/work/1/s/src/Jackett.Common/Indexers/CardigannIndexer.cs:line 1610

garfield69 commented 3 years ago

OK, thanks. I don't suppose you have an invite you could send to garfieldsixtynine -at- gmail.com? It would make the whole process of testing and validating my changes faster.

theworldhealer1 commented 3 years ago

You should have it.

garfield69 commented 3 years ago

Fantastic. Confirming registration and login was successful. I shall let you know when the fix is published.

garfield69 commented 3 years ago

That took longer than predicted; the indexer needed quite a bit of rework to bring it up to spec. I expect the grabs/date problem will eventually be corrected at the web site, which will break the indexer again, so keep an eye out, as we will probably need to revert/correct one more time. The fix should be available in the next Jackett, due out in about 3 hours. I'll confirm with a post here when it's published.

theworldhealer1 commented 3 years ago

Thanks for fixing this.

I copied the yml file to my existing installation, and this time the test passed.

The next thing I did was to check against a recent TV release (tvdbid 389680).

I tried another query for the whole season (tvdbid 384578)

I don't know whether this is a shortcoming of Sonarr or Jackett.

garfield69 commented 3 years ago

Since Scam 1992 The Harshad Mehta Story S01E02 does not exist on desitorrents, I hardly think you can expect the indexer to find it ;-) [image] shows some strange season/episode naming schemes, and since there is no consistency it is difficult to see whether we can massage these into a SxxExx standard.

[image] the poison S01 on the web site is in the movies category, so Sonarr is not going to find this mis-catalogued TV series.

garfield69 commented 3 years ago

Jackett 0.16.1757 for the official release ;-)

theworldhealer1 commented 3 years ago

Thanks for clarifying!

I saw this in the log:

CardigannIndexer (desitorrents): Error while parsing row '
<tr>
    <th>
        <img src="/media/images/categories/Classical Music.pngrn">
    </th>
    <th style="text-align: left; padding-left: 1em;">
        <a style="color:#e19804" href="/torrents.php?id=197466">Ragamala Vol 6. Bhimpalasi. 20xx VINYLrip 320KBPSCBR SWARINT MP3</a>
    </th>
    <th style="white-space: nowrap; color: grey; font-size: 8pt">
        <a id="sort_by_user_hyper" style="cursor: pointer" title="Show only from this user">swarint
        </a>
    </th>
    <th>
        <a id="bookmark_torrent_index" class="bookmark_torrent_index bookmark_197466">
            <img src="/media/images/table/white-star.png">
        </a>
    </th>
    <th>
        <a href="/torrents.php?action=download&amp;id=197466&amp;authkey=">
            <img src="/media/images/table/orange-download.png">
        </a>
    </th>
    <th>1</th>
    <th>217 MB</th>
    <th>10</th>
    <th>6</th>
    <th>0</th>
    <th>192020-10-05 17:36:55</th>
</tr>':

System.Exception: Error while parsing field=category, selector=th:first-child, value=<null>: None of the case selectors "[img[src$="Bollywood Movies.png"], 47],[img[src$="Bengali Movies.png"], 48],[img[src$="Tamil Films.png"], 49],[img[src$="Punjabi Movies.png"], 51],[img[src$="Marathi Movies.png"], 52],[img[src$="Malayalam Movies.png"], 53],[img[src$="Kannada Movies.png"], 54],[img[src$="Gujarati Movies.png"], 55],[img[src$="Foreign Movies.png"], 56],[img[src$="Pakistani Movies.png"], 57],[img[src$="Hollywood Movies.png"], 58],[img[src$="Telugu Movies.png"], 103],[img[src$="south-dubbed.png"], 104],[img[src$="docmentary.png"], 110],[img[src$="Bhojpuri Movies.png"], 117],[img[src$="Movie Packs.png"], 124],[img[src$="Dubbed Movies.png"], 128],[img[src$="Animated Movies.png"], 129],[img[src$="Short Films.png"], 140],[img[src$="Colors TV.png"], 59],[img[src$="Sony TV.png"], 60],[img[src$="AndTV.png"], 61],[img[src$="Star Plus.png"], 62],[img[src$="Zee TV.png"], 63],[img[src$="Life OK.png"], 97],[img[src$="Documentaries.png"], 98],[img[src$="sports.png"], 101],[img[src$="Others-png.png"], 102],[img[src$="Pak-Drama.png"], 113],[img[src$="TV Packs.png"], 125],[img[src$="Star Bharat.png"], 130],[img[src$="Sab TV.png"], 132],[img[src$="Hollywood TV.png"], 139],[img[src$="Music Videos.png"], 67],[img[src$="Hindi Soundtracks.png"], 68],[img[src$="Remix Music.png"], 70],[img[src$="Ghazal Music.png"], 71],[img[src$="Instrumental Music.png"], 72],[img[src$="Telugu Music.png"], 105],[img[src$="Tamil Music.png"], 106],[img[src$="Punjabi Music.png"], 107],[img[src$="Gujarati Music.png"], 108],[img[src$="Music Compilations.png"], 109],[img[src$="Kannada Music.png"], 118],[img[src$="Marathi Gaane.png"], 126],[img[src$="Lollywood Music.png"], 127],[img[src$="Classical Music.png"], 131],[img[src$="Desi Pop Music.png"], 134],[img[src$="Bengali Music.png"], 136],[img[src$="Malayalam Music.png"], 137],[img[src$="PC Games.png"], 78],[img[src$="Mac Games.png"], 79],[img[src$="IOS 
Games.png"], 80],[img[src$="Android Games.png"], 81],[img[src$="XBOX Games.png"], 83],[img[src$="Playstation Games.png"], 86],[img[src$="Magazines.png"], 92],[img[src$="Novels.png"], 93],[img[src$="Newspapers.png"], 95],[img[src$="AudioBooks.png"], 133],[img[src$="WWE.png"], 114],[img[src$="Cricket.png"], 115],[img[src$="Football.png"], 116],[img[src$="Adult Videos.png"], 89],[img[src$="Adult Pics.png"], 90],[img[src$="Web Series.png"], 135],[img[src$="no.png"], 30]" matched 
<th>
    <img src="/media/images/categories/Classical Music.pngrn">
</th>
   at Jackett.Common.Indexers.CardigannIndexer.PerformQuery(TorznabQuery query) in /home/vsts/work/1/s/src/Jackett.Common/Indexers/CardigannIndexer.cs:line 1610

garfield69 commented 3 years ago

Hmm. The web site's ajax search returns: <img src=\"\/media\/images\/categories\/Classical Music.png\r\n\" and we currently strip the \ characters out of the response, thus generating <img src="/media/images/categories/Classical Music.pngrn"

I may have to relax the category detection logic from ends-with Classical Music.png to contains Classical Music.png, and that should do the trick.

I'll update the indexer for the next release tomorrow.
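
To make the failure concrete, here is a small Python sketch of the behaviour described above. It is only a simulation under stated assumptions, not Jackett's actual code: the helpers `strip_backslashes` and `extract_src` are hypothetical, and the sample fragment is the one quoted above with a closing `>` added so it can be parsed. In CSS selector terms, ends-with is the `$=` form seen in the log and contains is `*=`.

```python
import re

# The escaped fragment as quoted from the ajax response (closing ">" assumed).
ESCAPED = r'<img src=\"\/media\/images\/categories\/Classical Music.png\r\n\">'


def strip_backslashes(fragment: str) -> str:
    """Simulate removing every backslash, which turns the escaped \\r\\n into 'rn'."""
    return fragment.replace("\\", "")


def extract_src(html: str) -> str:
    """Pull the src attribute value out of a single <img> tag."""
    match = re.search(r'src="([^"]*)"', html)
    if match is None:
        raise ValueError("no src attribute found")
    return match.group(1)


src = extract_src(strip_backslashes(ESCAPED))
name = "Classical Music.png"

ends_with = src.endswith(name)  # like img[src$="Classical Music.png"]
contains = name in src          # like img[src*="Classical Music.png"]

if __name__ == "__main__":
    print(repr(src))
    print("ends-with match:", ends_with)
    print("contains match:", contains)
```

The trailing rn left over from the stripped \r\n is what defeats the ends-with test, while a contains test still finds the category image name.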

garfield69 commented 3 years ago

Jackett 0.16.1771

theworldhealer1 commented 3 years ago

The Desitorrents website was recently revamped, and Jackett no longer works with it. Could you please check it?

garfield69 commented 3 years ago

Upgrade your Jackett to v0.18.329, then use your Jackett dashboard to edit the desitorrents config and update it. #11904