PsychedelicShayna / tixati-python-api-cli

A Python API and CLI combo for the Tixati torrent client's webserver interface.
GNU General Public License v3.0
12 stars 0 forks

Tixati API cannot Parse the html code using RegEx on Windows #1

Closed zibranpython closed 2 years ago

zibranpython commented 2 years ago

Thanks for this API.

But I have been having issues with this on Windows. I tried to figure out what the issue is, but I can't seem to find it. I have Tixati installed on both a Linux machine and a Windows machine and, as given in the documentation, I added transfersscrape.html to the web interface's HTML templates. tixati-python-api-cli works fine on Linux, whereas it cannot get the list of all the downloads on Windows. I tried to manually match the TRANSFERS_PAGE_HTML_SCRAPER pattern: on Windows it returns an empty list, whereas on Linux it works fine. Also, I can add downloads using the API on both Linux and Windows.

Attaching the HTML received from the Windows and Linux machines after the GET request to the Tixati web interface:

html_linux.txt html_win.txt

PsychedelicShayna commented 2 years ago

Sorry for the delayed reply. I see why the RegEx isn't matching on the HTML that's being returned on Windows, after looking at the diff of both files:

(Linux)

<       <tr class="seeding_even">
...

(Windows)

>       <tr class="$statusclass_alt$">
...

As you can see, $statusclass_alt$ isn't being resolved into what the class name should be in this case: seeding_even. This seems to be the case for all of the <tr> table row class names in the HTML being returned by your Windows instance, and the RegEx doesn't expect this as a potential class name, it expects the following:

<tr class=\"(downloading|complete|seeding|offline|queued|standbyseed)_(?:odd|even)\">

I'm not the creator of the original HTML document, but there doesn't seem to be any JavaScript in it that would perform this sort of variable substitution, and it isn't a standard HTML feature (as far as I'm aware). My guess is that the Tixati webserver is responsible for filling in these variables, like some form of context-dependent macro. Your Windows Tixati webserver instance isn't performing this substitution, and that's what breaks the RegEx.

I know that this isn't an issue with the Windows version itself, since Windows is the platform I originally wrote this API for and it works just fine on my end. This might be a version issue: you could be running an outdated version on Windows that doesn't support this variable substitution feature; older versions of Tixati might use different variable names that the HTML document wasn't designed for; or you might be running the latest version, and it's the HTML document that's outdated because the variable names changed in a recent update that I'm not aware of.

Ensure that you're running the latest version of Tixati on Windows, and if you already are, or updating doesn't change anything, please supply the version numbers of both of your Tixati instances.
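
The mismatch described in this comment can be reproduced in isolation. The sketch below uses only the row-class fragment quoted above, not the full TRANSFERS_PAGE_HTML_SCRAPER pattern from tixati_api.py, and the names ROW_CLASS, linux_row and windows_row are hypothetical, chosen for illustration.

```python
import re

# Only the <tr> class fragment quoted in the comment above; the real
# TRANSFERS_PAGE_HTML_SCRAPER pattern in tixati_api.py is larger and is
# not reproduced here.
ROW_CLASS = re.compile(
    r'<tr class="(downloading|complete|seeding|offline|queued|standbyseed)_(?:odd|even)">'
)

# One row from each attached file, as shown in the diff.
linux_row = '<tr class="seeding_even">'
windows_row = '<tr class="$statusclass_alt$">'

# With a single capture group, findall returns the captured status names.
linux_matches = ROW_CLASS.findall(linux_row)      # ['seeding']
windows_matches = ROW_CLASS.findall(windows_row)  # []

print("Linux:", linux_matches)
print("Windows:", windows_matches)
```

The Windows row yields an empty list because the literal text $statusclass_alt$ is not one of the status names the pattern accepts.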

zibranpython commented 2 years ago

Thank you for the reply. I understand now what the issue is. So suppose I add code to replace <tr class="$statusclass_alt$"> with <tr class="seeding_even"> in the received HTML source; then I think it should work. I'll try it and let you know. Thank you again.

Regards

zibranpython commented 2 years ago

I just changed

regex_results = self.TRANSFERS_PAGE_HTML_SCRAPER.findall(response.content.decode())

to

regex_results = self.TRANSFERS_PAGE_HTML_SCRAPER.findall(response.content.decode().replace('<tr class="$statusclass_alt$">', '<tr class="seeding_even">'))

on line 37 of tixati_api.py, and it worked. Thank you.

Edit: I am using Windows 8 with the latest version of Tixati.

PsychedelicShayna commented 2 years ago

This isn't a solution, because $statusclass_alt$ doesn't have a fixed value; it's supposed to be replaced with the status of that particular transfer! Instances of $statusclass_alt$ aren't guaranteed to be seeding_even. It's seeding_even in that particular case because the transfer is seeding, but it could be complete, offline, queued, standbyseed, etc. The Tixati webserver is supposed to replace $statusclass_alt$ with the status of that specific transfer, which it isn't doing. Manually replacing $statusclass_alt$ with a fixed status like seeding_even gives the API no way to know the status of the transfer: all transfers will now appear to be seeding, even those which aren't. It also doesn't change the fact that the Tixati webserver isn't doing its job!

You're not going to get the full functionality of the API like this; I would only do this as a last resort if nothing else works. Please, I need the version numbers of both your Windows and Linux instances of Tixati so that I can compare them and have a better chance at reproducing the error on my end.
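
A small sketch of why the fixed-string replacement loses information, and of one possible alternative that fails loudly instead of guessing. It assumes only the row-class fragment quoted earlier in the thread; the helper names statuses_with_workaround and statuses_strict are hypothetical and are not part of tixati_api.py.

```python
import re
from typing import List

# Row-class fragment quoted earlier in the thread (not the full scraper pattern).
ROW_CLASS = re.compile(
    r'<tr class="(downloading|complete|seeding|offline|queued|standbyseed)_(?:odd|even)">'
)
PLACEHOLDER_ROW = '<tr class="$statusclass_alt$">'


def statuses_with_workaround(html: str) -> List[str]:
    """Apply the fixed-string replacement, then scrape the row statuses."""
    patched = html.replace(PLACEHOLDER_ROW, '<tr class="seeding_even">')
    return ROW_CLASS.findall(patched)


def statuses_strict(html: str) -> List[str]:
    """Hypothetical alternative: refuse to guess when the placeholder is present."""
    if PLACEHOLDER_ROW in html:
        raise ValueError(
            "Tixati did not resolve $statusclass_alt$; transfer statuses are unknown"
        )
    return ROW_CLASS.findall(html)


# Three rows whose real statuses are unknown because nothing was substituted.
unresolved_page = "\n".join([PLACEHOLDER_ROW] * 3)

# Every row is reported as seeding, whatever the transfers are actually doing.
workaround_result = statuses_with_workaround(unresolved_page)
print(workaround_result)
```

With the workaround, all three rows come back as seeding, while statuses_strict raises ValueError on the same input, which makes the missing substitution visible to the caller.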

PsychedelicShayna commented 2 years ago

I just checked the latest version of Tixati on Windows, and major changes seem to have been made to the way the web interface works. I'm going to have to make some heavy modifications to transfersscrape.html, and possibly the API as well, for it to be compatible with the newer versions. As of right now, I know that version 2.59 works as intended, so you can either downgrade to that version (or the version you're running on Linux) or wait for me to update transfersscrape.html to be compatible with the latest version of Tixati.

PsychedelicShayna commented 2 years ago

Alright, it turns out that the fix was quite simple; no heavy modifications required! I've updated transfersscrape.html and tixati_api.py to work with the latest version of Tixati. Please download the updated versions of both files from the repository and confirm that it works!

zibranpython commented 2 years ago

Thank you, and apologies for the delayed response. I'm downloading the files and will let you know how it goes.

PsychedelicShayna commented 2 years ago

@zibranpython Can you confirm that the issue has indeed been solved? I'm going to close the issue soon.

zibranpython commented 2 years ago

Yes, it is working.

Apologies for the late reply.

PsychedelicShayna commented 2 years ago

Awesome! Thank you!