qbittorrent / qBittorrent

qBittorrent BitTorrent client
https://www.qbittorrent.org
Other
28.32k stars 3.98k forks source link

qBT does not call the "torrent_download" method for any search plugin #5026

Closed aleqx closed 8 years ago

aleqx commented 8 years ago

I'm writing a custom search plugin and I've successfully implemented the search method and can get the search results correctly into qBittorrent. So that part is fine.

The site I'm getting results from requires me to go to a 2nd html page to get the .torrent file or magnet: link, which I must parse.

The wiki says I can use the download_torrent method for that, but when I double click on a search result in qBittorrent for this plugin, the download_torrent method is not called at all.

EDIT: In fact, nova2dl.py is not invoked at all by qBT, which is where the download_torrent method is called ... looks like qBT downloads the 1st entry in the info string directly, instead of via nova2dl.py.

I'm using Windows,Python 2.7.11, qBT 3.3.3

What am I missing?

aleqx commented 8 years ago

I just looked at the sources and, sadly, I was right ... in searchwidget.cpp#L370 i see that SearchWidget::on_downloadButton_clicked() calls downloadTorrent(torrentUrl) which in turn adds the url as-is, instead of calling Python on nova2dl.py ...

So nova2dl.py and the download_torrent method in the Python plugins are never called for any of the plugins, including the official ones. It also looks like I'm the first one to actually attempt to use the download_torrent method (?)

I'd be grateful if you could fix this.

zeule commented 8 years ago

Description of this PR says it was an intentional change

aleqx commented 8 years ago

That's a real shame as it completely destroys the ability to search and get results from websites that do not offer a direct download to magnet/torrent in the search results or that require cookie auth to do so ... like rarbg. I didn't see a good reason in that PR for ditching nova2dl, but there is a clear loss of useful functionality (in fact, the PR indicates no reason at all, unless I'm missing something).

The documentation is also wrong and misleading regarding the download_torrent class method.

Would the devs PLEASE consider re-enabling this?

The ability to search and write custom search plugins is the reason I switched from uTorrent. Going back to qBT 3.2.5 for now, but I hope you'll bring this back.

ngosang commented 8 years ago

@DoumanAsh ping.

DoumanAsh commented 8 years ago

I'm not really sure what i can say about it... It is @glassez commit and @sledgehammer999 agreed to merge it.

glassez commented 8 years ago

websites that do not offer a direct download to magnet/torrent in the search results

Please describe in details what do you mean.

I didn't see a good reason in that PR for ditching nova2dl, but there is a clear loss of useful functionality (in fact, the PR indicates no reason at all, unless I'm missing something).

I think we can implement all missing downloading features with builtin download manager. @aleqx What exactly do you miss?

aleqx commented 8 years ago

websites that do not offer a direct download to magnet/torrent in the search results

Please describe in details what do you mean.

Pretty much what it says. The url from the search() method of the python class is not a torrent or magnet link, but a link to a separate page that needs to be fetched and contents to be parsed in order to get the .torrent or magnet link. I also gave an example: rarbg.

This is one reason why the download_torrent() method is needed. Another is cookie authentication. The wiki already mentions these.

I think we can implement all missing downloading features with builtin download manager.

I'm not sure you can. You must allow the plugin writer to implement a custom download_torrent() method to get you the .torrent or magnet link into qBT when needed (e.g. rarbg). Please don't ask plugin writers to write C++ code for all this. The download_torrent() method must be part of the same Python class because the search() method may have setup cookie auth which is needed in download_torrent() (e.g. this would be needed for a rarbg plugin).

The PR linked reads as if you guys just wanted to get rid of nova2dl (for some reason) but weren't quite aware of the consequences :) ... I don't quite understand the decision since python is still called for searching, it's not like the code was cleaned of Python.

zeule commented 8 years ago

You must allow the plugin writer to implement a custom download_torrent() method to get you the .torrent or magnet link into qBT when needed

+1, this is core idea of a search plugin: encapsulate communication with a search engine.

glassez commented 8 years ago

I also gave an example: rarbg.

@aleqx, Where can I get your plugin for review?

Please don't ask plugin writers to write C++ code for all this.

I'm not going to do it.

this is core idea of a search plugin: encapsulate communication with a search engine.

The same problem can be solved in different ways: either write a bunch of code or just use some common tool by setting it a few params. I want to explore the possibility of using the second method.

aleqx commented 8 years ago

I haven't completed the rarbg plugin because I wasted too much time debugging this very issue. I hope you're not going to wait for me before issuing a fix though :) ... the problem is pretty clear.

The same problem can be solved in different ways: either write a bunch of code or just use some common tool by setting it a few params. I want to explore the possibility of using the second method.

It's probably my english, but I don't understand what you mean. All that's needed is a revert to the old code which was working and calling the plugin's download_torrent() method. Why would you not want to call the plugin's download_torrent() method? v3.2.5 does just that ... calls python on nova2dl which in turn calls download_torrent(), which is what's needed (especially if it needs to read in and send a cookie set by the search() method). Why would you want to split the plugin's class? That would be a very bad idea, but maybe I just misunderstand your intentions ...

Frankly, I still don't understand why it was removed, or what purpose it served to remove it.

glassez commented 8 years ago

I haven't completed the rarbg plugin

I'll settle for unfinished code. I just want to see what it really requires. For simplicity, you can insert this code in the comment below.

aleqx commented 8 years ago

Why not try a search on rarbg as I suggested so you can see? You'd understand pretty quickly. It's a little odd that you need so much convincing (it's a pretty clear problem and solution).

Here's my last attempt. This is a search for the string "ubuntu" on rarbg, click here: https://rarbg.to/torrents.php?search=ubuntu and then please look at

here's what happens:

  1. the http response headers set two cookies
  2. the hyperlinks in the search result webpage point each to a separate webpage, they are not links to the torrent/magnet, i.e. they cannot be imported into qBT for download
  3. the html source sets another auth cookie via javascript, this is to prevent bots, i.e. you must parse the http body string not just for links, but also for this cookie, which you must store locally along with the other 2 cookies from the http response headers.
  4. you must make another http request for a search result and send the previous 3 cookies accordingly in the http request headers. For example, click on the Linux Ubuntu 10.04 LTS Desktop Edition entry, or https://rarbg.to/torrent/hzgcrxle2ibaws153qv4dk69ony8jpum7tfuf6a4pqnri2eoh1wjzty5vcmbx8g3sk79dl (not sure if this will still work by the time you click it)
  5. at the top there is a link to the .torrent file which you must parse. You must also update the 3 cookies with the values from the response headers and the javascript of this new page
  6. you must now make a 3rd http request for the .torrent, and again set the previous 3 cookies accordingly in the http headers ... this is what qBT will receive.

The download_torrent() Python method is thus required, since you must make 3 requests to get the .torrent, not just 2, and you must parse and set the cookies every time (or else they lock you out with a captcha).

In other words,

You can't have one in python and the other in C++. They really should both exist in the same Python plugin class.

If you still need evidence, and if fixing this issue relies solely on your decision, then I think I'll just stick with 3.2.5 for the time being and wish you guys good luck.

zeule commented 8 years ago

@glassez ping? I'm suffering from this problem too.

glassez commented 8 years ago

On the search results page we have following:

<a onmouseover="return overlib('<img src=\'//dyncdn.me/static/over/ae505cafbc6c275ee72f1e856dae01e0fc3dc143.jpg\' border=0>')" onmouseout="return nd();" href="/torrent/hzgcrxle2ibaws153qv4dk69ony8jpum7tfuf6a4pqnri2eoh1wjzty5vcmbx8g3sk79dl" title="">Linux Ubuntu 10.04 LTS Desktop Edition</a>

Question: how to get the following from it? https://rarbg.to/download.php?id=hzgcrxle2ibaws153qv4dk69ony8jpum7tfuf6a4pqnri2eoh1wjzty5vcmbx8g3sk79dl&f=Linux%20Ubuntu%2010.04%20LTS%20Desktop%20Edition-[rarbg.com].torrent

Hint: you do not need to load the description page and parse it. Examine for example the official plugins!

zeule commented 8 years ago

@glassez, this is not a solution of the general problem you've created and continue to advocate for a reason (which you do not want to share). Even if some trick works for rarbg, this does not guarantee that there are similar possibilities for any search engine/tracker. This regression has obvious and easy solution, and I believe it is natural to expect from a person who did it to fix the things instead of pretending that there is no problem and making meaningful hints and jokes.

glassez commented 8 years ago

Okay, don't cry. I'll return it as I have time for this.

making meaningful hints and jokes

To use an existing solution to similar problems, instead of reinventing the wheel is a joke to you? Well, please - invent your wheel.

the general problem you've created and continue to advocate for a reason (which you do not want to share)

The only my problem is either I didn't remove reference to download_torrent() from docs or I rushed to include this change in that ill-fated PR. Unfortunately, the qBittorrent development process is too slow and I have to spend too much of my time to support the existing PRs, instead use it to continue the initiated changes. This was only one small step on my way to creating a well-integrated search plugins engine (I can't name the plugins that we use now. It is something foreign, almost not integrated with the application and having external dependencies, which is comparable in size with the application itself). I just did it when it turned up under my arm, but apparently misses with it.

this is not a solution of the general problem

I'm sorry, but I used your answers here to try to better assess this problem if it really needs custom downloading feature in plugin space or just to have a richly parameterized tool for this is enough. What do I mean? I mean to return from plugin not only torrent link, but also some additional parameters like headers, post params and cookies and use these data to download torrent file.

zeule commented 8 years ago

The only my problem is either I didn't remove reference to download_torrent() from docs or I rushed to include this change in that ill-fated PR.

You've broke user search plugins. This is the first problem. You've disallowed users to fix them because the required API is absent. This is the second one.

God knows when you finish your superb search system. Maybe you will not finish it at all. Maybe it will lack some features of the old system which you obviously don't use. What shall we do until that system become operational? Do you want to personally find a way to fix each user search plugin? Right now even your cookies manager is not mainlined. Doesn't it mean that all downloads which require cookies can not be handled by the qBt core right now?

The rule of thumb is very simple: do not mainline "will be fixed in the future" things if they break an existing functionality.

I can't name the plugins that we use now. It is something foreign, almost not integrated with the application and having external dependencies, which is comparable in size with the application itself

I can't agree with you. Their integration level is sufficient, they do the job very well. They are small:

$ du -sh nova3/
144K    nova3/

They are powerful, since all the python capabilities are at my disposal, and external dependencies is not a problem for me, since my system have many Python based packages (starting from the package manager itself) anyway. You see, this experience varies from user to user.

Okay, don't cry. I'll return it as I have time for this.

This is what I meant above.

aleqx commented 8 years ago

I completely agree with @evsh and I'm surprised by your attitude, @glassez. You also didn't try your so called solution and appear to either ignore or just not pay attention to the info posted here, which you kept requesting. Specifically:

you must parse and set the cookies every time (or else they lock you out with a captcha).

i.e. your so called "solution" won't work after a few requests, you get locked out.

Regardless, what you are advocating (without justification) is just not a solution. It's not even a working workaround for this specific site.

return from plugin not only torrent link, but also some additional parameters like headers, post params and cookies and use these data to download torrent file.

That's both a bad idea and harder for everyone ... e.g. there can be javascript than needs parsing. Please read my posts! All search code for a specific site, from the first request up to fetching of the .torrent/magnet, should be contained by the plugin. This was the previous design and it was sane. Yours isn't and it's not working.

Why did you change it? Why not just revert the code to call nova2dl when downloading a search result? Why do you insist on this counter-productive and failing logic?

zeule commented 8 years ago

I mean to return from plugin not only torrent link, but also some additional parameters like headers, post params and cookies and use these data to download torrent file.

I think the key question here is what the searhc plugin is? Or what you want it to be?

If it is an external program (python script, or it can be any other script or executable) then what jobs a plugin should do (i.e. what is its interface). If it is a script then what is the scripting engine? Do you want to have an external or internal scripting engine? And so on.

Could you describe, please, what type of integration do you want to achieve? Perhaps it would be better to open a separate issue for this discussion (if not opened already)?

glassez commented 8 years ago

Could you describe, please, what type of integration do you want to achieve? Perhaps it would be better to open a separate issue for this discussion (if not opened already)?

In the next few days I'll finish development of the first test version of this and will publish a request where we can discuss this.

Their integration level is sufficient, they do the job very well. They are small

I can only agree with the second statement. They work really well (by themselves). But integration with the app there is no - we just run a third-party application (nova2) and read its output. They (plugins) are really small but they required python distro to be installed.

aleqx commented 8 years ago

That's just insane. See my post above. Your solution won't work. Also, nobody complained about needing python (which you can just bundle by the way), but complained about you breaking what was working just fine. Now you want to change everything and everyone who's ever written a plugin must jump in unison and rewrite all their code? Awesome. Does nobody else see how twisted this is and why a simple revert is the clear answer here?

zeule commented 8 years ago

But integration with the app there is no - we just run a third-party application (nova2) and read its output.

Not a third party, but just an external application. Anyway what is wrong with it?

but they required python distro to be installed.

Does this mean you want to bundle a script engine in the qBt instead? Perhaps a phantomjs installation too?

glassez commented 8 years ago

That's just insane. See my post above. Your solution won't work. Also, nobody complained about needing python (which you can just bundle by the way), but complained about you breaking what was working. Now you want to change everything and everyone who's ever written a plugin must jump in unison and rewrite all code? Awesome. Does nobody else see how twisted this is and why a simple revert is the clear answer here?

I have already answered. I do not intend to make idle chatter here.

aleqx commented 8 years ago

Sorry, but you haven't already answered and it's still unclear whether you understand what the problem actually is and why your proposed solution doesn't work.

Does this mean you want to bundle a script engine in the qBt instead? Perhaps a phantomjs installation too?

Just bundle python instead (it's not like it's too large ... phantomjs is 18 MB)

glassez commented 8 years ago

Just bundle python instead (it's not like it's too large ... phantomjs is 18 MB)

for comparison, qBittorrent 3.3.4 windows distro has 16.35 MB size.

aleqx commented 8 years ago

for comparison, qBittorrent 3.3.4 windows distro has 16.35 MB size.

And? ...

zeule commented 8 years ago

Here on Gentoo: $ qsize qbittorrent net-p2p/qbittorrent-9999: 16 files, 32 non-files, 6021.791 KiB

glassez commented 8 years ago

And? ...

Don't you think that having plugins engine that weighs more than the app itself - it's overkill? But this is not the main thing for me. I still seek greater integration. Plugins must run in the context of the application (within reason, of course), and not separate from it. They should use the same cookies, the same auxiliary code. Etc.

zeule commented 8 years ago

Do you understand the capabilities of the standalone Python program? How are you going to implement anything comparable in the qBt? And for what sake?

Maybe it is simpler to rename 'plugin' into e.g. 'external search add-on' and that is it?

zeule commented 8 years ago

I like this idea more and more: you implement your plugins and leave the add-ons to us.

aleqx commented 8 years ago

Don't you think that having plugins engine that weighs more than the app itself - it's overkill?

No. And your question is taken out of context. We're talking tens of MB already. No, I certainly don't think 30 MB is overkill ... it's the 21st century.

I still seek greater integration. Plugins must run in the context of the application (within reason, of course), and not separate from it. They should use the same cookies, the same auxiliary code. Etc.

This is the exact sort of "decision" that is causing havoc. There is no justification for it, but there is clearly justification against it. What was wrong with how it was before? Why must plugins run in the context of the app? Why would you want to write a new self-contained scripting system, which is bound to be limited feature-wise (unless you bundle something)? Also, If they must run in the context of the app, which is C++, then I for one don't even want to touch it. I write C/C++ a lot, for when I need C/C++. Python and the like were invented for a reason.

You should look up the concept of the Rube Goldberg machine.

glassez commented 8 years ago

you implement your plugins and leave the add-ons to us.

Well, now it is done as a complete replacement of the current "plugins", but what happens in the end - nobody knows.

Do you understand the capabilities of the standalone Python program? How are you going to implement anything comparable in the qBt? And for what sake?

I figured you (or someone else) will say something similar. Do you understand the power of the nuclear bomb? But we walk with a gun hunting! :) What I mean is that we don't need it all!

aleqx commented 8 years ago

Do you understand the power of the nuclear bomb? But we walk with a gun hunting! :) What I mean is that we don't need it all!

That is evidence enough for me to unsubscribe from this thread and move on. My apologies for starting it.

glassez commented 8 years ago

That is evidence enough for me to unsubscribe from this thread and move on. My apologies for starting it.

I have already answered you.

Okay, don't cry. I'll return it as I have time for this.

zeule commented 8 years ago

@glassez, this project is not your playground

glassez commented 8 years ago

so, If they must run in the context of the app, which is C++, then I for one don't even want to touch it. I write C/C++ a lot, for when I need C/C++. Python and the like were invented for a reason.

"run in the context of the app" doesn't mean "written on C++"! You invent something and then argue with it!

@glassez, this project is not your playground

And I'm not playing! Do not mistake only he who does nothing.

glassez commented 8 years ago

And by the way, this change existed in the form of a PR within a few months. Where were you all?

zeule commented 8 years ago

The question is not whether you did a mistake or not. Why do you ignore that a) your workaround does not work and b) that your users want this broken functionality to get fixed instead of arguing why they don't need it.

And by the way, this change existed in the form of a PR within a few months. Where were you all?

I regretted a couple of times that I did not prevented it already by myself. :) I was busy, as you can see in the history of my own PRs.

But, again, the problem is not in this particular mistake. Why had you not just agreed to fix it? With the problem fixed the discussion about the search plugins future would be much more pleasant for everyone, I believe.

glassez commented 8 years ago

The question is not whether you did a mistake or not. Why do you ignore that a) your workaround does not work and b) that your users want this broken functionality to get fixed instead of arguing why they don't need it.

But, again, the problem is not in this particular mistake. Why had you not just agreed to fix it? With the problem fixed the discussion about the search plugins future would be much more pleasant for everyone, I believe.

Am I talking with a wall?! I said, and then quoted himself. I'm doing it again:

I'll return it as I have time for this.

Or you obscure the meaning of this phrase? It means that I will fix it! And let's finish this talk.

DoumanAsh commented 8 years ago

I'm not going to take sides here. But small comment about python size. For windows we can use embeddable distribution instead of full install. This distribution is precisely for needs of current qBt's search engine

sledgehammer999 commented 8 years ago

Wow I stopped paying attention a few days and all hell broke loose. I merged the PR that broke this thinking that managing the downloads centrally was a pretty good idea. Plus, I had merged "download manager" fixups/cleanups very recently and that manager is very good. So I thought we can benefit from this. Plus, cookies management will become much better when I finally merge the pending PRs.

Apparently this breaks a lot of expected behavior and I kinda see why these search plugins need to be external to the main app. They're basically separate programs and can be used to implement a very wide area of parsing sites. And this capability very seldomly can be accomplished without a scripted language.

Unless I am missing something, the issue here is that plugins aren't used to download the torrent/magnet but qbt instead. If I am correct then there are 2 big categories of plugins. The first category contains plugins that don't need special parsing for downloading torrents and the 2nd category contains plugins that need special parsing. My proposal, although I am not familiar with the search API, is this. How qbt currently decides to use X url for downloading? Does it get it from the plugins output? Then how about plugins that need special parsing return a special value instead of a url. This special value will indicate to qbt "hey don't try to download it yourself, use instead my special download function". If this creates even more problems, then disregard it.

@aleqx I hope you didn't really unsubscribe. I agree that this needs fixing. Btw, this feature was merged in v3.3.2 so you can use v3.3.1(although its stability in genral might not be great).

zeule commented 8 years ago

Then how about plugins that need special parsing return a special value instead of a url. This special value will indicate to qbt "hey don't try to download it yourself, use instead my special download function"

+1 as of now, but want to see what @glassez will propose as his further step. Perhaps there will be a possibility to split communication with search engines onto two classes: those which can be described declaratively (a search url plus a set of regular expressions, for example) and those which require custom actions.

glassez commented 8 years ago

Perhaps there will be a possibility to split communication with search engines onto two classes: those which can be described declaratively (a search url plus a set of regular expressions, for example) and those which require custom actions.

I didn't go that far in my fantasies!.. But to have a "declarative" search engine is an excellent idea. I think it will be enough for many search sites. But I pursue other goals now. As to custom download feature, @sledgehammer999 described it quite accurately. It is not REALLY required for most plugins (even if the author thinks otherwise and he's too lazy to learn at least the official plugins). So, if you really need it, then the plugin should be able to inform, and app is to allow the plugin to download the torrent yourself.

sledgehammer999 commented 8 years ago

OK, so I guess @glassez are you working on something that will mitigate this?

glassez commented 8 years ago

I'm working on a new built-in (non-python) plugin engine. This will more closely integrate the plugins with the application. They will be run like a regular function (in a separate thread). They will use the qBittorrent built-in features to do its work. They will have easy access to the debug output and logging. If necessary, they can even show the dialogs, etc. WIP PR with basic features implemented will be soon.

glassez commented 8 years ago

For those who misses old downloading way, I reverted it in #5085.

Although it will not save @aleqx. @aleqx How are you going to pass cookies (and what else you want) to download_torrent() function? Save it to a file and then read? Or do something more advanced that super fancy Python will allow you to do?

nbusseneau commented 8 years ago

Hello guys, I just wanted to add my support to the solution proposed by @sledgehammer999, namely:

Then how about plugins that need special parsing return a special value instead of a url. This special value will indicate to qbt "hey don't try to download it yourself, use instead my special download function".

As already stated in the discussion, and also #4788, some trackers (notably private ones that require authentication) have very specific needs when it comes to accessing torrent files or magnet links, whether it be in the form of simple cookies, POST parameters, or whatever JS-gated-stuff the trackers could come up with.

So I do think having both a simple/straightforward method to retrieve torrents (like in official plugins) and a separate, more complicated one (but letting you do whatever you need for private trackers) is the way to go.

EDIT: @glassez

@aleqx How are you going to pass cookies (and what else you want) to download_torrent() function? Save it to a file and then read? Or do something more advanced that super fancy Python will allow you to do?

I may have misunderstood what you're asking but if you're looking for a way to use cookies in Python this is very easily done through setting up a session, and then you use that session for each request. Here is an excerpt from the rutracker.org plugin:

self.cj = cookielib.CookieJar()
self.opener = build_opener(HTTPCookieProcessor(self.cj))
response = self.opener.open(self.login_url, urlencode(dict_encode(self.credentials)).encode())
glassez commented 8 years ago

I may have misunderstood what you're asking but if you're looking for a way to use cookies in Python this is very easily done through setting up a session, and then you use that session for each request.

Ok. I understand: Python has its own cookies storage.

nbusseneau commented 8 years ago

Yes. Even better than that actually, it imitates a regular user session, so that means I don't have to store specific cookies in a file or something: the cookies are automatically created/modified as asked by the server when executing HTTP requests.

Not sure what you were planning with DownloadManager and cookies, but on #4788 you stated this:

DownloadManager now supports cookies. But some improvements are still not fused (see #4670). So soon we will not need to use these "dances with a tambourine" (I mean separate python download routine) to download torrents with private trackers.

I may again be mistaken but it seems like these "global cookies" aren't set up automatically? If yes, that's not very useful, because having end users settings cookies manually to access private trackers is a big no-no, especially since they would have to do it again every time the cookies change! In some cases (like rutracker.org) that would be unusable because they have a habit of randomly changing cookies very often, notably by having a cookie you could describe as a "session hash" that is sent to you by the server, and cannot be computed, hence the need to simulate a real browser connection.

I don't use them for rutracker.org but there are other things you can do with Python sessions to simulate a real browser: they also handle setting request headers (notably User-Agent and Referer) and basic/digest HTTP authentication.

All of these might be required when dealing with private trackers and their "anti-robot" mechanisms, so we would need a bunch of additional hooks for an hypothetical DownloadManager to imitate real browsers and be able to handle any private trackers.

glassez commented 8 years ago

I may again be mistaken but it seems like these "global cookies" aren't set up automatically?

Global cookies are auto managed by DownloadManager. Some sites have one-to-one correspondence between user credentials and the cookies, so it is easier to add these cookies manually once, than to imitate the login process.