rockdaboot / mget

Multithreaded metalink/file/website downloader (like Wget) and C library
GNU Lesser General Public License v3.0

SOCKS support for TOR usage? #31

Open zoobab opened 8 years ago

zoobab commented 8 years ago

Hi,

Is there a plan to support SOCKS proxies? I had to run polipo to turn a TOR SOCKS proxy into an HTTP proxy so that I could try to use TOR, but I had trouble resolving a hidden service.

Here is my polipo config:

    $ cat /etc/polipo/config
    daemonise=false
    diskCacheRoot=/var/cache/polipo/
    proxyAddress=127.0.0.1
    proxyName=localhost
    serverSlots=4
    serverMaxSlots=8
    cacheIsShared=true
    allowedClients=127.0.0.1
    socksParentProxy = localhost:9050

rockdaboot commented 8 years ago

@zoobab Did you consider using torsocks? Or, the other way round, can you explain why torsocks wouldn't work?

zoobab commented 8 years ago

Yes, I considered torsocks, but I want to multiplex several TOR circuits at the same time, using multiple SOCKS/HTTP proxies in parallel. That's why I was interested in mget's support for multiple proxies (aria2 does not have that feature). I will give torsocks a shot in a single-proxy configuration.

rockdaboot commented 8 years ago

Mget supports multiple proxies (separated by commas). I admit I never tested this thoroughly, since I am not really using proxies. In theory, these proxies are used round-robin, one for each connection Mget opens. If polipo and tor work together properly, Mget should have no problem.
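To illustrate what "round-robin for each connection" means, here is a minimal Python sketch. It is not Mget's code (Mget is written in C, and its internals are not shown in this thread). The class name `RoundRobinProxies`, the helper `parse_proxies`, and the proxy addresses are all made up for the example.

```python
from itertools import cycle


def parse_proxies(spec):
    """Split a comma-separated proxy list, dropping empty entries."""
    return [p.strip() for p in spec.split(",") if p.strip()]


class RoundRobinProxies:
    """Hand out proxies in a fixed rotation, one per new connection."""

    def __init__(self, spec):
        self.proxies = parse_proxies(spec)
        if not self.proxies:
            raise ValueError("no proxies given")
        self._rotation = cycle(self.proxies)

    def next_proxy(self):
        # Each call corresponds to one newly opened connection.
        return next(self._rotation)


# Hypothetical addresses: three local HTTP proxies on different ports.
pool = RoundRobinProxies("127.0.0.1:8123, 127.0.0.1:8124, 127.0.0.1:8125")
picks = [pool.next_proxy() for _ in range(5)]
print(picks)
```

With three proxies and five connections, the fourth connection wraps around to the first proxy again. Note that the rotation ignores how busy or how fast each proxy is.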

zoobab commented 8 years ago

"these proxies are used round-robin for each connection Mget opens"

Good to know, because I was rather expecting each connection to be split across multiple proxies, a bit like in this example:

http://geekofpassage.blogspot.be/2013/08/using-multiple-proxy-servers-few-to-no.html

It would be good to mention in the documentation that the proxies are used round-robin, one per connection.

rockdaboot commented 8 years ago

It would be easy to implement a kind of load balancing along the lines of 'take the proxy that has been unused the longest'. That implies that the proxy with the highest throughput is more likely to be chosen. Without special care, this approach has some drawbacks with broken or unreachable proxies.
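The 'longest-not-in-use' idea could be sketched roughly as follows, again in Python and purely as an illustration, not as Mget's implementation. The class `IdleLongestProxies`, the logical clock, and the `penalty` value are assumptions made for this example. The penalty is one possible form of "special care": a proxy that failed is treated as if it had been released far in the future, so it is tried last instead of being picked again immediately.

```python
class IdleLongestProxies:
    """Pick the free proxy that has been idle the longest."""

    def __init__(self, proxies):
        self._clock = 0                               # logical time, bumped on release
        self.last_release = {p: 0 for p in proxies}   # when each proxy became free
        self.busy = set()

    def acquire(self):
        free = [p for p in self.last_release if p not in self.busy]
        if not free:
            return None                               # every proxy is in use
        # The smallest release time means the longest idle period.
        proxy = min(free, key=lambda p: self.last_release[p])
        self.busy.add(proxy)
        return proxy

    def release(self, proxy, failed=False, penalty=100):
        self._clock += 1
        self.busy.discard(proxy)
        # A failed proxy is pushed to the back of the queue (assumed policy).
        self.last_release[proxy] = self._clock + (penalty if failed else 0)


pool = IdleLongestProxies(["proxy-a", "proxy-b", "proxy-c"])
first_round = [pool.acquire() for _ in range(3)]

pool.release("proxy-c", failed=True)   # fails quickly, so it is released first
pool.release("proxy-a")
pool.release("proxy-b")
second_round = [pool.acquire() for _ in range(3)]
print(first_round, second_round)
```

Without the penalty, `proxy-c` would be the first pick in the second round, because a broken proxy fails fast and therefore looks idle the longest. A fast, healthy proxy is released more often and so becomes eligible more often, which is the throughput effect described above.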

The current round-robin approach has the pitfall that the slowest proxy is likely to attract more and more of the load and thus get slower and slower... I guess I have to change it. Also, implementing SOCKS5 shouldn't be a big deal either - I just need some time first.