mozilla / geckodriver

WebDriver for Firefox
https://firefox-source-docs.mozilla.org/testing/geckodriver/
Mozilla Public License 2.0

Is it possible to control Tor? #2046

Open ottogutierrez opened 1 year ago

ottogutierrez commented 1 year ago

Hello,

Is it possible to control Tor browser with the geckodriver?

I tried launching it by setting the Tor binary location in the options. The browser opens, but geckodriver doesn't control it.

whimboo commented 1 year ago

Could you please attach a trace-level log from geckodriver? It could help us identify the issue. In principle, geckodriver should be able to control Tor Browser. Thanks!
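For anyone gathering such a log: geckodriver accepts `--log trace` (or `-vv`) on the command line, and the `moz:firefoxOptions` capability accepts a `log` object with a `level` field. The following is a minimal sketch using only Ruby's standard library. The helper names and the port are made up for illustration and are not part of geckodriver.

```ruby
require 'json'

# Hypothetical helper: argument list that starts geckodriver with
# trace-level logging on the given port.
def geckodriver_command(port: 4444)
  ['geckodriver', '--port', port.to_s, '--log', 'trace']
end

# Hypothetical helper: capabilities fragment that requests trace-level
# logging through moz:firefoxOptions.
def trace_log_options
  { 'moz:firefoxOptions' => { 'log' => { 'level' => 'trace' } } }
end

puts geckodriver_command.join(' ')
puts JSON.pretty_generate(trace_log_options)
```

Redirecting geckodriver's output to a file while reproducing the problem gives a log that can be attached to the issue.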

whimboo commented 1 year ago

I tried it myself yesterday, and it looks like both Marionette and Remote Agent (WebDriver BiDi) get enabled but then get stuck somewhere. It's hard to tell where exactly, because I can't get any valid log output even with remote.log.level=Trace set.

I think a conversation with the creators of Tor Browser might be needed first.

This wouldn't be on our priority list, but if you could reach out, we are happy to assist.

juliandescottes commented 1 year ago

FWIW, I could successfully connect using a WebDriver BiDi client to the latest alpha release from Tor (12.0a2) which is based on ESR102. I managed to connect via websocket, create a session, open a tab etc... so it seems to work fine. The regular release is still based on ESR91 so we won't be able to do much with BiDi with this one.
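As a rough illustration of what "create a session, open a tab" looks like on the wire, WebDriver BiDi commands are JSON objects with an `id`, a `method`, and `params`, sent over the WebSocket connection. The sketch below only builds the two frames with Ruby's standard library. `bidi_command` is a hypothetical helper, and the method names follow the current BiDi specification rather than anything verified against the builds mentioned above.

```ruby
require 'json'

# Build one WebDriver BiDi command frame: an id, a method name and params.
def bidi_command(id, method, params = {})
  JSON.generate({ 'id' => id, 'method' => method, 'params' => params })
end

# Frame that asks the browser to start a new BiDi session.
new_session = bidi_command(1, 'session.new', { 'capabilities' => {} })

# Frame that asks the browser to open a new tab.
new_tab = bidi_command(2, 'browsingContext.create', { 'type' => 'tab' })

puts new_session
puts new_tab
```

A WebSocket client would send each string as a text message and match the response by its `id`.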

whimboo commented 1 year ago

I think the question here was more about Marionette due to the usage of geckodriver. Good hint anyway to use a later build of Tor. I wonder if Marionette works as well in the 102 ESR alpha release. Could you run a check for that @juliandescottes?

juliandescottes commented 1 year ago

Just tried, and I could successfully connect with geckodriver & use marionette. However I also managed to connect to Tor 11.5.2, so maybe there is a configuration issue that both you and the reporter face, and which doesn't occur on my machine?

ottogutierrez commented 1 year ago

Thanks for your help @juliandescottes! As I'm new to this, I worked around it by launching Firefox and using the proxy that Tor creates.

Would you be able to share how you got it to work with Tor?

Thank you!

whimboo commented 7 months ago

See also https://github.com/SeleniumHQ/selenium/issues/13092 where someone got Tor working with Firefox.

MatzFan commented 7 months ago

@ottogutierrez if you are still interested in how to control Tor Browser (TB) with geckodriver, I am happy to share the following. The key to getting it working lies in:

  1. setting the binary location (obviously), and, less obviously,
  2. setting the TB-specific preferences that are set when a user connects to the Tor network for the first time.

I figured out 2) by comparing the contents of the prefs.js file in the default profile dir (at tor-browser/Browser/TorBrowser/Data/Browser/profile.default) from a fresh download of TB with its contents after you connect to the tor network for the first time. The critical ones seem to be:
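The comparison described here can be sketched in a few lines of Ruby. This assumes prefs.js consists of `user_pref("name", value);` lines. The helper names are hypothetical and the two snippets at the bottom are invented sample data, not the contents of a real profile.

```ruby
# Matches lines of the form: user_pref("name", value);
PREF_LINE = /\Auser_pref\("([^"]+)",\s*(.*)\);\s*\z/

# Parse prefs.js text into a hash of name => raw value string.
def parse_prefs(text)
  text.each_line.each_with_object({}) do |line, prefs|
    match = PREF_LINE.match(line.strip)
    prefs[match[1]] = match[2] if match
  end
end

# Prefs that are new or have a different value in `after`.
def changed_prefs(before, after)
  after.reject { |name, value| before[name] == value }
end

# Invented sample data standing in for the two versions of prefs.js.
before = parse_prefs(<<~PREFS)
  // comment lines are ignored
  user_pref("torbrowser.settings.enabled", false);
PREFS
after = parse_prefs(<<~PREFS)
  user_pref("torbrowser.settings.enabled", true);
  user_pref("torbrowser.settings.quickstart.enabled", true);
PREFS

p changed_prefs(before, after)
```

With real profiles, the two inputs would come from `File.read` on a copy of prefs.js taken before and after the first connection.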

{ 
  'extensions.torlauncher.prompt_at_startup' => false,
  'torbrowser.settings.bridges.builtin_type' => '',
  'torbrowser.settings.bridges.enabled' => false,
  'torbrowser.settings.bridges.source' => -1,
  'torbrowser.settings.enabled' => true,
  'torbrowser.settings.firewall.enabled' => false,
  'torbrowser.settings.proxy.enabled' => false,
  'torbrowser.settings.quickstart.enabled' => true
}

The last of these is what gets set when the user ticks the checkbox 'Always connect automatically' on TB's startup page.
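To show how the two steps fit together, here is a minimal sketch of the new-session payload that geckodriver receives, with the binary and the preferences above placed under `moz:firefoxOptions`. It uses only Ruby's standard library. The binary path is a placeholder assumption, and `new_session_payload` is a hypothetical helper, not code from the gem.

```ruby
require 'json'

# Preferences listed in the comment above.
TOR_PREFS = {
  'extensions.torlauncher.prompt_at_startup' => false,
  'torbrowser.settings.bridges.builtin_type' => '',
  'torbrowser.settings.bridges.enabled' => false,
  'torbrowser.settings.bridges.source' => -1,
  'torbrowser.settings.enabled' => true,
  'torbrowser.settings.firewall.enabled' => false,
  'torbrowser.settings.proxy.enabled' => false,
  'torbrowser.settings.quickstart.enabled' => true
}.freeze

# Placeholder path (an assumption); point it at the browser binary
# inside your own Tor Browser installation.
TOR_BINARY = '/path/to/tor-browser/Browser/firefox'

# Hypothetical helper: W3C new-session payload with Firefox options.
def new_session_payload(binary:, prefs:)
  {
    'capabilities' => {
      'alwaysMatch' => {
        'moz:firefoxOptions' => { 'binary' => binary, 'prefs' => prefs }
      }
    }
  }
end

puts JSON.pretty_generate(new_session_payload(binary: TOR_BINARY, prefs: TOR_PREFS))
```

Client libraries such as Selenium build an equivalent structure from their Firefox options object, so in practice you would set the binary and preferences there rather than writing the JSON by hand.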

Aside from that, the only other thing I had to do was wait for a Tor network connection before returning an instantiated driver. This is necessary because Selenium throws an error if you try to browse to a page before the Tor network connection is made. I achieved this via the UI by parsing the word 'Connected' from TB's 'about:preferences#connection' page. Not ideal, but this happens in the background before a driver object is returned, so it will do until I find a better method.
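The waiting step boils down to polling a condition with a timeout. A minimal, library-free sketch follows. `wait_until` is a hypothetical helper, and the counter at the bottom merely stands in for the real check against the connection page.

```ruby
# Poll the given block until it returns a truthy value, or raise once
# the timeout (in seconds) has elapsed.
def wait_until(timeout: 60, interval: 0.5)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result

    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise "condition not met within #{timeout} seconds"
    end

    sleep interval
  end
end

# Stand-in condition: becomes true on the third check.
attempts = 0
wait_until(timeout: 5, interval: 0.01) { (attempts += 1) >= 3 }
puts "condition met after #{attempts} checks"
```

With a real driver, the block would load the connection page and test whether its text contains 'Connected'. Selenium's Ruby bindings also ship a `Selenium::WebDriver::Wait` class that serves the same purpose.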

My Ruby gem is here. There is also a well established Python library here. Hope this helps.

whimboo commented 7 months ago

Thank you @MatzFan! This indeed looks promising! Setting the preferences manually seems to be OK for now, but maybe we could get Tor Browser to add those prefs on top of the recommended preferences for Firefox. I'm not sure how they patch, but that could be an option to make it work directly without any custom pref fiddling.

MatzFan commented 7 months ago

> Not sure how they patch

Nor me, but that would be ideal. I have no affiliation with the Tor Project and am just a Ruby hacker trying to get this to work, but would be happy to collaborate with you geckodriver folks to get Tor Browser supported. Ideally you'd want to work directly with the Tor Browser devs, if possible. If I find out how the patch is done I'll share this info with you.

Getting the driver working is one thing, but my aim is to ensure a Selenium-driven TB has the same browser fingerprint as the fingerprint generated by a regular (non-automated) TB user. Part of the solution is replicating the env vars set in TB's start-tor-browser script, which runs before TB is started, but that is not the whole story and presently the fingerprints differ (I use the demo FingerprintJS site to check). That file you link to is helpful, as I need to rule out the geckodriver prefs being the cause of this.

whimboo commented 7 months ago

Yes, let's wait to get in contact with their developers until a proper solution has been found. As you stated, some issues still remain.

Besides the recommended preferences file that I already referred to, there are also preferences that geckodriver itself sets before starting Firefox. Those are needed as early as possible because they are read only once during startup and cannot be changed while Firefox is running. Changing them requires restarting the application.
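To illustrate why such preferences have to be in place before launch: one way startup preferences reach Firefox is as `user_pref(...)` lines in a `user.js` file inside the profile, which is read when the browser starts. The sketch below renders a preference hash in that format. The helper names are hypothetical, and this is not a description of geckodriver's internals.

```ruby
require 'json'

# Render one preference as a user.js line. JSON encoding matches the
# literal syntax for the booleans, integers and strings used here.
def user_pref_line(name, value)
  "user_pref(#{name.to_json}, #{value.to_json});"
end

# Render a whole hash of preferences as user.js content.
def user_js(prefs)
  prefs.map { |name, value| user_pref_line(name, value) }.join("\n") + "\n"
end

puts user_js(
  'torbrowser.settings.quickstart.enabled' => true,
  'torbrowser.settings.bridges.source' => -1,
  'torbrowser.settings.bridges.builtin_type' => ''
)
```

Writing the result into the profile directory before the browser starts is what makes the values visible at startup.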