shaikhsajid1111 / twitter-scraper-selenium

Python package to scrape Twitter's front-end easily
https://pypi.org/project/twitter-scraper-selenium
MIT License
299 stars, 46 forks

Proxy #59

Open Andreito95 opened 1 year ago

Andreito95 commented 1 year ago

The authenticated proxy doesn't load correctly: if I check "whatismyipaddress" in the driver, I get my real IP.

shaikhsajid1111 commented 1 year ago

Which browser was it, Google Chrome or Mozilla Firefox? Can you please share more details?

Andreito95 commented 1 year ago

It's the same: a proxy with user/pass authentication doesn't work with either Chrome or Mozilla. I have about 10,000 lines of Python code built around a Selenium browser, and the only way I have found to use a proxy with authentication is Chrome plus an extension built from a JSON manifest, like this (you can find the code around the web):

    pluginfile = 'proxy_auth_plugin.zip'
    with zipfile.ZipFile(pluginfile, 'w') as zp:
        zp.writestr("manifest.json", manifest_json)
        zp.writestr("background.js", background_js)
    chrome_options.add_extension(pluginfile)
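For readers who have not seen the full workaround, here is a minimal sketch of what `manifest_json` and `background_js` usually look like in the versions of this snippet that circulate online. The function name `build_proxy_auth_plugin` and the extension name are hypothetical, an HTTP proxy is assumed, and the manifest uses Manifest V2, which newer Chrome releases may refuse to load, so treat it as an illustration rather than a guaranteed fix.

```python
import json
import zipfile


def build_proxy_auth_plugin(host, port, user, password,
                            pluginfile="proxy_auth_plugin.zip"):
    """Write a small Chrome extension that sets the proxy and answers
    its authentication prompt. Hypothetical helper, not part of the package."""
    manifest_json = json.dumps({
        "version": "1.0.0",
        "manifest_version": 2,  # assumption: the browser still accepts MV2
        "name": "Proxy Auth (sketch)",
        "permissions": ["proxy", "webRequest", "webRequestBlocking",
                        "<all_urls>"],
        "background": {"scripts": ["background.js"]},
    })

    # json.dumps quotes and escapes each value so it is a valid JS literal.
    background_js = """
var config = {
  mode: "fixed_servers",
  rules: {
    singleProxy: {scheme: "http", host: %s, port: %d},
    bypassList: ["localhost"]
  }
};
chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});
chrome.webRequest.onAuthRequired.addListener(
  function(details) {
    return {authCredentials: {username: %s, password: %s}};
  },
  {urls: ["<all_urls>"]},
  ["blocking"]
);
""" % (json.dumps(host), int(port), json.dumps(user), json.dumps(password))

    with zipfile.ZipFile(pluginfile, "w") as zp:
        zp.writestr("manifest.json", manifest_json)
        zp.writestr("background.js", background_js)
    return pluginfile
```

The returned path is what gets passed to `chrome_options.add_extension(...)`, as in the snippet above. The credentials end up in plain text inside the zip file, so it is worth deleting it after the browser has started.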

I am trying to find a better way (with either Chrome or Mozilla, I don't mind which) that works without extensions. I would like to disable extensions entirely, but I can't because of this one for the proxy on Chrome. Another thing I can tell you is that it's not possible to save cookies in Chrome with this extension, because that causes problems with proxy authentication. So I am currently using Chrome without cookies and with this extension, just to be able to use a proxy with user/pass auth.

I tried your code and found a lot of good functions, but if you can find a solution for proxies with authentication it would be really helpful to me. I tried both Mozilla and Chrome with your code and passed the right string, "user:pass@ip:port". The browser opens correctly, but if you open "whatismyipaddress.com" it shows the real IP (not the proxy one), so it doesn't work.
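To make the "user:pass@ip:port" format concrete, here is a small sketch of how such a string breaks down into the four values the extension workaround needs. The helper name `parse_proxy` is hypothetical and not part of twitter-scraper-selenium. As far as I know, Chrome's `--proxy-server` flag does not use credentials embedded in the address, which would be consistent with the real IP showing up.

```python
from urllib.parse import urlsplit


def parse_proxy(proxy):
    """Split 'user:pass@ip:port' (optionally with a scheme) into its parts.
    Hypothetical helper for illustration only."""
    # urlsplit only recognises the host part if the string starts with
    # '//' or has a scheme, so add '//' when no scheme is present.
    if "://" not in proxy:
        proxy = "//" + proxy
    parts = urlsplit(proxy)
    return {
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
    }


print(parse_proxy("user:pass@203.0.113.10:8080"))
```

The address `203.0.113.10` is a documentation-only placeholder. Passwords containing characters such as `@` or `:` would need to be percent-encoded for this to split correctly.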

shaikhsajid1111 commented 1 year ago

I think you're right about this. Selenium doesn't have proper support for authenticated proxies. I ran into the same problem when I once used an authenticated proxy, and yes, the workaround was to use a Chrome extension. I think it is the only option I have seen working. If I find a better solution somewhere I would love to implement it here, but currently I don't have one.

If your proxy provider offers an IP whitelisting feature, you should consider using it: just whitelist the server's IP. It doesn't really address the underlying problem, but it does give you an alternative.
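As a rough sketch of the whitelisting route: once the provider accepts your server's IP without credentials, the proxy can be passed to Chrome as a plain `--proxy-server` argument and no extension is needed. The helper names `proxy_argument` and `make_chrome_options` are hypothetical, an HTTP proxy is assumed, and Selenium is imported lazily so the first helper can be tried on its own.

```python
def proxy_argument(host, port, scheme="http"):
    """Build Chrome's --proxy-server argument for an unauthenticated proxy."""
    return f"--proxy-server={scheme}://{host}:{port}"


def make_chrome_options(host, port):
    """Return ChromeOptions routed through a whitelisted proxy.
    Assumes the selenium package is installed."""
    from selenium import webdriver  # imported here so it stays optional

    options = webdriver.ChromeOptions()
    options.add_argument(proxy_argument(host, port))
    return options


print(proxy_argument("203.0.113.10", 8080))
```

Passing the result of `make_chrome_options(...)` as `options=` to `webdriver.Chrome` and then opening "whatismyipaddress.com" is a quick way to confirm that the proxy IP is the one being reported.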