seleniumbase / SeleniumBase

📊 Blazing fast Python framework for web crawling, scraping, testing, and reporting. Supports pytest. Stealth abilities: UC Mode and CDP Mode.
https://seleniumbase.io
MIT License

Website detects undetected bot #2589

Closed jorisstander closed 8 months ago

jorisstander commented 8 months ago

I have been trying to log in on a website but it keeps detecting me. This is what I do:

I create a driver and open Google with it. From there I log in manually to my Google account so that the profile has cookies. Next I go to the website where I want to log in, but whenever I try logging in, it detects me as a bot. Apart from creating the Chrome instance, everything is done manually.

I tried doing exactly the same in my actual Chrome, and there I can log in without a problem. I also changed my IP and I still get detected. It seems like it knows I spun up an instance with SeleniumBase, but how is that possible?

Things I tried: removing the user_dir folder, changing the user agent, changing my IP, creating a new Google profile with cookies, and checking my reCAPTCHA score, which was 0.9.

If anyone has any suggestions I would love to hear them!

mdmintz commented 8 months ago

Sounds like you're trying to avoid detection after clicking a button to log in to a site. You need to use the sb.driver.uc_click(selector) method in order to remain undetected.

Try these examples that use it:

jorisstander commented 8 months ago

I thought that was the issue at the start too. So I tried doing everything manually except for starting up the browser instance: going to the website, filling in the information, and clicking the log in button were all done by hand, but it still knows I'm using a bot instance. That is with zero interaction between a script and the target website. It made me think that maybe the browser's settings are what get it flagged as a bot.

mdmintz commented 8 months ago

It's not so much your browser actions that get you detected, but mostly whether Selenium is connected to your browser when a website service is looking for it. UC Mode methods disconnect Selenium from the browser at specific times, and then reconnect after a few seconds (customizable).

If you want to perform manual actions while in UC Mode, use this:

sb.driver.uc_open_with_reconnect(url, "breakpoint")

(Type c and press Enter to continue from the breakpoint.)
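For context, the call above could be wrapped in a small script along these lines. This is a minimal sketch, assuming SeleniumBase is installed and Chrome is available; the function name `open_for_manual_login` is hypothetical and only used for illustration.

```python
def open_for_manual_login(url):
    """Open `url` in UC Mode and pause so a human can act in the browser."""
    # Imported inside the function so the file still loads on machines
    # where SeleniumBase is not installed (an illustrative choice).
    from seleniumbase import SB

    with SB(uc=True) as sb:
        # Passing "breakpoint" as the reconnect time pauses the script here.
        # Do the manual steps in the browser window, then type c and press
        # Enter in the terminal to continue.
        sb.driver.uc_open_with_reconnect(url, "breakpoint")
        # Past this point the script is driving the browser again.
        print(sb.get_current_url())
```

Calling `open_for_manual_login(...)` with the login page's URL would open the page and wait at the breakpoint; whether a given site still flags the session depends on that site's checks.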

jorisstander commented 8 months ago

I did not know that. Thank you, it worked!

Tongcheng commented 8 months ago

Thanks for the explanation @mdmintz, I'm still amazed and don't fully understand why "reconnect" would work when a website relies heavily on CF (Cloudflare). Is it because a certain port is occupied while Selenium is connected to the browser, and that's how CF detects it (and the reason reconnect works)?

mdmintz commented 8 months ago

@Tongcheng reconnect() is a combination of three things:

After the WebDriver Service is stopped, Selenium is no longer attached to the web browser, and because of that, website services that try to detect Selenium won't find it.
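The stop-then-restart idea described in this thread can be illustrated without a browser. The sketch below is not SeleniumBase's actual code: `FakeService` is a hypothetical stand-in for a WebDriver service, and the stop, wait, start sequence is an assumption based on the comments above about disconnecting and reconnecting "after a few seconds".

```python
import time


class FakeService:
    """Hypothetical stand-in for a WebDriver service (not a Selenium class)."""

    def __init__(self):
        self.running = True
        self.events = []

    def stop(self):
        # While stopped, nothing is attached to the browser.
        self.running = False
        self.events.append("stop")

    def start(self):
        self.running = True
        self.events.append("start")


def reconnect(service, timeout=0.1):
    """Detach, wait for `timeout` seconds, then attach again."""
    service.stop()
    time.sleep(timeout)  # window in which a detection check would find nothing
    service.start()


service = FakeService()
reconnect(service, timeout=0.01)
print(service.events)  # ['stop', 'start']
```

The point of the wait is timing: any check the website runs during that window happens while nothing is attached to the browser.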

Tongcheng commented 8 months ago

Great explanation @mdmintz! I am curious whether you know, at a fundamental level (deeper inside Chrome and Selenium), how a detector is able to tell "this browser is attached to Selenium" versus not?

Tongcheng commented 8 months ago

In other words @mdmintz, I am curious how you came up with the idea. When I did my research, every evasion approach seemed to be about "browser fingerprinting" and the like; I didn't think there was some more fundamental attribute, other than the fingerprint, that could be detected.

mdmintz commented 8 months ago

There's an explanation here already: https://github.com/seleniumbase/SeleniumBase/issues/2213

Tongcheng commented 8 months ago

Thanks a lot for the pointer, @mdmintz!