joeyism / linkedin_scraper

A library that scrapes Linkedin for user data
GNU General Public License v3.0
1.97k stars 551 forks

Chrome throws (maybe bot detection?) #88

Open albertoZurini opened 3 years ago

albertoZurini commented 3 years ago

image

I'm currently using undetected-chromedriver: https://github.com/ultrafunkamsterdam/undetected-chromedriver

joeyism commented 3 years ago

Uh... this has never happened to me before. Can you paste your code?

albertoZurini commented 3 years ago

from linkedin_scraper import actions, Person

import chromedriver_hidden as uc # (using selenium webdriver throws the same as well)

options = uc.ChromeOptions()
options.add_argument('--incognito')
options.add_argument('--no-sandbox')

driver = uc.Chrome(executable_path="/usr/local/bin/chromedriver", options=options)

try:
    actions.login(driver, email, password) # if email and password aren't given, it'll prompt in the terminal
except Exception as e:
    print("Exception! Reload the page or manually login; press enter when done")
    input()
person = Person("https://www.linkedin.com/in/andre-iguodala-65b48ab5", driver=driver) # this never gets executed because of the Chrome error

albertoZurini commented 3 years ago

I found this network request, track:

image

Copying the request as cURL and running it returns nothing, while a plain curl https://www.linkedin.com/li/track returns GOOD. Maybe this request has some sort of bot detection.
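
The comparison described above could be reproduced from Python roughly as follows. This is only a sketch: it assumes the request is a plain GET (the thread does not say which method the browser used), and `build_request`, `fetch_body`, and `compare` are hypothetical helper names. Any headers passed in would have to be copied by hand from the browser's network tab.

```python
import urllib.request

# Endpoint named in the comment above.
TRACK_URL = "https://www.linkedin.com/li/track"


def build_request(url, headers=None):
    """Build (but do not send) a GET request with optional extra headers."""
    return urllib.request.Request(url, headers=headers or {}, method="GET")


def fetch_body(request, timeout=10):
    """Send the request and return (status code, decoded body)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")


def compare(url, browser_headers):
    """Fetch url once bare and once with headers copied from the browser."""
    results = {}
    for label, headers in (("bare", None), ("browser-like", browser_headers)):
        try:
            results[label] = fetch_body(build_request(url, headers))
        except Exception as exc:  # HTTP errors, timeouts, connection resets
            results[label] = ("error", repr(exc))
    return results
```

Calling `compare(TRACK_URL, {...})` with the copied headers would show whether the two variants really get different responses, which is what the bot-detection guess rests on.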

joeyism commented 3 years ago

Because we're using Selenium, the page should load fully, including the track request, if it runs. Can you run it without the Chrome options? LinkedIn may be detecting a bot from the incognito and no-sandbox flags.
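
One way to act on this suggestion is to start Chrome with each flag combination in turn and note which one triggers the error. A minimal sketch, assuming plain Selenium with chromedriver available on PATH; `build_chrome_args`, `make_driver`, and `try_flag_combinations` are hypothetical helper names, not part of linkedin_scraper.

```python
def build_chrome_args(incognito=False, no_sandbox=False):
    """Return the list of Chrome command-line flags to enable."""
    args = []
    if incognito:
        args.append("--incognito")
    if no_sandbox:
        args.append("--no-sandbox")
    return args


def make_driver(args):
    """Start Chrome with the given flags (an empty list means default options)."""
    # Imported lazily so build_chrome_args works without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for arg in args:
        options.add_argument(arg)
    return webdriver.Chrome(options=options)


def try_flag_combinations(email, password):
    """Attempt a login under each flag combination and report the outcome."""
    from linkedin_scraper import actions

    combos = [(False, False), (True, False), (False, True), (True, True)]
    for incognito, no_sandbox in combos:
        args = build_chrome_args(incognito, no_sandbox)
        driver = make_driver(args)
        try:
            actions.login(driver, email, password)
            print(args or "no flags", "-> login call returned")
        except Exception as exc:
            print(args or "no flags", "-> failed:", exc)
        finally:
            driver.quit()
```

Starting with the no-flags case matches the request above. If that run works and a flagged run fails, the flags are the likely trigger. Repeated logins in quick succession may themselves look suspicious, so spacing the attempts out is probably wise.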