I have been trying to collect posts for a hashtag. I used Selenium to gather links to all the posts for a given hashtag (for example, #backyardideas). I then wanted to keep only the posts located in the US,
and managed to do that with the following code:
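    import json
    import time

    from tqdm import tqdm
    from instascrape import Post  # assumption: Post looks like instascrape's scraper class

    # ses_id, tags (hashtag -> list of post links), driver and us_profiles
    # are defined elsewhere.
    headers = {
        "user-agent": "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Mobile Safari/537.36 Edg/87.0.664.57",
        "cookie": "sessionid={0};".format(ses_id)
    }

    for tag_name in tags:
        post_links = tags[tag_name]
        for posts in tqdm(post_links):
            try:
                post = Post(posts)
                post.scrape(headers=headers, webdriver=driver)
                time.sleep(10)  # fixed 10-second pause between posts
                # Keep only posts whose address reports a US country code.
                if 'address_json' in post.flat_json_dict:
                    address = json.loads(post.flat_json_dict['address_json'])
                    cc = address['country_code']
                    if 'us' == str(cc).lower():
                        us_profiles.append((post.username, address))
            except Exception as e:
                print(e)
                continue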
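The country check inside the loop can be pulled out into a small helper, which makes it possible to test the filter without hitting Instagram at all. This is only a minimal sketch: it assumes `address_json` is a JSON string with a `country_code` key, as in the code above, and the function name `is_us_address` is hypothetical.

```python
import json


def is_us_address(flat_json_dict):
    """Return True if the post's address_json reports country code 'US'."""
    raw = flat_json_dict.get("address_json")
    if not raw:
        # No address attached to the post.
        return False
    try:
        address = json.loads(raw)
    except (TypeError, ValueError):
        # Not a string, or not valid JSON.
        return False
    if not isinstance(address, dict):
        # e.g. the JSON literal null
        return False
    return str(address.get("country_code", "")).lower() == "us"
```

Inside the loop this would be called as `is_us_address(post.flat_json_dict)`, so a missing or malformed address no longer ends up in the generic `except` branch.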
The problem is that, even with a 10-second delay between posts, my Instagram account keeps getting blocked and asks for manual verification. Any idea how I could avoid this?
The second problem is that it keeps throwing errors.
The Selenium driver passed to post.scrape is configured as follows:

    from fake_useragent import UserAgent  # assumption: UserAgent comes from fake_useragent
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    chrome_options = Options()
    ua = UserAgent()
    userAgent = ua.random
    chrome_options.add_extension('IRM-Chrome.crx')
    chrome_options.add_argument(f'user-agent={userAgent}')
    chrome_options.add_argument("--window-size=1920,1080")
    chrome_options.add_argument("--headless")
    chrome_options.add_argument("--disable-gpu")
    chrome_options.add_argument("--disable-dev-shm-usage")
    chrome_options.add_argument("--no-sandbox")
    # `options=` replaces the deprecated `chrome_options=` keyword.
    driver = webdriver.Chrome(options=chrome_options)

    # Hide navigator.webdriver from page scripts.
    driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
        "source":
            "const newProto = navigator.__proto__;"
            "delete newProto.webdriver;"
            "navigator.__proto__ = newProto;"
    })
    driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
        "source": """
            Object.defineProperty(navigator, 'webdriver', {
                get: () => undefined
            })
        """
    })
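For the second problem, `print(e)` only shows the exception message, which makes the errors hard to report or debug. One option might be to record the exception type and full traceback per link. A minimal sketch; `run_safely` and `failures` are hypothetical names, not part of the code above:

```python
import traceback


def run_safely(func, item, failures):
    """Call func(item); on error, record the item, exception type and traceback."""
    try:
        return func(item)
    except Exception as exc:
        failures.append({
            "item": item,
            "type": type(exc).__name__,
            "message": str(exc),
            "traceback": traceback.format_exc(),
        })
        return None
```

The body of the `try` block in the loop could be moved into a function taking a post link and called through `run_safely`, so that `failures` afterwards shows which links failed and with which exception.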