Open LambertWM opened 1 year ago
I spoke too soon - in order for the scrolling and fetching more tweets to work, a small delay has to be added to the scroll function as well:
def scroll_down(driver) -> None:
    """Helps to scroll down web page"""
    try:
        start = time.time()
        body = driver.find_element(By.CSS_SELECTOR, 'body')
        for _ in range(randint(2, 4)):
            body.send_keys(Keys.PAGE_DOWN)
            time.sleep(random.uniform(0.2, 0.3))
        print("scroll_down took " + str(time.time() - start))
    except Exception as ex:
        logger.exception("Error at scroll_down method {}".format(ex))

@staticmethod
def wait_until_completion(driver) -> None:
    """waits until the page have completed loading"""
    try:
        state = ""
        start = time.time()
        while state != "complete":
            time.sleep(random.uniform(0.1, 0.2))
            state = driver.execute_script("return document.readyState")
        print("wait_until_completion() took " + str(time.time() - start))
    except Exception as ex:
        logger.exception('Error at wait_until_completion: {}'.format(ex))
This leads me to believe that wait_until_completion() doesn't really do what its name suggests. An alternative strategy, which has worked for me in the past, could be to send a PAGE_DOWN, wait a little, and keep repeating this until the document height has changed by more than a certain amount (or a timeout is reached).
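For reference, a rough sketch of that strategy is below. The function name scroll_until_height_grows and the parameters min_growth, timeout and pause (with their default values) are placeholders of mine, not anything from the library. To keep the sketch free of Selenium imports it scrolls with window.scrollBy through execute_script as a stand-in for sending Keys.PAGE_DOWN, and it assumes document.body.scrollHeight is a reasonable measure of the document height on this page.

```python
import time


def scroll_until_height_grows(driver, min_growth=500, timeout=10.0, pause=0.2):
    """Scroll until the document grows by min_growth pixels or timeout expires.

    Returns True if the page grew enough, False if the timeout was reached.
    All names and default values here are illustrative assumptions.
    """
    get_height = "return document.body.scrollHeight"
    start_height = driver.execute_script(get_height)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # Stand-in for a PAGE_DOWN key press: scroll by one viewport height.
        driver.execute_script("window.scrollBy(0, window.innerHeight);")
        # Give the page a short moment to fetch and render more content.
        time.sleep(pause)
        if driver.execute_script(get_height) - start_height >= min_growth:
            return True
    return False
```

The boolean return value lets the caller decide what to do on a timeout, for example stop fetching because the end of the timeline has probably been reached.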
I found that we spend 90% of the time in wait_until_completion(), because the delay value time.sleep(randint(3, 5)) is 3 to 5 seconds, which seems very high - why is that? time.sleep(random.uniform(0.1, 0.2)) seems more than enough for my simple tests, but maybe I'm missing something?