ChrisMuir / Zillow

Zillow Scraper for Python using Selenium

is_displayed() always False for zsg-pagination-next element. #1

Closed (nightcat closed this issue 7 years ago)

nightcat commented 7 years ago

Great repo, Chris!

Unfortunately I can only get one page of output (26 homes) for each zip code.

The problem appears to be the driver.find_element_by_class_name('zsg-pagination-next').is_displayed() call in get_html(). It always returns False for me.

I've tried the versions of selenium and chromedriver you suggested and several other newer versions using Python 3.5.2 on both Windows 7 and Ubuntu 16.04. Same problem in both cases.

I'm assuming you are not seeing the same issue. Any suggestions?

-Rick

ChrisMuir commented 7 years ago

Hi Rick,

Thanks for bringing this to my attention and for the detailed message. I tried a few different searches and I simply cannot recreate this issue (I'm on Windows 7). I should have some free time this weekend to play around with it, so I will look into it further.

One thing that may be a quick fix for you: within the function get_html(), replace both instances of driver.find_element_by_class_name('zsg-pagination-next') with driver.find_element_by_class_name('off'). That covers both the .is_displayed() call and the .click() call. I just tested it and it worked great for me. If it works for you as well, I will probably add it to the code as a backup to zsg-pagination-next.
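A rough sketch of that backup idea, for illustration only. It is not code from the repo: find_next_button and its class_names parameter are hypothetical names, and it assumes the Selenium 3 style find_element_by_class_name call used in this thread.

```python
try:
    from selenium.common.exceptions import NoSuchElementException
except ImportError:
    # Assumption: stand-in so the sketch can run without selenium installed.
    class NoSuchElementException(Exception):
        pass


def find_next_button(driver, class_names=("zsg-pagination-next", "off")):
    """Return the first displayed element matching one of class_names, else None.

    Hypothetical helper: tries each class name in order and skips any
    element that is missing or reports is_displayed() == False.
    """
    for name in class_names:
        try:
            element = driver.find_element_by_class_name(name)
        except NoSuchElementException:
            continue  # this class is not on the page, try the next one
        if element.is_displayed():
            return element
    return None
```

In get_html() this could stand in for the two direct lookups: call it once, and click the result only if it is not None.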

nightcat commented 7 years ago

Hi Chris!

I tried both 'off' and 'zsg-pagination-next' with chromedriver 2.25 and 2.28 on both Windows 7 and Ubuntu 16.04. They seemed to be interchangeable, but in all cases "is_displayed()" still returned False.

Then I replaced chromedriver with the Firefox geckodriver v0.14.0 on Ubuntu. That worked consistently well with 'zsg-pagination-next' (yeah!) but inconsistently with 'off' - though only for zip code 92260 where it would only output one page of listings. Of all the zip codes I tried manually only zip code 92260 had a full complement of 500 or more (520 in fact) listings, so I'm guessing that has something to do with the inconsistency with 'off'.

With it working for me using geckodriver, feel free to close this issue, unless you'd like me to help you keep digging into why is_displayed() for 'zsg-pagination-next' consistently returns False for me with chromedriver on both operating systems.

-Rick

ChrisMuir commented 7 years ago

Hi Rick,

I'm glad you got it up and running, though it not working with chromedriver is still odd. I'll definitely keep this open for now, as I'm planning on working with this some over the weekend. I've never used geckodriver, so I'll download that and report back if I have more insight. If you keep using the code and discover a fix, please comment here, open a new issue if necessary, or even submit a pull request.

And again, thank you for bringing this up and for the detailed feedback, it's much appreciated!

nightcat commented 7 years ago

Hi Chris!

I will definitely keep working with it and hopefully be able to figure out why is_displayed() keeps failing for me with chromedriver. With geckodriver working I'm thinking of trying to dig into the exposed element properties to see if any differences turn up. I'll let you know if I find out anything.

-Rick
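One way to do that comparison, as a minimal sketch: collect the same few properties under each driver and diff the results. describe_element is a hypothetical helper name, not something from the repo. The attributes it reads (is_displayed(), size, location, get_window_size()) are standard Selenium Python calls.

```python
def describe_element(driver, element):
    """Collect properties that may explain a False is_displayed() result.

    Hypothetical diagnostic helper: run it under chromedriver and under
    geckodriver, then compare the two dictionaries.
    """
    return {
        "displayed": element.is_displayed(),
        "size": element.size,                # dict with 'width' and 'height'
        "location": element.location,        # dict with 'x' and 'y'
        "window": driver.get_window_size(),  # dict with 'width' and 'height'
    }
```

An element that has zero size, or a window that is much narrower under one driver than the other, would be the first things to look for.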

ChrisMuir commented 7 years ago

Sounds great, thanks Rick!

ntextreme3 commented 7 years ago

If the chromedriver window isn't wide enough, the Zillow page hides results. I'm just maximizing the window on init to get around this.

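from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait


def init_driver(filepath):
    # Gets around https://github.com/ChrisMuir/Zillow/issues/1:
    # start Chrome maximized so the pagination element is not hidden.
    options = webdriver.ChromeOptions()
    options.add_argument("--start-maximized")

    # Selenium 3 style keyword arguments, as used at the time of this thread.
    driver = webdriver.Chrome(executable_path=filepath, chrome_options=options)
    driver.wait = WebDriverWait(driver, 10)
    return driver

ChrisMuir commented 7 years ago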

Thanks for posting ntextreme3.

@nightcat I was able to recreate your issue using PhantomJS. I then tried specifying a larger window size, as ntextreme3 suggested, and the issue was fixed. Can you try this window size fix with chromedriver to see if it works for you?

nightcat commented 7 years ago

Yes - this works for me for chromedriver on both Linux and Windows. Thank you @ntextreme3!

Turns out that with Firefox, although the window was not maximized, enough of the reduced-size page was shown that all of the is_displayed() tests worked.

I also found that on Linux all you have to do is call driver.maximize_window(), which works for both chromedriver and geckodriver (Firefox). Unfortunately, calling driver.maximize_window() raises an error with chromedriver on Windows.
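A possible way to cover both platforms, sketched under assumptions: try maximize_window() first and fall back to an explicit set_window_size() if it fails. ensure_wide_window and the 1920x1080 default are hypothetical choices, and the sketch assumes the Windows failure surfaces as a WebDriverException, which was not confirmed in this thread.

```python
try:
    from selenium.common.exceptions import WebDriverException
except ImportError:
    # Assumption: stand-in so the sketch can run without selenium installed.
    class WebDriverException(Exception):
        pass


def ensure_wide_window(driver, width=1920, height=1080):
    """Make the browser window wide enough for the pagination element.

    Hypothetical helper: returns "maximized" if maximize_window() worked,
    otherwise "resized" after falling back to a fixed window size.
    """
    try:
        driver.maximize_window()
        return "maximized"
    except WebDriverException:
        # Assumed failure mode for chromedriver on Windows.
        driver.set_window_size(width, height)
        return "resized"
```

It would be called once, right after the driver is created and before the first page load.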

ChrisMuir commented 7 years ago

Great.

@ntextreme3, please feel free to submit a pull request if you'd like.

Thanks everyone!

ChrisMuir commented 7 years ago

No PR yet... @nightcat, if you're interested in submitting a pull request for this issue, please feel free to do so.

ntextreme3 commented 7 years ago

Yeah, sorry about that. It looks like what I mentioned is platform dependent. I haven't been able to look into whether there's a better version this week.

ntextreme3 commented 7 years ago

Or maybe not... I may have read nightcat's comment wrong hahah