Open MarkMenagie opened 4 months ago
I was also trying out this library and had a quick look. I still get results if I use the find_past flag, but not otherwise. It seems to me that Funda changed their CSS, because on new pages I cannot find the CSS selectors specified in config.yaml. I also noticed that the ?old_ldp query parameter no longer works for new Funda entries (I'm not sure what it means, though; is it documented anywhere?). Older entries that have since been sold redirect to koop/verkocht/ pages that still have the old CSS selectors, which would explain the inconsistent results.
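For anyone who wants to verify this on a saved page, here is a minimal sketch using only the standard library. It assumes the selectors are simple single-class selectors such as `.some-class`; the selector names and the markup below are hypothetical and are not copied from config.yaml or from Funda.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collect every CSS class name that appears in an HTML document."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())


def missing_class_selectors(html, selectors):
    """Return the fields whose `.class` selector matches nothing in `html`.

    Only the first class name of each selector is checked, so this is a
    rough smoke test and not a full CSS selector engine.
    """
    collector = ClassCollector()
    collector.feed(html)
    missing = []
    for field, selector in selectors.items():
        class_name = selector.lstrip(".").split()[0]
        if class_name not in collector.classes:
            missing.append(field)
    return missing


# Hypothetical selectors and markup, for illustration only.
selectors = {"price": ".object-header__price", "address": ".object-header__title"}
old_page = (
    '<h1 class="object-header__title">Street 1</h1>'
    '<strong class="object-header__price">1000</strong>'
)
new_page = '<div class="flex gap-2"><span class="text-xl">1000</span></div>'

print(missing_class_selectors(old_page, selectors))
print(missing_class_selectors(new_page, selectors))
```

If the second call lists every field while the first lists none, the selectors are stale for the new layout and the scraper logic itself is probably fine.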
I see that a temporary workaround was recently implemented in #41 by @mpgreg, which added the ?old_ldp query param, but it seems that luck has already run out.
And the link https://www.funda.nl/huur/amsterdam/appartement-43547656-jan-van-zutphenstraat-75/?old_ldp=true
redirects to https://www.funda.nl/detail/huur/amsterdam/appartement-jan-van-zutphenstraat-75/43547656/
which doesn't have the old CSS selectors.
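To tell the two URL shapes apart in code, something like the sketch below might help. It is based only on the two URLs quoted above: it assumes that a path starting with `/detail/` means the new layout and that the listing id is the trailing number there, which may not hold for every Funda page. The function name `describe_listing_url` is made up for this example.

```python
import re
from urllib.parse import urlsplit


def describe_listing_url(url):
    """Return (is_new_layout, listing_id) for a Funda listing URL.

    Assumption: new-layout paths start with "/detail/" and end with the id;
    old-layout paths carry the id right after the property type.
    """
    path = urlsplit(url).path  # drops the query string, e.g. "?old_ldp=true"
    is_new_layout = path.startswith("/detail/")
    if is_new_layout:
        # e.g. /detail/huur/amsterdam/appartement-<street>/<id>/
        match = re.search(r"/(\d+)/?$", path)
    else:
        # e.g. /huur/amsterdam/appartement-<id>-<street>/
        match = re.search(r"/[a-z]+-(\d+)-[^/]*/?$", path)
    listing_id = match.group(1) if match else None
    return is_new_layout, listing_id


old_url = (
    "https://www.funda.nl/huur/amsterdam/"
    "appartement-43547656-jan-van-zutphenstraat-75/?old_ldp=true"
)
new_url = (
    "https://www.funda.nl/detail/huur/amsterdam/"
    "appartement-jan-van-zutphenstraat-75/43547656/"
)

print(describe_listing_url(old_url))
print(describe_listing_url(new_url))
```

Running this on the final URL after redirects (rather than the requested one) would show which listings ended up on the new layout.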
When will this be solved? I'm missing a lot of data now.
I made a patch here, feel free to pull: https://github.com/whchien/funda-scraper/pull/50
Many thanks for creating this patch! Unfortunately, it doesn't seem to work for me. I'm not sure whether I applied it correctly: I pulled the code from the repo and only changed the config.yaml file. The difference is that the scraper now does recognise the selling price, but the other fields are still not captured. Any ideas what could cause that? Perhaps the Funda CSS has changed again?
Since yesterday evening, scraper.run() results in an empty dataframe, even though the code finds and fetches new links. When debugging, I noticed that self.raw_df is being filled with NA values within the scrape_pages function. It did work before, so did something change on the website or is this just me?
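To narrow down which selectors are failing, it may help to look at how often each field comes back empty. Below is a rough sketch on plain dicts; the field names and sample rows are invented for illustration and are not the real columns of raw_df.

```python
import math


def is_missing(value):
    """Treat None, empty strings and float NaN as missing."""
    if value is None or value == "":
        return True
    return isinstance(value, float) and math.isnan(value)


def missing_fraction(rows):
    """Return {field: fraction of rows in which that field is missing}."""
    if not rows:
        return {}
    fields = sorted({key for row in rows for key in row})
    return {
        field: sum(is_missing(row.get(field)) for row in rows) / len(rows)
        for field in fields
    }


# Hypothetical scraped rows: the price is found, the other fields are not.
rows = [
    {"price": "450000", "address": None, "living_area": None},
    {"price": "2000", "address": None, "living_area": ""},
]

report = missing_fraction(rows)
for field, fraction in report.items():
    print(f"{field}: {fraction:.0%} missing")
```

A field that is 100% missing points at a stale selector, while a field that is only sometimes missing is more likely a per-listing difference. On a pandas DataFrame, `df.isna().mean()` gives the same per-column fraction.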