whchien / funda-scraper

FundaScraper scrapes data from Funda, the Dutch housing website. You can find listings from the house-buying or rental market, as well as historical data. 🏡
GNU General Public License v3.0
104 stars, 48 forks

Increased page limit and improved postal code accuracy from PC4 to PC6 #3

Closed by deKeijzer 1 year ago

deKeijzer commented 1 year ago

I ran into the hard limit of 999 pages in the scrape.py file. To address this, I have raised the limit to a maximum of 10,000 pages.
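For illustration only, here is a minimal sketch of what a configurable page cap could look like. The names `MAX_PAGES`, `clamp_page_count`, and `page_numbers` are hypothetical and are not taken from scrape.py; the sketch only shows the idea of replacing a hard-coded 999 with a larger, adjustable bound.

```python
# Hypothetical upper bound; the previous hard limit was 999.
MAX_PAGES = 10_000


def clamp_page_count(requested: int, max_pages: int = MAX_PAGES) -> int:
    """Return the requested page count, capped at max_pages."""
    if requested < 1:
        raise ValueError("requested page count must be at least 1")
    return min(requested, max_pages)


def page_numbers(requested: int, max_pages: int = MAX_PAGES) -> range:
    """Return the 1-based page numbers that would be scraped."""
    return range(1, clamp_page_count(requested, max_pages) + 1)
```

With the old cap passed explicitly, `page_numbers(1500, max_pages=999)` stops at page 999, while the default bound lets all 1500 pages through.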

In addition, I noticed that the code only used PC4 (the four-digit part of the postal code) for zip code information, even though PC6 information is available. To make use of it, I have modified the preprocess.py file so that the full PC6 code (e.g. 1234 AB) can be saved as the zip code instead of just the PC4 part (1234).
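As a rough sketch of the PC4 versus PC6 distinction, the snippet below pulls a Dutch postal code out of an address string with a regular expression. The function name `extract_zip`, the `level` parameter, and the sample address are assumptions made for illustration; this is not the code in preprocess.py.

```python
import re
from typing import Optional

# Dutch postal code: four digits (no leading zero), optional space, two letters.
PC6_PATTERN = re.compile(r"\b([1-9]\d{3})\s?([A-Za-z]{2})\b")


def extract_zip(address: str, level: str = "pc6") -> Optional[str]:
    """Return the postal code at PC6 ("1234 AB") or PC4 ("1234") precision.

    Returns None when no full postal code is found in the address.
    """
    if level not in ("pc4", "pc6"):
        raise ValueError("level must be 'pc4' or 'pc6'")
    match = PC6_PATTERN.search(address)
    if match is None:
        return None
    digits, letters = match.group(1), match.group(2).upper()
    # PC4 is simply the numeric prefix of the PC6 code.
    return digits if level == "pc4" else f"{digits} {letters}"
```

For example, `extract_zip("Kalverstraat 1, 1012 NX Amsterdam")` gives the PC6 form, and passing `level="pc4"` keeps only the four digits.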

These changes will allow users to perform searches across multiple cities and provide more detailed location information. Thank you for considering my pull request.