digital-engineering / airbnb-scraper

Airbnb Scraper: Advanced Airbnb Search using Scrapy
GNU General Public License v3.0
190 stars 66 forks

Spider stops before finishing #22

Closed LoganD19 closed 1 year ago

LoganD19 commented 1 year ago

This may be out of my beginner knowledge, but I only get a fraction of the Airbnb records before the spider stops. For example, I get on average 250-280 records. I ran a query for "Boone, NC" and another for just "NC", and I got about the same number of records, but for completely different listings. I know there are more records even for the city, because when I search on Airbnb for Boone, NC I get 1000+ results. Any idea how this might be fixed? I can drop any files that I am working with here if needed. This is a great script, and other than this issue it works awesome.

JovaniTarnowski commented 1 year ago

Hello, @LoganD19 Can you send me the error that you are getting on the terminal? Maybe with more information I can help.

LoganD19 commented 1 year ago

Sorry for the delay in reply. I will upload some info later this afternoon

LoganD19 commented 1 year ago

I believe I am getting an "Error 5" Here's a time lapse screen recording I took of the two commands I ran, and here are the CSV outputs of both. Stupidly I closed the terminal and I don't believe there is a log file I can pull from. Thanks for your help!

boone.csv nc_all.csv

https://youtu.be/lRdwIfGhJJ0

digitalengineering commented 1 year ago

Hi @LoganD19, thank you for the detailed comment!

Although Airbnb may indicate that there are 1000+ records for a given search, you will notice that it only provides 15 pages of results, each with 20 records per page. 20 x 15 = 300.

In other words, the maximum number of results Airbnb will return for a given search is 300.

Without seeing your settings.py, it's not possible to know what other filter settings you might have in place, but 20-50 records could be filtered out of the 300, leaving you with 250-280 records.

As a way to get around the 300 record limit, you could try systematically running more fine-grained searches, e.g. min_price=50 and max_price=60, then min_price=60 and max_price=70, then min_price=70 and max_price=80, etc.
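The approach above can be sketched in a short script: generate a series of non-overlapping price windows and run one crawl per window, so each search stays under the ~300-result cap. This is a minimal sketch; the spider name (`airbnb`) and the `-a` argument names (`query`, `min_price`, `max_price`) are assumptions here and should be adjusted to match this project's actual spider and settings.

```python
import subprocess

def price_bands(start, stop, step):
    """Yield (min_price, max_price) windows covering [start, stop)."""
    for lo in range(start, stop, step):
        yield lo, lo + step

def run_searches(query, start=50, stop=500, step=10):
    """Run one crawl per price window so each individual search
    returns fewer than Airbnb's ~300-result maximum."""
    for lo, hi in price_bands(start, stop, step):
        # NOTE: spider name and -a argument names are assumed;
        # check this repo's spider for the real ones.
        subprocess.run([
            "scrapy", "crawl", "airbnb",
            "-a", f"query={query}",
            "-a", f"min_price={lo}",
            "-a", f"max_price={hi}",
            "-o", f"results_{lo}_{hi}.csv",
        ], check=True)
```

You could then call `run_searches("Boone, NC")` and concatenate the resulting CSV files, de-duplicating on the listing ID, since a listing priced exactly at a boundary could appear in two adjacent windows depending on how Airbnb applies the filter.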