Open data1111 opened 6 years ago
hi, same problem here!
It appears that Airbnb no longer sends a JSON response with the necessary information. To make it work now, you will have to update the locators to get the information from the HTML (using XPath or CSS selectors). You will also have to use Splash, since some of the elements are not loaded when the page is requested by Scrapy alone.
I'm having this same issue. I have no idea how to implement the update that idedaandre suggests. Any help would be awesome.
Me too... this is the output:
instant_book,satisfaction_guest,rating_checkin,bed_type,person_capacity,accuracy_rating,rating_communication,room_type,hosting_id,url,amenities,rev_count,cancel_policy,rating_cleanliness,nightly_price,host_id,response_rate,price,response_time
,,,,,,,,,https://www.airbnb.com/rooms/993348?location=greece,,,,,,,,,
,,,,,,,,,https://www.airbnb.com/rooms/661755?location=greece,,,,,,,,,
,,,,,,,,,https://www.airbnb.com/rooms/2107937?location=greece,,,,,,,,,
,,,,,,,,,https://www.airbnb.com/rooms/659712?location=greece,,,,,,,,,
,,,,,,,,,https://www.airbnb.com/rooms/17428493?location=greece,,,,,,,,,
,,,,,,,,,https://www.airbnb.com/rooms/15064259?location=greece,,,,,,,,,
,,,,,,,,,https://www.airbnb.com/rooms/10983314?location=greece,,,,,,,,,
,,,,,,,,,https://www.airbnb.com/rooms/3455118?location=greece,,,,,,,,,
,,,,,,,,,https://www.airbnb.com/rooms/526402?location=greece,,,,,,,,,
,,,,,,,,,https://www.airbnb.com/rooms/2610077?location=greece,,,,,,,,,
,,,,,,,,,https://www.airbnb.com/rooms/283638?location=greece,,,,,,,,,
,,,,,,,,,https://www.airbnb.com/rooms/5283277?location=greece,,,,,,,,,
,,,,,,,,,https://www.airbnb.com/rooms/2670085?location=greece,,,,,,,,,
,,,,,,,,,https://www.airbnb.com/rooms/14349663?location=greece,,,,,,,,,
,,,,,,,,,https://www.airbnb.com/rooms/7027819?location=greece,,,,,,,,,
,,,,,,,,,https://www.airbnb.com/rooms/12783254?location=greece,,,,,,,,,
,,,,,,,,,https://www.airbnb.com/rooms/1192594?location=greece,,,,,,,,,
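For anyone who wants to confirm they are hitting the same symptom, a quick check along these lines should list the rows where only the URL was scraped. It uses just the standard library. The `rows_with_only_url` helper is a name I made up, and the sample below uses a shortened header rather than the full set of columns from the output above.

```python
import csv
import io


def rows_with_only_url(csv_text, url_field="url"):
    """Return the URLs of rows where every field except the URL is empty."""
    reader = csv.DictReader(io.StringIO(csv_text))
    only_url = []
    for row in reader:
        # Collect every value that does not belong to the URL column
        others = [value for key, value in row.items() if key != url_field]
        if row.get(url_field) and not any(others):
            only_url.append(row[url_field])
    return only_url


# Shortened, illustrative header; the real export has more columns.
sample = (
    "instant_book,room_type,url,price\n"
    ",,https://www.airbnb.com/rooms/993348?location=greece,\n"
    ",,https://www.airbnb.com/rooms/661755?location=greece,\n"
)

urls = rows_with_only_url(sample)
print(len(urls), "rows have only the URL filled in")
```

To run it on a real export, read the file contents and pass them in, for example `rows_with_only_url(open("items.csv", newline="").read())`, where `items.csv` stands in for whatever the output file is called.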
What I would recommend is using Splash + Scrapy (if you google "Splash with Scrapy" there should be enough documentation on how to set it up properly). After you set up Splash + Scrapy, use CSS selectors to get the data from the pages, since there's no longer a convenient .json to pull the data from.
Hopefully, this can help the setup:
https://github.com/scrapy-plugins/scrapy-splash
https://blog.scrapinghub.com/2015/03/02/handling-javascript-in-scrapy-with-splash/
Cheers
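For reference, the scrapy-splash README linked above describes project settings roughly like the following. Treat this as a sketch and check the README for the current values. It assumes a Splash instance is already running locally on port 8050; adjust `SPLASH_URL` if yours runs elsewhere.

```python
# settings.py (sketch based on the scrapy-splash README)

# Assumption: Splash is reachable locally on its default port.
SPLASH_URL = "http://localhost:8050"

# Route requests through Splash and keep cookies and compression working.
DOWNLOADER_MIDDLEWARES = {
    "scrapy_splash.SplashCookiesMiddleware": 723,
    "scrapy_splash.SplashMiddleware": 725,
    "scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware": 810,
}

# Avoids storing duplicate Splash arguments with every queued request.
SPIDER_MIDDLEWARES = {
    "scrapy_splash.SplashDeduplicateArgsMiddleware": 100,
}

# Splash-aware duplicate filter and HTTP cache storage.
DUPEFILTER_CLASS = "scrapy_splash.SplashAwareDupeFilter"
HTTPCACHE_STORAGE = "scrapy_splash.SplashAwareFSCacheStorage"
```

With that in place, the spider would yield `SplashRequest(url, self.parse, args={"wait": 0.5})` (imported from `scrapy_splash`) instead of a plain `scrapy.Request`, so the callback receives the rendered HTML. The `wait` value here is only an example.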
Hi there,
First of all, thanks for developing this code.
I'm having trouble with Scrapy and the JSON items. I got it to scrape the pages I wanted, but when I open the CSV file it only contains the URLs, not the other items... What do you suggest?
Cheers