dtrungtin / actor-booking-scraper

Actor for extracting data about hotels from Booking.com
https://apify.com/dtrungtin/booking-scraper
Apache License 2.0
17 stars, 19 forks

Actor does not get all results from the booking website. #74

Open nicolaefilat opened 1 year ago

nicolaefilat commented 1 year ago

I am trying to scrape information using this actor, but it does not return all the results available on Booking for one URL. The URL given to the actor is the one listed under startUrls in the input below.

Booking says that there are 1571 results on that page.

The input given to the actor is the following:

{
  "currency": "USD",
  "debug": true,
  "extendOutputFunction": "($) => { return {} }",
  "language": "en-us",
  "maxPages": 100,
  "minMaxPrice": "0-999999",
  "proxyConfig": {
    "useApifyProxy": true
  },
  "scrapeReviewerName": false,
  "search": "BRASOV",
  "simple": true,
  "sortBy": "class_asc",
  "startUrls": [
    {
      "url": "https://www.booking.com/searchresults.en-us.html?ss=Bra%C5%9Fov%2C+Brasov%2C+Romania&ssne=Boto%C5%9Fani&ssne_untouched=Boto%C5%9Fani&efdco=1&label=gen173nr-1FCAEoggI46AdIM1gEaMABiAEBmAExuAEXyAEM2AEB6AEB-AEDiAIBqAIDuALB94ukBsACAdICJDUxMmRhZWJjLWNlMjItNDNiYi05OGQzLWRiZGY1YmZiMTU5MNgCBeACAQ&aid=304142&lang=en-us&sb=1&src_elem=sb&src=searchresults&dest_id=-1153613&dest_type=city&ac_position=0&ac_click_type=b&ac_langcode=en&ac_suggestion_list_length=5&search_selected=true&search_pageview_id=84f548bb140f009a&ac_meta=GhA4NGY1NDhiYjE0MGYwMDlhIAAoATICZW46BmJyYXNvdkAASgBQAA%3D%3D&group_adults=2&no_rooms=1&group_children=0&sb_travel_purpose=leisure"
    }
  ],
  "testProxy": false,
  "useFilters": true, // this should get more than 1000 results according to documentation
  "destType": "city",
  "propertyType": "none",
  "checkIn": "",
  "checkOut": "",
  "rooms": 1,
  "adults": 2,
  "children": 0,
  "maxReviews": 25
}

Unfortunately, the actor returns only 795 results, roughly half of the 1571 that Booking reports.

The Apify run is here if you want to inspect the configuration in more detail.

I have also noticed that the same configuration of the actor can give different results for consecutive runs. Why can that happen?
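
One way to narrow down the run-to-run differences would be to diff the exported results of two runs. The snippet below is only a sketch under assumptions: it supposes each result item has a `url` field that identifies the property (with a query string that can be ignored), and the sample items are made up for illustration, not taken from a real run.

```python
from typing import Iterable


def index_by_url(items: Iterable[dict]) -> dict:
    """Map each item's URL (query string stripped) to the item.

    Assumption: the `url` field identifies a property. Items without
    a URL are skipped, and later duplicates overwrite earlier ones.
    """
    indexed = {}
    for item in items:
        url = item.get("url")
        if url:
            indexed[url.split("?", 1)[0]] = item
    return indexed


def compare_runs(run_a: Iterable[dict], run_b: Iterable[dict]) -> dict:
    """Report unique counts and the URLs present in only one of the runs."""
    a = index_by_url(run_a)
    b = index_by_url(run_b)
    return {
        "unique_a": len(a),
        "unique_b": len(b),
        "only_in_a": sorted(a.keys() - b.keys()),
        "only_in_b": sorted(b.keys() - a.keys()),
    }


# Made-up sample data standing in for two exported datasets.
run_1 = [
    {"url": "https://example.com/hotel/a.html?checkin=1", "name": "A"},
    {"url": "https://example.com/hotel/b.html", "name": "B"},
    {"url": "https://example.com/hotel/a.html?checkin=2", "name": "A"},  # duplicate of A
]
run_2 = [
    {"url": "https://example.com/hotel/b.html", "name": "B"},
    {"url": "https://example.com/hotel/c.html", "name": "C"},
]

report = compare_runs(run_1, run_2)
print(report)
```

If the unique count for a run is lower than its raw item count, the run contains duplicates. If the `only_in_*` lists are non-empty, the two runs really did see different properties.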

Thank you in advance for looking into this issue.