flathunters / flathunter

A bot to help people with their rental real-estate search. 🏠🤖
GNU Affero General Public License v3.0

Any way to avoid/delay the ImmoScout24 plus-only properties? #329

Open marcodallagatta opened 1 year ago

marcodallagatta commented 1 year ago

As the title says, it would be nice to have an option to filter out IS24 Plus-only ads, or perhaps to delay the notification by 24h and send it only if the ad is still up. (Unlikely, I know.)

codders commented 1 year ago

Hi @marcodallagatta ,

You might need to provide a bit more detail about the dynamics of IS24 posts.

marcodallagatta commented 1 year ago

Hi @codders, so:

Thank you for considering this.

codders commented 1 year ago

Okay. I had a quick look. For ImmoScout listings, we actually extract the data directly from the JavaScript model - we don't scrape the DOM. You can see the loop here that processes the IS24.resultList object (visible from the JavaScript console in your browser):

https://github.com/flathunters/flathunter/blob/main/flathunter/crawl_immobilienscout.py#L102-L107

IS24 Plus ads have the property privateOffer: "true" set. If you check entry.get("privateOffer", "false") == "true" here:

https://github.com/flathunters/flathunter/blob/main/flathunter/crawl_immobilienscout.py#L109-L140

that will allow you to detect those offers. You just need to return None in that case from extract_entry_from_javascript, and filter the Nones out of the comprehension in get_entries_from_json.
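
A minimal sketch of that filtering step. It does not reproduce the real crawler code: `extract_entry` and `get_entries` are simplified, hypothetical stand-ins for extract_entry_from_javascript and get_entries_from_json, and the `@id` and `title` keys are made up for illustration. Whether privateOffer really marks Plus-only listings is taken from this comment as an assumption.

```python
from typing import Optional


def extract_entry(entry: dict) -> Optional[dict]:
    """Hypothetical stand-in for extract_entry_from_javascript."""
    # Assumption from this comment: privateOffer == "true" marks the
    # listings we want to skip. The value is a string, not a boolean.
    if entry.get("privateOffer", "false") == "true":
        return None
    # Illustrative fields only; the real function extracts many more.
    return {"id": entry.get("@id"), "title": entry.get("title", "")}


def get_entries(raw_entries: list) -> list:
    """Hypothetical stand-in for get_entries_from_json."""
    extracted = (extract_entry(entry) for entry in raw_entries)
    # Drop the entries that extract_entry chose to skip.
    return [e for e in extracted if e is not None]


sample = [
    {"@id": "1", "title": "Flat A", "privateOffer": "true"},
    {"@id": "2", "title": "Flat B", "privateOffer": "false"},
    {"@id": "3", "title": "Flat C"},  # flag missing: treated as "false"
]
kept = get_entries(sample)
```

Entries without the flag fall back to the "false" default and are kept, so a change in the upstream data would fail open rather than silently hide every listing.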

Pull requests welcome! :)

marvinsxtr commented 1 year ago

Hi,

privateOffer actually seems to be the flag indicating whether the offer is from a private person or from an agency.

However, the /expose page of each offer contains very detailed info on the premium content:

restrictedListing: {
  restrictedEndTime: "2 d 10 h",
  paywallBuyPlusListing: {
    enabled: false,
    isPaywallActive: true,
    premiumProfileRequired: false
  },
  exclusivePresaleListing: {
    enabled: false,
  },
  exclusivePresaleBasicListing: {
    enabled: true
  }
},
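
A rough sketch of how such an object could be checked once it has been parsed into a Python dict. The thread does not establish which of these flags actually marks a Plus-only listing, so treating any active paywall or presale flag as "restricted" is an assumption, and `is_restricted` is a hypothetical helper, not part of flathunter.

```python
def is_restricted(restricted_listing: dict) -> bool:
    """Return True if any paywall or presale flag is set.

    Assumption: any of these flags being true means the listing is
    not yet available to non-Plus users.
    """
    paywall = restricted_listing.get("paywallBuyPlusListing", {})
    presale = restricted_listing.get("exclusivePresaleListing", {})
    presale_basic = restricted_listing.get("exclusivePresaleBasicListing", {})
    return bool(
        paywall.get("enabled")
        or paywall.get("isPaywallActive")
        or presale.get("enabled")
        or presale_basic.get("enabled")
    )


# The object quoted above, written as a Python dict.
example = {
    "restrictedEndTime": "2 d 10 h",
    "paywallBuyPlusListing": {
        "enabled": False,
        "isPaywallActive": True,
        "premiumProfileRequired": False,
    },
    "exclusivePresaleListing": {"enabled": False},
    "exclusivePresaleBasicListing": {"enabled": True},
}
restricted = is_restricted(example)
```

Because this data lives on the /expose page rather than in the result list, using it would mean one extra request per listing.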