flathunters / flathunter

A bot to help people with their rental real-estate search. 🏠🤖
GNU Affero General Public License v3.0
830 stars, 179 forks

Kleinanzeigen temporarily blocked #510

Closed czrty closed 8 months ago

czrty commented 9 months ago

Hi everyone, this is great work, thank you. I made a mistake and ran it from two different folders with different URLs; both filters (URLs) are for Kleinanzeigen. Because the two instances ran simultaneously, the website was queried more often than the configured interval of 600 seconds allows.

I got the message that the IP range is temporarily blocked. The problem is that the block persists even after I changed my IP (I used my phone as a hotspot and got a different address). I am not an IT expert, but can we assume that the website is not blocking the IP range but rather some property of the crawler, such as headers, fingerprinting, or other information? If so, how can it be fixed? Thank you.
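
One way to keep two instances from jointly exceeding the configured interval is to share the time of the last request between them. The following is a minimal sketch and not flathunter code: the helper names `seconds_until_allowed` and `record_request` and the idea of a shared stamp file are assumptions made for illustration, and the file access is not atomic, so a real setup would want a lock around it.

```python
from pathlib import Path


def seconds_until_allowed(stamp_file: Path, min_interval: float, now: float) -> float:
    """Return how many seconds remain before another request may be sent."""
    try:
        last_request = float(stamp_file.read_text())
    except (FileNotFoundError, ValueError):
        # No usable timestamp yet: nothing has been requested so far.
        return 0.0
    return max(0.0, last_request + min_interval - now)


def record_request(stamp_file: Path, now: float) -> None:
    """Store the time of the request so other processes can see it."""
    stamp_file.write_text(repr(now))
```

Each instance would call `seconds_until_allowed` with the same file and `time.time()` before polling, sleep for the returned duration, and then call `record_request`.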

codders commented 9 months ago

Hi @czrty ,

Thanks for your comment and the issue. I don't know exactly how the blocking works. We're using the requests-random-user-agent library to change the user agent for each request, and if you're changing your IP address then the only common elements are the request headers and of course the URL. You could experiment to see if a different search URL gets around the blocking, or you could check to see if you can access the page normally in your web browser.
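
To illustrate why rotating only the user agent may not be enough: any header that stays identical across requests remains a stable signature. The sketch below uses made-up header values, not anything captured from flathunter, and `constant_headers` is a hypothetical helper.

```python
def constant_headers(requests_sent: list[dict[str, str]]) -> dict[str, str]:
    """Return the headers whose name and value are identical in every request."""
    if not requests_sent:
        return {}
    first, *rest = requests_sent
    return {
        name: value
        for name, value in first.items()
        if all(other.get(name) == value for other in rest)
    }


# Illustrative values only (assumption): two requests with rotated user agents.
sent = [
    {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64)", "Accept": "*/*", "Connection": "keep-alive"},
    {"User-Agent": "Mozilla/5.0 (Windows NT 10.0)", "Accept": "*/*", "Connection": "keep-alive"},
]
print(constant_headers(sent))
```

In this example everything except `User-Agent` is reported as constant, which is the kind of common element a site could match on.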

Relax594 commented 9 months ago

It is not about user agents or proxies. Accessing the page manually works, so it is probably something in the request headers. Other repos are facing the same issue at the moment.

codders commented 9 months ago

Okay. Then it might be time to switch Kleinanzeigen over to use the headless browser that we have for Immoscout. That should make it much harder to detect.
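
As a rough idea of what fetching a page through a headless browser looks like, here is a sketch using plain Selenium. It is not flathunter's actual Immoscout crawler code; `make_headless_chrome` and `fetch_html` are hypothetical names, and it assumes Selenium and a recent Chrome are installed.

```python
from typing import Any, Callable


def make_headless_chrome() -> Any:
    """Start a headless Chrome via Selenium (assumes selenium and Chrome are installed)."""
    from selenium import webdriver  # imported lazily so the sketch loads without it

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    return webdriver.Chrome(options=options)


def fetch_html(url: str, browser_factory: Callable[[], Any] = make_headless_chrome) -> str:
    """Load a page in a browser and return the rendered HTML."""
    browser = browser_factory()
    try:
        browser.get(url)
        return browser.page_source
    finally:
        # Always close the browser, even if loading the page fails.
        browser.quit()
```

Passing the browser in through a factory keeps the fetch logic testable without launching Chrome, and would make it easy to swap in a different driver later.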

czrty commented 9 months ago

it is still blocked :)

codders commented 9 months ago

But on the bright side, still accepting pull requests!


czrty commented 9 months ago

Do you know whether changes to kleinanzeige.py are planned?

codders commented 9 months ago

Yes, I mean - I would love to do that, and I don't think it takes very long. I am very busy right now and don't have time to make the change, but anyone who wants to look at the Immoscout scraper and try and implement the same functionality in the Kleinanzeigen scraper is very welcome to do so.

czrty commented 8 months ago

Is a wait time of 600 seconds still recommended for Kleinanzeigen if it uses the headless browser like Immoscout?

codders commented 8 months ago

That's hard to say. If they have changed their bot detection, we won't know what the new rules are until we try them. If you're not worried about being blocked for a couple of days, you could reduce the time and see what happens.

Do you have code that's ready for review? Do you want to push your branch up to your fork and make a pull request so that I can take a look?

Thanks,

Arthur


czrty commented 8 months ago

Hi Arthur, I will not take the risk; they don't block for just a few days but permanently :)

I just learnt Python last week to see if I could update the code. I adapted it along the lines of the Immoscout crawler, but I am not sure the code is good enough to merge.

Besides, I spent days learning Python and making the update, and I don't want someone to get greedy, set the interval too low, and get us blocked again. That would be painful after investing that much time :)

codders commented 8 months ago

Hey @czrty ,

I see you closed this ticket - can you share your modified code?

codders commented 8 months ago

Merged in #511 from @Dimfred. This seems to fix the issue. I've deployed an updated image to the hosted flathunter instance. Thanks @Dimfred!