vinc3PO / ebayKleinanzeigenAlert

Telegram Alert for ebay kleinanzeigen posts
MIT License
57 stars, 15 forks

"webpage fetching error for url" #39

Open Kamran12646 opened 9 months ago

Kamran12646 commented 9 months ago

Hello dear community, Hello dear vinc3po, Since 5 p.m. I've been getting the following error with the ebAlert bot: "webpage fetching error for url:" Has the HTML structure of Kleinanzeigen really been changed again in this short time or is it due to my Raspberry Pi 4?

makedamnsure commented 9 months ago

Same for me on Win10 since today.

Pelicana commented 9 months ago

Same issue here :) running locally on windows, so it shouldn't be your raspberry pi.

Kamran12646 commented 9 months ago

Did anyone find a solution yet?

Pelicana commented 9 months ago

nope, would love to know too.

DanielZ3108 commented 8 months ago

Same issue here @ Intel NUC (Proxmox)

henengel commented 8 months ago

same for me on MacOS..

vinc3PO commented 8 months ago

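Here is the hidden message: 'In deinem IP-Bereich kam es vor Kurzem mehrfach zu unsicheren Versuchen, unsere Plattform zu verwenden. Dies kann auch durch andere Personen erfolgt sein. Daher wurde dieser IP-Bereich zur Vorbeugung von Betrug zeitweilig von der Nutzung von Kleinanzeigen ausgeschlossen. Bitte versuche es später erneut.' (In English: 'There have recently been several insecure attempts to use our platform from your IP range. These may also have been made by other people. To prevent fraud, this IP range has therefore been temporarily excluded from using Kleinanzeigen. Please try again later.') It seems that they are implementing some sort of anti-scalping security. I'll investigate whether this can be bypassed somehow.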

I'll keep you updated
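To tell this block page apart from an ordinary parsing failure, something like the following might help. It is only a minimal sketch: the function name is hypothetical and not part of ebAlert, the marker string is copied from the message quoted above, and the 403 check is an assumption based on the status code reported later in this thread.

```python
# Hypothetical helper, not part of ebAlert.
# The marker text is taken from the block message quoted above.
BLOCK_MARKER = "zeitweilig von der Nutzung von Kleinanzeigen ausgeschlossen"


def looks_ip_blocked(status_code: int, body: str) -> bool:
    """Return True when a response looks like the IP-range block page."""
    # Assumption: the block is served with HTTP 403 and/or the marker text.
    return status_code == 403 or BLOCK_MARKER in body
```

A caller could log a distinct "IP range blocked" message when this returns True instead of the generic fetching error.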

Araibona commented 8 months ago

Any news?

max49944 commented 8 months ago

Unfortunately nothing yet.

alafad commented 7 months ago

I tried changing the User-Agent header, but unfortunately that didn't solve it.

max49944 commented 7 months ago

I don't think there will be a solution anymore; unfortunately, I think the project has been put on ice. No one seems to care about it.

workinghard commented 7 months ago

Best advice is to fetch less frequently and at random intervals.
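A minimal sketch of what that could look like, assuming the bot is run in a loop that sleeps between fetches. The function name, the 300-second base interval and the 50% jitter are illustrative assumptions, not values from ebAlert.

```python
import random


def next_delay(base_seconds: float = 300.0, jitter: float = 0.5, rng=random) -> float:
    """Return a randomized wait time around base_seconds.

    With jitter=0.5 the result is drawn uniformly from
    [0.5 * base_seconds, 1.5 * base_seconds].
    """
    low = base_seconds * (1.0 - jitter)
    high = base_seconds * (1.0 + jitter)
    return rng.uniform(low, high)
```

Calling `time.sleep(next_delay())` between runs would space the requests out unevenly instead of hitting the site on a fixed schedule.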

alafad commented 7 months ago

> I don't think there will be a solution anymore; unfortunately, I think the project has been put on ice. No one seems to care about it.

I don't think this is really a bug. It should work again with different headers and, for high-frequency usage, with proxied requests.

vinc3PO commented 7 months ago

Indeed, the problem is not the header. If you try from the same IP address with similar headers in Postman, the request goes through. With Python requests, however, it does not. I have an old version that still works, but the new ones do not. I tried downgrading to the version that works, without success. I have been too busy to fully investigate the error. My next step is to try without the requests library, as I believe that might be the problem: eBay recognises the client as Python and assumes it is a bot.

To be continued...
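For anyone who wants to run the same experiment, here is a rough sketch of a fetch that uses only the standard library instead of requests. The header values and function names are assumptions chosen for illustration and are not taken from ebAlert; whether the site accepts such a request is untested.

```python
import urllib.request

# Assumed, illustrative header values; not taken from ebAlert.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:124.0) Gecko/20100101 Firefox/124.0",
    "Accept": "text/html,application/xhtml+xml",
    "Accept-Language": "de-DE,de;q=0.9,en;q=0.8",
}


def build_request(url: str) -> urllib.request.Request:
    """Create a GET request carrying browser-like headers."""
    return urllib.request.Request(url, headers=BROWSER_HEADERS)


def fetch_html(url: str, timeout: float = 15.0) -> str:
    """Fetch url and return the decoded body.

    urlopen raises urllib.error.HTTPError for responses such as 403.
    """
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Comparing the result of `fetch_html(...)` with the same URL fetched through requests would show whether the library itself makes a difference.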

svenisda commented 6 months ago

Hey man, did you find any solution to bypass the detection? In the past I made my own monitor, which worked. I wanted to reactivate it today, but I only get a 403 and the same message.

I personally use Discord for notifications and a simple proxy function to monitor multiple URLs at once. Maybe these would be good features for yours too.

Would share my code but it is really shitty because I’m not a good coder 😄

Zippochonda commented 6 months ago

I would also appreciate a solution, is there any way to help?

svenisda commented 6 months ago

> I would also appreciate a solution, is there any way to help?

I will try other scraping methods instead of bs4 today. If I find one, I will reply. Maybe we can all connect and build a super Kleinanzeigen monitor 😁

vinc3PO commented 6 months ago

Good news everyone,

It seems that updating to the latest requests and urllib3 makes it work again: requests==2.31.0, urllib3==2.2.0

Not sure how long it will take eBay to shut it down again, but for the time being it should work.

pip install --upgrade requests urllib3

Let me know if that works for you as well.
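To check which versions are actually installed in the environment the bot runs in, a small sketch like the one below may be useful. The minimum versions are the ones reported in this comment; the function names are hypothetical and the version parsing is deliberately simplistic (it ignores pre-release suffixes).

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum versions as reported in the comment above.
MINIMUMS = {"requests": (2, 31, 0), "urllib3": (2, 2, 0)}


def parse_version(text: str) -> tuple:
    """Turn '2.31.0' into (2, 31, 0); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    # Pad so that '2.2' compares equal to '2.2.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def outdated_packages(minimums=MINIMUMS) -> list:
    """Return names of packages that are missing or older than the minimum."""
    outdated = []
    for name, minimum in minimums.items():
        try:
            installed = parse_version(version(name))
        except PackageNotFoundError:
            outdated.append(name)
            continue
        if installed < minimum:
            outdated.append(name)
    return outdated
```

Running `print(outdated_packages())` with the same interpreter that starts ebAlert (for example under sudo, if that is how the bot is launched) shows whether the upgrade reached that environment.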

beleza-pura commented 6 months ago

For me it is still not working after the update of requests and urllib3. I was also experimenting with different headers, but no solution so far.

beleza-pura commented 6 months ago

I've been investigating a little. Looks like they introduced Akamai Bot Protection. Mainly there are two things keeping the bot from working properly:

⚠️ FingerprintJS detected: 
https://static.kleinanzeigen.de/static/js/top.yt20r2l2bahn.js

⚠️ Akamai detected: 
https://www.kleinanzeigen.de/akam/13/435259e7

More information about this bot protection can be found here. My quick workaround was to use the trial version of the ZenRows API service to bypass the anti-bot protection. But since I will have to pay for it in the future, I'll try to figure something out on my own.

svenisda commented 6 months ago

Found a very good workaround using playwright and Chromium. I can use it as a headless browser, so it works in the background and you don't have to pay for any Akamai-solving service.

I have some contacts who sell Akamai, PX and other APIs, but why pay when playwright works 😊
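The code for this workaround was not posted, so the following is only a sketch of what a headless Chromium fetch with playwright's synchronous API could look like. The function name and the timeout are assumptions, and it presumes playwright is installed (`pip install playwright`, then `playwright install chromium`). Whether it gets past the protection at any given time is not guaranteed.

```python
def fetch_rendered_html(url: str, timeout_ms: int = 30000) -> str:
    """Load url in headless Chromium and return the rendered HTML."""
    # Imported lazily so this module still loads where playwright is missing.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            # Playwright timeouts are given in milliseconds.
            page.goto(url, timeout=timeout_ms)
            return page.content()
        finally:
            browser.close()
```

The returned HTML could then be passed to the existing bs4 parsing in place of the body fetched with requests.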

makedamnsure commented 6 months ago

> It seems that updating to the latest requests and urllib3 makes it work again.

For me this fixes the problem on my work machine (Win10 Home), but unfortunately not on my thin client (Win10 IoT LTSC 21H2).

Zippochonda commented 6 months ago

> Good news everyone,
>
> It seems that updating to the latest requests and urllib3 makes it work again: requests==2.31.0, urllib3==2.2.0
>
> Not sure how long it will take eBay to shut it down again, but for the time being it should work.
>
> pip install --upgrade requests urllib3
>
> Let me know if that works for you as well.

It doesn't work for me (RPi3). I used pip install --upgrade requests urllib3 to update, but I still get the web fetching error:

sudo python -m ebAlert links -a https://www.kleinanzeigen.de/s-dreibaum/k0
>> Adding url
<< webpage fetching error for url: https://www.kleinanzeigen.de/s-dreibaum/k0
<< Link and post added to the database

DanielZ3108 commented 6 months ago

It worked with this command in a Proxmox container:

> It doesn't work for me (RPi3). I used pip install --upgrade requests urllib3 to update, but I still get the web fetching error

The bot is currently working again.

Thanks!

vinc3PO commented 6 months ago

Bad News Everyone!!!

As noted by @beleza-pura it seems that ebay is starting a fight against bots.

> I've been investigating a little. Looks like they introduced Akamai Bot Protection. Mainly there are two things keeping the bot from working properly:
>
> ⚠️ FingerprintJS detected:
> https://static.kleinanzeigen.de/static/js/top.yt20r2l2bahn.js
>
> ⚠️ Akamai detected:
> https://www.kleinanzeigen.de/akam/13/435259e7
>
> More information about this bot protection can be found here. My quick workaround was to use the trial version of the ZenRows API service to bypass the anti-bot protection. But since I will have to pay for it in the future, I'll try to figure something out on my own.

In this case, the Akamai bot protection is not the problem, as the requests library can't perform those JavaScript challenges anyway. However, this means they are actively trying to stop us from using bots. It seems that they have invested money and have tools that analyse traffic, learn from it and block what they suspect is a bot. That means it is only a matter of time before the newly updated library gets blacklisted and blocked again.

If you start using selenium or another library that drives a virtual browser, the Akamai bot protection will start learning your bot's behaviour and eventually block it.

What does it mean?

It means that this simple bot will soon be archived. To counter the Akamai bot protection, a much larger project would have to be undertaken, in which the browser activity is randomized to act like a human.

Thank you all!

yamanatoo commented 6 months ago

This currently works. But as you mentioned it might be countered in the future. https://github.com/vinc3PO/ebayKleinanzeigenAlert/pull/41

tchleb commented 2 months ago

Since yesterday at 12:20 I have been getting the "webpage fetching error for url" error. I tried #41 but it doesn't work. @yamanatoo, does your fix still work for you?

This is the error message:

Starting Ebay alert
Processing link - id: 1 - link: https://www.kleinanzeigen.de/s-test/k0
2024-06-08 21:07:10,903 - get_session in ebAlert.crud.base - ERROR - Message: session not created: Chrome failed to start: exited normally.
(session not created: DevToolsActivePort file doesn't exist)
(The process started from chrome location /snap/chromium/2873/usr/lib/chromium-browser/chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
Stacktrace:
0 0x55f7bce8e63a
1 0x55f7bcb8f65c
2 0x55f7bcbc3c95
3 0x55f7bcbbff8f
4 0x55f7bcc099a4
5 0x55f7bcbfd313
6 0x55f7bcbcd586
7 0x55f7bcbcdefe
8 0x55f7bce57b7f
9 0x55f7bce5bd0a
10 0x55f7bce459dc
11 0x55f7bce5c491
12 0x55f7bce2b7ee
13 0x55f7bce7dc28
14 0x55f7bce7de36
15 0x55f7bce8d6f1
16 0x7f174c46eac3

ERROR:ebAlert.crud.base:Message: session not created: Chrome failed to start: exited normally.
(session not created: DevToolsActivePort file doesn't exist)
(The process started from chrome location /snap/chromium/2873/usr/lib/chromium-browser/chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
Stacktrace:
0 0x55f7bce8e63a
1 0x55f7bcb8f65c
2 0x55f7bcbc3c95
3 0x55f7bcbbff8f
4 0x55f7bcc099a4
5 0x55f7bcbfd313
6 0x55f7bcbcd586
7 0x55f7bcbcdefe
8 0x55f7bce57b7f
9 0x55f7bce5bd0a
10 0x55f7bce459dc
11 0x55f7bce5c491
12 0x55f7bce2b7ee
13 0x55f7bce7dc28
14 0x55f7bce7de36
15 0x55f7bce8d6f1
16 0x7f174c46eac3

<< Ebay alert finished