cvandeplas / pystemon

Monitoring tool for PasteBin-alike sites written in Python. Inspired by pastemon http://github.com/xme/pastemon
GNU Affero General Public License v3.0

Pastebin Pro Feed Crashing #94

Closed timhux123 closed 5 years ago

timhux123 commented 5 years ago

Is anyone else having issues with their Pastebin Pro account? Everything worked for a few weeks, and then I noticed AIL was no longer receiving any pastes from my Pastebin Pro account. Pastes from other sites (slexy.org, kpaste.net, codepad.org, gist.github.com) are downloading successfully. I have triple-checked that my IP is whitelisted on Pastebin's site.

My pastebin pro configuration in pystemon.yaml:

    pastebin.com_pro:
      archive-url: 'https://scrape.pastebin.com/api_scraping.php?limit=250'
      archive-regex: '"key": "(.+)",'
      download-url: 'https://scrape.pastebin.com/api_scrape_item.php?i={id}'
      public-url: 'https://pastebin.com/raw/{id}'
      update-max: 50
      update-min: 40
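To make the roles of these settings concrete, here is a minimal sketch of how archive-regex and download-url presumably fit together: the regex pulls paste keys out of the listing returned by archive-url, and each key is substituted for {id}. The sample listing and the function name paste_urls are invented for illustration; this is not pystemon's actual code and not real API output.

```python
import re

# Values copied from the configuration above.
ARCHIVE_REGEX = r'"key": "(.+)",'
DOWNLOAD_URL = "https://scrape.pastebin.com/api_scrape_item.php?i={id}"

# Invented sample shaped like a pretty-printed JSON listing (one field per line).
# It is NOT real output from the scraping API.
SAMPLE_LISTING = '''[
    {
        "key": "AAAA1111",
        "size": "120"
    },
    {
        "key": "BBBB2222",
        "size": "64"
    }
]'''


def paste_urls(listing):
    """Extract paste keys with the archive regex and build download URLs."""
    ids = re.findall(ARCHIVE_REGEX, listing)
    return [DOWNLOAD_URL.format(id=paste_id) for paste_id in ids]


for url in paste_urls(SAMPLE_LISTING):
    print(url)
```

Note that the greedy (.+) only isolates a single key because "." does not match newlines and each key sits on its own line in the sample; on a listing returned as one long line, the same pattern would over-match.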

The following errors repeat until the retry count reaches 100, and then the thread crashes. It eventually recovers on its own but crashes again after another 100 tries.

    [2018-10-23 21:15:08,671] Failed to download the page because of other HTTPlib error proxy error https://scrape.pastebin.com/api_scraping.php?limit=250 trying again.
    [2018-10-23 21:15:08,671] Retry 99/100 for https://scrape.pastebin.com/api_scraping.php?limit=250
    [2018-10-23 21:15:08,718] Failed to download the page because of other HTTPlib error proxy error https://scrape.pastebin.com/api_scraping.php?limit=250 trying again.
    [2018-10-23 21:15:08,719] Retry 100/100 for https://scrape.pastebin.com/api_scraping.php?limit=250
    [2018-10-23 21:15:08,875] Thread for pastebin.com_pro crashed unexpectectly, recovering...: 'NoneType' object has no attribute 'text'

Here is the error when running "./pystemon.py -v":

    [2018-10-23 21:34:46,930] Retry 99/100 for https://scrape.pastebin.com/api_scraping.php?limit=250
    [2018-10-23 21:34:46,930] Downloading url: https://scrape.pastebin.com/api_scraping.php?limit=250 with proxy: None and user-agent: None
    [2018-10-23 21:34:47,039] Failed to download the page because of other HTTPlib error proxy error https://scrape.pastebin.com/api_scraping.php?limit=250 trying again.
    [2018-10-23 21:34:47,039] Retry 100/100 for https://scrape.pastebin.com/api_scraping.php?limit=250
    [2018-10-23 21:34:47,453] Thread for pastebin.com_pro crashed unexpectectly, recovering...: 'NoneType' object has no attribute 'text'
    [2018-10-23 21:34:47,464] Traceback (most recent call last):
      File "./pystemon.py", line 127, in run
        last_pasties = self.get_last_pasties()
      File "./pystemon.py", line 147, in get_last_pasties
        htmlPage = response.text
    AttributeError: 'NoneType' object has no attribute 'text'
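The traceback suggests that the download step returned None once all 100 retries had failed, and the caller then read .text from it. A minimal sketch of a guard against that case follows. The names fetch_archive_text and download are hypothetical stand-ins, not pystemon's actual functions, and the sketch assumes the downloader returns None on failure.

```python
import logging

logger = logging.getLogger("sketch")


def fetch_archive_text(download, url):
    """Return the page body, or None when the download gave up.

    `download` is a hypothetical callable standing in for whatever performs
    the HTTP request. It is assumed to return an object with a `.text`
    attribute on success and None after all retries have failed.
    """
    response = download(url)
    if response is None:
        # Skip this polling cycle instead of raising AttributeError.
        logger.warning("No response for %s after retries; skipping this cycle", url)
        return None
    return response.text
```

With a check like this, a blocked or unreachable site would produce a warning and an empty cycle rather than crashing the thread. It does not fix the underlying connectivity problem.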

rommelfs commented 5 years ago

Hi there,

have you tried accessing the URL below from the same machine where pystemon is running?

curl "https://scrape.pastebin.com/api_scraping.php?limit=250"

Pastebin recently enabled IPv6. If your machine is dual-stack, you might see what the problem is after running the command above.
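To see which address families a host name resolves to (on a dual-stack machine the connection may go out over IPv6 while only an IPv4 address is whitelisted), a small Python sketch like the following can help. The helper name group_by_family is made up for this example; socket.getaddrinfo is the standard library call that supplies its input.

```python
import socket


def group_by_family(addrinfo):
    """Group socket.getaddrinfo() results into IPv4 and IPv6 address lists."""
    grouped = {"IPv4": [], "IPv6": []}
    for family, _type, _proto, _canonname, sockaddr in addrinfo:
        if family == socket.AF_INET:
            label = "IPv4"
        elif family == socket.AF_INET6:
            label = "IPv6"
        else:
            continue  # ignore any other address family
        address = sockaddr[0]  # first element of sockaddr is the IP address
        if address not in grouped[label]:
            grouped[label].append(address)
    return grouped
```

On the pystemon host, passing the result of socket.getaddrinfo("scrape.pastebin.com", 443, proto=socket.IPPROTO_TCP) to this function lists the IPv4 and IPv6 addresses the resolver returns. If IPv6 addresses appear, it is worth checking which source address Pastebin actually sees.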

I hope this helps, Sascha

timhux123 commented 5 years ago

Thanks for the info, rommelfs.

I'm closing this issue, as the problem is between my IP address and Pastebin. It appears Pastebin is blocking my IP address even though it's whitelisted on their site. I confirmed this by whitelisting a different IP address, which is not being blocked, and pastebin_pro data is now being downloaded successfully.

cvandeplas commented 5 years ago

thanks for the feedback

timhux123 commented 5 years ago

FYI - Here is the response I received back from Pastebin concerning my whitelisted IP which was blocked:

"Try now, we've removed the block. But make sure you only hit the scraping API end points."

I'm able to download pastes using my Pro account now that the block has been removed.