DocNow / diffengine

track changes to the news, where news is anything with an RSS feed
MIT License

unexpected archive.org response: name 'url' is not defined #19

Closed larsschwarz closed 7 years ago

larsschwarz commented 7 years ago

Looking at diffengine.log I noticed the following error:

2017-01-19 15:25:06,606 - root - ERROR - unexpected archive.org response for https://web.archive.org/save/http://www.presseportal.de/pm/58964/3539208: name 'url' is not defined

Opening the same URL in a browser just loads archive.org fine and it returns the saved URL.

Not sure whether this is just a temporary error due to connection speed or similar issues, or a bug in my PhantomJS install.

edsu commented 7 years ago

Thanks for this. Can you tell me what version of diffengine you are using, or when you installed it? There have been a lot of changes in the past few days.

larsschwarz commented 7 years ago

Oops, of course, I left out all the technical info by accident: I cloned it about 30 minutes ago (0.0.31), and I'm running Ubuntu 16.04 with PhantomJS 2.1.1.

larsschwarz commented 7 years ago

It doesn't seem to be sporadic or caused by connection issues; I just experienced the same error again:

2017-01-19 15:59:16,799 - root - ERROR - unexpected archive.org response for https://web.archive.org/save/http://www.presseportal.de/pm/123030/3539388: name 'url' is not defined
2017-01-19 15:59:55,944 - root - INFO - shutting down: new=7 checked=28 skipped=0 elapsed=0:01:18.460580

edsu commented 7 years ago

Ahh, I see it now. It looks like the Internet Archive isn't returning the archived version of the webpage, and then a bug in the logging code raises its own exception, which is where the "name 'url' is not defined" message comes from.
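
To illustrate the kind of failure described here, the following is a minimal sketch and not diffengine's actual code. The function names, the `response` dict, and its `archived_path` key are all hypothetical stand-ins. It shows how referencing an undefined name while building an error message raises a NameError, which a broad `except` then reports in place of the real problem.

```python
def archived_url_buggy(save_url, response):
    """Hypothetical sketch of the bug; returns the message instead of logging it."""
    try:
        path = response.get("archived_path")  # hypothetical key, assumed for illustration
        if path is None:
            # BUG: `url` is not defined in this scope (the intended name is
            # `save_url`), so building this message raises NameError.
            raise ValueError("no archived copy for %s" % url)
        return "https://web.archive.org" + path
    except Exception as e:
        # The broad handler reports the NameError, hiding the real cause.
        return "unexpected archive.org response for %s: %s" % (save_url, e)


def archived_url_fixed(save_url, response):
    """Same sketch with the correct variable name in the error message."""
    try:
        path = response.get("archived_path")
        if path is None:
            raise ValueError("no archived copy for %s" % save_url)
        return "https://web.archive.org" + path
    except Exception as e:
        return "unexpected archive.org response for %s: %s" % (save_url, e)


if __name__ == "__main__":
    save = "https://web.archive.org/save/http://example.com"
    print(archived_url_buggy(save, {}))
    print(archived_url_fixed(save, {}))
```

In the buggy variant the reported text ends with the NameError message rather than the underlying cause, which matches the shape of the log lines above; the fixed variant reports the missing archived copy instead.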

edsu commented 7 years ago

I just released the (hopefully) fixed version to PyPI as v0.0.32. Can you give it a try? You should be able to update with:

pip install --upgrade --process-dependency-links diffengine