DocNow / diffengine

track changes to the news, where news is anything with an RSS feed
MIT License

A couple of README updates. #2

Closed: ruebot closed 7 years ago

ruebot commented 7 years ago

Fixed a couple links, and added some.

I was going to add an N.B. about symlinking diff.html if you encounter `FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/lib/python3.5/dist-packages/diff.html'`, but maybe I'll just throw myself at the code and help out if you want.
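For anyone hitting that error before a proper fix lands, the symlink workaround mentioned above could be sketched roughly like this. It is only an illustration: `ensure_symlink` is a hypothetical helper, not part of diffengine, and where your copy of `diff.html` actually lives is an assumption you would need to check on your own install.

```python
import os
import tempfile


def ensure_symlink(source, target):
    """Create a symlink at `target` pointing to `source`.

    Returns True if a link was created, False if something already
    exists at `target` (in which case nothing is touched).
    Hypothetical helper for illustration; not part of diffengine.
    """
    if os.path.lexists(target):
        return False
    os.symlink(source, target)
    return True


if __name__ == "__main__":
    # Demonstration in a throwaway directory so no real install is modified.
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "diff.html")    # stand-in for the real template
        target = os.path.join(tmp, "linked.html")  # stand-in for the missing path
        with open(source, "w") as fh:
            fh.write("<html></html>")
        print(ensure_symlink(source, target))  # first call creates the link
        print(ensure_symlink(source, target))  # second call is a no-op
```

On a real system the `target` would be the path from the traceback, and `source` would be wherever the `diff.html` template sits in your checkout; writing into `dist-packages` usually needs elevated permissions.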

edsu commented 7 years ago

I'll fix the diff.html, I think I know what the problem is. I just need the time to do it :-)

That's awesome that you got so many working! Did you see how the archive URLs for the Toronto Star don't appear to be working? It looks like they may have some JavaScript that detects when a newer version of the page is available. Is that possible?

ruebot commented 7 years ago

Torstar: yeah, I noticed that. I've been trying to figure out what they're doing. Torstar and the Globe & Mail do some really interesting things with their sites, which are starting to get surfaced here. Older newspapers transitioning to digital platforms is a really interesting area.

Somewhat related, I think the feed is similar to wapo's insofar as we are seeing articles being created before they are actually "published". At least that's my initial suspicion.

...I have a few more in the works too. If they start tweeting, I'll add them.

...also, I have Thursday and Friday blocked off as research days. This code base really fascinates me, so I'm going to read it really closely. Happy to help out where I can if you need a hand, but I also don't want to get in the way. ...and I'm supposed to be working on extending this: https://github.com/ukwa/webarchive-discovery :smile: