edlongman / thescoop

The fastest way to catch up on recent news
thescoop.io
Apache License 2.0

Not enough data for BBC - are there any archives we can scrape? #25

Open jake-patt opened 11 years ago

edlongman commented 11 years ago

Annoyingly, the BBC has no indexed archives, so scraping them would be very difficult and unreliable.

jake-patt commented 11 years ago

We might have to set a limit on the date range then (maybe 2 months). Adding future news sources is going to be a pain otherwise.

Taiiwo commented 11 years ago

I really think we should start logging as many news feeds as we can as soon as possible. Even if we had a year of the BBC logged, it still wouldn't be usable if we're only logging 4 feeds.
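As a rough illustration of what logging a feed could look like, here is a minimal Python sketch. It is not the site's existing PHP script; the function names, the in-memory `archive` dict, and the sample feed are all assumptions made up for the example. It parses RSS 2.0 text and keeps only items it has not seen before, so polling the same feed repeatedly does not create duplicates.

```python
import xml.etree.ElementTree as ET


def parse_rss_items(xml_text):
    """Return a list of dicts (guid, title, link, pub_date) from RSS 2.0 text."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        link = item.findtext("link", default="").strip()
        # Fall back to the link when a feed omits <guid>.
        guid = item.findtext("guid", default="").strip() or link
        items.append({
            "guid": guid,
            "title": item.findtext("title", default="").strip(),
            "link": link,
            "pub_date": item.findtext("pubDate", default="").strip(),
        })
    return items


def log_new_items(items, archive):
    """Add unseen items to `archive` (a dict keyed by guid); return how many were new."""
    added = 0
    for entry in items:
        if entry["guid"] and entry["guid"] not in archive:
            archive[entry["guid"]] = entry
            added += 1
    return added


# Hypothetical sample feed, used only to exercise the functions above.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel><title>Example feed</title>
<item><title>Story A</title><link>http://example.com/a</link><guid>a-1</guid><pubDate>Mon, 12 Aug 2013 10:00:00 GMT</pubDate></item>
<item><title>Story B</title><link>http://example.com/b</link><pubDate>Mon, 12 Aug 2013 11:00:00 GMT</pubDate></item>
</channel></rss>"""

if __name__ == "__main__":
    archive = {}
    print(log_new_items(parse_rss_items(SAMPLE_FEED), archive), "new items logged")
```

In a real setup the `archive` dict would presumably be a database table, and the feed text would come from fetching each feed URL on a schedule. Both are left out here to keep the sketch self-contained.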

edlongman commented 11 years ago

I can't work on the site for the next 3 weeks, so can you do that?

jake-patt commented 11 years ago

Are we able to do that on your server, Ed? Do we need passwords or anything?

edlongman commented 11 years ago

Tyler knows how the PHP script works, so he can do that. No, you do not need passwords. Also, only add BBC feeds for now.