BCCN-Prog / weather_2016

For the BCCN 2016 advanced programming project

Scraper sanity checks don't allow forecasts more than 20 days in the future #59

Closed denisalevi closed 8 years ago

denisalevi commented 8 years ago

The Accuweather scraper downloads html_pages with data up to 90 days in the future. test_scraper_output.py raises an AssertionError if any key of the daily dictionary is >= 20. Please raise that limit to 90 or make an exception for Accuweather. (link to code)
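
A minimal sketch of what a per-provider limit could look like, assuming the keys of the daily dictionary are integer day offsets. The names `check_daily_keys`, `DEFAULT_MAX_DAYS` and `MAX_DAYS_BY_PROVIDER` are hypothetical and not taken from test_scraper_output.py; only the limits 20 and 90 come from this issue.

```python
# Hypothetical names; only the limits 20 and 90 come from this issue.
DEFAULT_MAX_DAYS = 20
MAX_DAYS_BY_PROVIDER = {"accuweather": 90}


def check_daily_keys(daily, provider):
    """Raise AssertionError if a daily key is outside the provider's range.

    Assumes the keys of `daily` are integer day offsets (0 = today).
    """
    # Fall back to the default limit for providers without an exception.
    limit = MAX_DAYS_BY_PROVIDER.get(provider, DEFAULT_MAX_DAYS)
    for day in daily:
        assert isinstance(day, int), f"daily key {day!r} is not an int"
        assert 0 <= day < limit, (
            f"{provider}: daily key {day} outside [0, {limit})"
        )
```

With this shape, other providers keep the current limit of 20 and only Accuweather is allowed keys up to 89.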

clauslang commented 8 years ago

The question is, do we really want forecasts that far in advance? It's going to blow up our database and the run time for integrating the scraped data. Maybe we should only hand as many days to the DB as the site with the shortest forecasting period provides (as only this is relevant for comparative analysis - or are there other analysis goals for which we would need that much data?). I think that would be 10, which also corresponds to the example dictionary in the wiki.

clauslang commented 8 years ago

Let's discuss this in the next session. But yeah, if we restrict the number of days of prediction, this should probably happen on the DB side when integrating.
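
If the restriction ends up on the DB side, it could be sketched roughly as below. This assumes the daily dictionary is keyed by integer day offsets; `common_horizon` and `truncate_daily` are hypothetical helpers, and the horizon of 10 days is only the guess from the comment above, not a confirmed value.

```python
def common_horizon(horizons):
    """Return the shortest forecast horizon (in days) across all providers.

    `horizons` maps a provider name to the number of days it forecasts.
    """
    return min(horizons.values())


def truncate_daily(daily, max_days):
    """Return a copy of `daily` keeping only day offsets below `max_days`.

    Assumes the keys of `daily` are integer day offsets (0 = today).
    """
    return {day: values for day, values in daily.items() if day < max_days}


# Usage with made-up providers and horizons, for illustration only.
horizons = {"site_a": 10, "site_b": 90}
daily = {day: {} for day in range(90)}
trimmed = truncate_daily(daily, common_horizon(horizons))
```

The scraper output stays untouched this way, so a longer horizon could still be stored later if another analysis needs it.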