ropensci / weathercan

R package for downloading weather data from Environment and Climate Change Canada
https://docs.ropensci.org/weathercan
GNU General Public License v3.0

Amount of data accessible with this pkg? #80

Closed sckott closed 5 years ago

sckott commented 5 years ago

👋 as part of preparing an rOpenSci annual report, we're trying to estimate the amount of data the various pkgs in our suite provide access to.

Do you have a sense for how much data (e.g., in GB) one can access through this pkg? And which metric is most relevant for this data?

steffilazerte commented 5 years ago

An extremely rough estimate would be ~110 GB (I had fun with this; if you're interested, see http://ropensci.github.io/weathercan/articles/articles/data.html). Considering this is text data (so small), I think that's pretty impressive!

As for a metric, perhaps years would be the most interesting? Or time in general? For this data, ignoring the climate normals, we have up to 397,671 years' worth of data (summed across different stations and within stations at different time scales).

All of these values are possibly overestimates, as they're calculated from the start and end years of each station, which may not represent whole years and which don't account for gaps in the data record.
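
To illustrate the kind of calculation described above, here is a minimal Python sketch of summing "station-years" from start and end years. It is not the code behind the linked article: the `StationRecord` type, its field names, the inclusive year count, and the sample rows are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StationRecord:
    """One station at one time scale (hypothetical structure)."""
    station_id: int   # made-up identifier
    interval: str     # e.g. "hour", "day", or "month"
    start_year: int   # first year with any data
    end_year: int     # last year with any data


def station_years(records):
    """Sum the year spans across all stations and time scales.

    Each span is counted inclusively, so a partial first or last
    year counts as a whole year and gaps inside the record are
    ignored. This is why the total is an upper bound.
    """
    return sum(r.end_year - r.start_year + 1 for r in records)


# Invented sample rows, not real station metadata.
records = [
    StationRecord(1, "day", 1900, 1909),    # 10 years
    StationRecord(1, "month", 1900, 1904),  # same station, other scale: 5 years
    StationRecord(2, "hour", 1950, 1952),   # 3 years
]

total = station_years(records)
print(total)
```

Because the same station contributes once per time scale, the total can exceed the number of calendar years any single station was open.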

Is this something of what you were looking for?

sckott commented 5 years ago

Thanks very much, that works! We're going for rough estimates here since it's just for informational purposes, thanks for taking the time on this!

What is the earliest year for which data is present? That way we can get a span.

steffilazerte commented 5 years ago

http://ropensci.github.io/weathercan/articles/articles/data.html

Earliest year was 1840 in Toronto! That station didn't close down until 2017! For reference, there were 365 stations which were active prior to 1900. Actually a pretty fun exercise!
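
For anyone wanting to reproduce figures like these from a table of station start years, a rough Python sketch follows. The function names and the sample years are hypothetical, and "active prior to 1900" is assumed here to mean a start year earlier than 1900.

```python
def earliest_year(start_years):
    """Return the earliest start year in the collection."""
    return min(start_years)


def count_active_before(start_years, cutoff):
    """Count stations whose first year of data is before `cutoff`."""
    return sum(1 for year in start_years if year < cutoff)


# Invented start years, not real station metadata.
sample_start_years = [1895, 1899, 1900, 1950]

first = earliest_year(sample_start_years)
n_early = count_active_before(sample_start_years, 1900)
print(first, n_early)
```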

sckott commented 5 years ago

great, thanks very much @steffilazerte ! and i'm glad it was fun 😸