Climate-Data-Science / Climate-Similarity-Metrics

Which similarity metrics are the most helpful to understand climate

Create a preprocessing pipeline for combining the data on one level #4

Closed pawelbielski closed 4 years ago

pawelbielski commented 4 years ago

The data for each year is around 200MB, so analyzing multiple years at once would massively increase the computing power and memory we need. It is important to prepare a smaller dataset containing just one of the 37 levels, which would be about 37 times smaller than the original (roughly 5MB per year).

In the existing repository that the climate scientists prepared for us, there already exists a dataset with all the years and averaged levels that is just 18MB. Its description, together with the original data source, can be found in the Readme.md file there. The author of the Readme.md uses the data manipulation tool "cdo", so it might be worth exploring that approach as well.
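A rough sketch of what the cdo approach could look like, in case it helps. It only builds the command lines and does not run them. `sellevel` and `mergetime` are standard cdo operators, but everything else here is an assumption for illustration: the helper names, the `<year>.nc` file naming, the directory names, and the level value 500 (the value has to match how levels are actually stored in the files).

```python
from pathlib import Path


def build_sellevel_command(level, in_file, out_file):
    """Return the argv list for `cdo sellevel,<level> <in> <out>`."""
    return ["cdo", f"sellevel,{level}", str(in_file), str(out_file)]


def build_commands(years, level, in_dir, out_dir):
    """Build one sellevel command per year, then a mergetime command
    that joins the per-year files along the time axis."""
    in_dir, out_dir = Path(in_dir), Path(out_dir)
    commands = []
    per_year_outputs = []
    for year in years:
        # Hypothetical naming scheme: one NetCDF file per year.
        src = (in_dir / f"{year}.nc").as_posix()
        dst = (out_dir / f"{year}_lev{level}.nc").as_posix()
        commands.append(build_sellevel_command(level, src, dst))
        per_year_outputs.append(dst)
    merged = (out_dir / f"all_years_lev{level}.nc").as_posix()
    commands.append(["cdo", "mergetime", *per_year_outputs, merged])
    return commands


if __name__ == "__main__":
    # Print the commands so they can be checked before running anything.
    for cmd in build_commands([1979, 1980], 500, "raw", "out"):
        print(" ".join(cmd))
```

If cdo is installed, each command list could be passed to `subprocess.run(cmd, check=True)`. Selecting the level per year first keeps every intermediate file small, so the final merge never has to touch the full 37-level data.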

Steps:

pawelbielski commented 4 years ago

OK, so there was no need to write a pipeline for that, because you used the cdo tool. Great idea!