Closed ejsimley closed 2 years ago
Merging #172 (9ead7a0) into develop (c038ca5) will decrease coverage by
1.30%
. The diff coverage is52.34%
.
@@ Coverage Diff @@
## develop #172 +/- ##
===========================================
- Coverage 70.77% 69.46% -1.31%
===========================================
Files 23 24 +1
Lines 1591 1716 +125
===========================================
+ Hits 1126 1192 +66
- Misses 465 524 +59
Impacted Files | Coverage Δ | |
---|---|---|
operational_analysis/types/reanalysis.py | 36.36% <28.00%> (-6.07%) |
:arrow_down: |
...tional_analysis/toolkits/reanalysis_downloading.py | 58.25% <58.25%> (ø) |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update c038ca5...9ead7a0. Read the comment docs.
From a scientific point of view this looks great, @ejsimley! I have no issues to report.
The only discussion point I have is whether it makes more sense to add a function (or include this in the reanalysis_downloading) to calculate wind direction and air density instead of keeping them as a necessary step to have in the project script. But totally a very minor point, I am fine either way!
@RHammond2, @nbodini the latest commits should address your comments from back in August. I ended up revising quite a bit, so let me know if this looks good or if there are any more changes you'd like to make.
Rob: In addition to your inline comments, I added leading underscores to some of the internal/helper function in the reanalysis toolkit and made a unit test file for the reanalysis downloading toolkit to test a couple of the helper functions.
Nicola: Based on your suggestion, I added the option to automatically derive wind speed and direction as well as air density from the available reanalysis variables. There's an option for this in the reanalysis_downloading
toolkit function download_reanalysis_data_planetos
and also a method to compute these variables in the reanalysis
data type class, called compute_derived_variables
.
I made another change based on some feedback from Patrick Duffy, who did some "beta testing" of the toolkit. Instead of requiring the user to provide a dictionary mapping the PlanetOS variable names to desired variable names, I added the argument var_names
to download_reanalysis_data_planetos
so users can just specify the PlanetOS variable names if they don't care about renaming them.
Thanks for your hard work on this, Eric, this looks great.
Side note: I also got the same little fail in the test for check_simulation_results_gbm_daily while I was working on the AEP filter PR.. not sure what this was caused by, but I don't think it's related to your PR specifically.
Hey @RHammond2, I addressed your last comments. Let me know if you think this looks good or if any of the additions should be modified.
Triggering CI/CD pipeline
This pull request adds a module for downloading reanalysis data in the toolkits directory, called reanalysis_downloading. Currently, this module supports downloading MERRA-2 and ERA5 data using the PlanetOS API, but some documentation is provided explaining how to download data directly from NASA and Copernicus too. Documentation in the module explains how to obtain a PlanetOS API key, which is needed to use the module, as well. A method has also been added to the reanalysis data type class that allows loading data using the PlanetOS API (via the reanalysis_downloading toolkit module) in addition to loading from CSV files.
Within the reanalysis_downloading module, the main function used to download data is called "download_reanalysis_data_planetos". It returns a DataFrame containing the downloaded data and optionally saves the data as a csv file. It's arguments include:
Note: before merging this we'll ask the PlanetOS team and NREL to review the documentation as well.
Here are some examples of using the function with the default atmospheric variables and without saving a CSV file, highlighting different ways of specifying the date range:
First, downloading the default date range, which is 20 years up to the end of the most recent full month:
Next, we'll download data for the date range from 2000-01-01 to 2005-07-15:
If we only specify the start date, it will download "num_years" of data starting on that date. Note, we're asking for MERRA-2 data now:
Similarly, if we only specify the end date, it will download "num_years" of data ending on that date.
Lastly, requesting 6 years, which by default will be the 6 years up to the end of the most recent full month: