PecanProject / pecan

The Predictive Ecosystem Analyzer (PEcAn) is an integrated ecological bioinformatics toolbox.
www.pecanproject.org
Other
202 stars 234 forks source link

Bugfixes for `model2netcdf.ED2()` #2985

Closed Aariq closed 2 years ago

Aariq commented 2 years ago

Description

Went down a bit of a rabbit hole on this one...

Fixes several bugs related to model2netcdf.ED2(). The end goal is to fix #2974, but ended up having to deal with other issues along the way.

Motivation and Context

model2netcdf.ED2() was broken

Review Time Estimate

Types of changes

Unsure if new NULL defaults for mstmipvar args will cause problems anywhere else

Checklist:

Should I add a NEWS file to PEcAn.ED2? Should I increment the version of PEcAn.ED2?

Aariq commented 2 years ago

NOTE: this adds a 16mb file for testing (replacing a ~5mb one) I don't think I can get this file any smaller (this is the size even when ED2 simulation is run for a week) and it is needed for running tests.

Aariq commented 2 years ago

Warnings from automated tests look like they're due to 3rd edition of testthat. I'll make that explicit in DESCRIPTION and work on those warnings.

infotroph commented 2 years ago

this adds a 16mb file for testing (replacing a ~5mb one) I don't think I can get this file any smaller

@Aariq (1) is this file already compressed, and if not could it be? (2) Can you say more about why the previous 5-MB file doesn't work for the updated test? CRAN has a pretty hard limit of 5 MB for the entire package, so if test cases really need to be this large then in the long run we'll likely need to find ways to store them outside the package.

Aariq commented 2 years ago

this adds a 16mb file for testing (replacing a ~5mb one) I don't think I can get this file any smaller

@Aariq (1) is this file already compressed, and if not could it be? (2) Can you say more about why the previous 5-MB file doesn't work for the updated test? CRAN has a pretty hard limit of 5 MB for the entire package, so if test cases really need to be this large then in the long run we'll likely need to find ways to store them outside the package.

Compressed it's only 473 kb, so I'll give that a try. In fact, it will simplify things to have all the output files in a .zip, then uncompress them in a tempdir for testing, then remove.

The previous file was from an ED2 run with only a single PFT. The bug I was attempting to fix was specific to output from runs with multiple PFTs. It is maybe possible to testread_E_files() and put_E_values() with an "-E-" file with 2 PFTs (334 kb) but test read_T_files() and put_T_values() with a smaller "-T-" file (old one ~5mb) from a different run. The _T_ functions don't do anything with PFTs, I think, based on their arguments. I imagine it might cause problems (or if it doesn't, it probably should) with testing the overall model2netcdf.ED2() function though if the output files were not all from the same run. I don't know enough about the contents of the .h5 files to know if they can be stripped down and still relevant for testing.

Aariq commented 2 years ago

The failing checks are for undocumented datasets that I didn't touch. I have no idea what they are so I'm not planning on addressing these warnings in this PR.

Aariq commented 2 years ago

Marking as a draft again because although bugs are fixed, I'm not sure model2netcdf.ED2 is working as intended. I can't figure out how to read in results separately by PFT. For example, in an output .nc file ncvar_get(x, "NPP") returns a single long vector, even with multiple PFTs.

Aariq commented 2 years ago

@dlebauer, I'd like to look at the resulting .nc files with you and figure out if they're correct before merging this PR.

Aariq commented 2 years ago

.nc outputs don't have PFT as a dimension for variables that probably should have that dimension, but I think this is a separate issue for a different PR, so this PR is ready now.

dlebauer commented 2 years ago

all test / check errors are 'Error: These files have changed since checkout. Check for out-of-sync autogenerated files, or for untracked files written inside the working tree'

TODO