Closed twest820 closed 1 year ago
Thanks @twest820 for the outstanding code review and pointing potentially some issues. Some of them shall be addressed back to original code of Landsberg and Waring some of them indeed need to be cross-checked with 3-PG mix.
Would be great if there will be a volunteer who can implement those suggestions and corrections.
A quick response from my side:
Based on the excel version downloaded here https://sites.google.com/site/davidforresterssite/home/projects/3PGmix, r3PG have the same implementation and coefficients:
If CanopyVolumefraction(layer) < 0.01 Then
lambdaH(speciescounter) = 0.8285 + ((1.09498 - 0.781928 * kLSweightedave(layer)) * 0.1 ^ (0.01)) - 0.6714096 * 0.1 ^ (0.01)
Else
lambdaH(speciescounter) = 0.8285 + ((1.09498 - 0.781928 * kLSweightedave(layer)) * 0.1 ^ (CanopyVolumefraction(layer))) - 0.6714096 * 0.1 ^ (CanopyVolumefraction(layer))
End If
Yes, correct. There is a room for improvement, as you also mentioned in #68
Fortran output is a four dimensional array. Fortran is not communicating the names back to R and therefore the names are assigned in R transf.out . Therefore, there is no real need to declare the unused variables in Fortran.
good suggestion.
good suggestion
There are automatic tests run under the testthat
, but agree would be nice to have better automatic testing with excel.
Much room for improvement.
Tod can you please write the new comment instead of extending the initial one, this would be easier to track for me. Thanks
! Production --------------- output(ii,:,6,2) = npp_f(:)
This is because it is assigned NPP
values for further calculation https://github.com/trotsiuk/r3PG/blob/8d378441174a86c868071b33ac7d6d408d545c8d/pkg/src/md_3PG.f95#L431
! Mortality --------------- output(ii,:,8,1) = biom_tree_max(:)
For me seems to be correct and is a time variant variable
library(r3PG)
out_3PG <- run_3PG(
site = d_site,
species = d_species,
climate = d_climate,
thinning = d_thinning,
parameters = d_parameters,
size_dist = d_sizeDist,
settings = list(light_model = 2, transp_model = 2, phys_model = 2,
height_model = 1, correct_bias = 0, calculate_d13c = 0),
check_input = TRUE, df_out = TRUE)
library(dplyr)
library(ggplot2)
sel_var <- c('biom_tree_max')
out_3PG %>%
filter( variable %in% sel_var ) %>%
ggplot( aes(date, value, color = species) ) +
geom_line() +
facet_wrap(~variable, scales = 'free') +
theme_classic()
Third, this probably isn't really an i_write_out.h issue
answered in previous comment https://github.com/trotsiuk/r3PG/issues/69#issuecomment-1048202251
There are a couple issues with this.
Yes, thanks for this. The issue was that I was checkin for NA
in all columns instead of the subset (other assumptions which you suggested were already included before). https://github.com/trotsiuk/r3PG/blob/7df6479c690515937089a11816ab1a61e7d3dae2/pkg/R/prepare_climate.R#L49
Thanks, DONE
Thanks, DONE
s_sizeDist_correct() declares but does not initialize CVdbhDistribution, CVwsDistribution, wsWeibullScale, wsWeibullShape, and wsWeibullLocation. It also does not zero them in these two branches:
For me everything works well, can you please provide reproducible example:
library(r3PG)
out_3PG <- run_3PG(
site = d_site,
species = d_species,
climate = d_climate,
thinning = d_thinning,
parameters = d_parameters,
size_dist = d_sizeDist,
settings = list(light_model = 2, transp_model = 2, phys_model = 2,
height_model = 1, correct_bias = 0, calculate_d13c = 0),
check_input = TRUE, df_out = TRUE)
library(dplyr)
library(ggplot2)
sel_var <- c('DrelBiaspFS', 'DrelBiasBasArea')
out_3PG %>%
filter( variable %in% sel_var ) %>%
ggplot( aes(date, value, color = species) ) +
geom_line() +
facet_wrap(~variable, scales = 'free') +
theme_classic()
assume age(, :) and the total basal_area are greater than zero. This is not necessarily the case, may contribute to https://github.com/trotsiuk/r3PG/issues/52,
Can you please provide a reproducible example as discussed in #52
As the necessary state management code simply isn't present I think md_3PG.f95 itself is probably adequate as a repex (I added those lines in my branch the same day as noting the issue here, which reasonably also qualifies). In my view, at least, the issue really is that timestep state management isn't object oriented. While I've moved state into its own class (and also fixed another occurrence of this particular issue with t_n
) I haven't refactored into a state.ClearSizeDist()
kind of member function. I probably should.
If the package tests did repeated runs on the r_vba_compare examples in principle they could pick up this particular issue—that's where my unit tests found it—but I suspect the call pattern through R introduces additional zeroing which masks the Fortran-level omissions. Since it's most likely a latent defect that's why it's noted in this list of minor miscellany rather than broken out as its own issue.
Hi Tom,
Yes, r3PG
eventually diverged from eh excel 3PG mix
due to implementations and further development. Therefore, I'm not sure if the further comparison with the excel version will be valuable. We also, during the previous tests, noted that some calculations (e.g calculation average temperature from minimum and maximum) will give a different results between R/Fortran and Excel (on the >6th digit) this will eventually result also in diverge calculations.
Filing this to capture a few low priority things I've noticed. These can be gotten to as someone has time. I'll add to this issue if I encounter other things
md_3PG.f95
At initialization, the lines
assume
age(, :)
and the total basal_area are greater than zero. This is not necessarily the case, may contribute to #52, and causes the initial volume_mai to be infinite for Liquidambar formosana in mixtures_other.silently overwrites parameter settings from within
s_3PG_f()
. This is unlikely to be expected and should be done in parameter input validation rather than at simulation runtime.The 3-PGpjs manual says
CoeffCond
has units of mbar, the 3-PGmix manual says mBar⁻¹, and the code omits documenting the units. Sincef_vpd
= exp(-CoeffCond
*VPD_sp
) I believe the 3-PGmix manual is correct.lambda_h
is calculated asbut the corresponding coefficients in Equations 5a and 5b of Forrester et al. 2014 are 0.8260 + (1.1698 − 0.9221 kLSweightedave) 0.1 canopy_vol_frac - 0.6703 * 0.1 * canopy_vol_frac + 0.0011 1.0807 solarAngle. Equations A21 and A22 current 3-PGmix manual have the same values as Forrester et al. It therefore looks very much like the regression has been slightly reparameterized since 2014 but corresponding updates weren't made to the 3-PGmix manual and the comments in the code.
f_gamma_dist()
hasIt's a negligible difference but it's unclear why this is 2.718282d0 ** (-x) instead of just Exp(-x).
rhoMin
andrhoMax
are misnamed and are more correctly calledrho0
andrho1
, respectively. This is an error in the 3-PGpjs manual, which naively assumes young wood (rho0
) is less dense than old wood (rho1
) even though this depends growth rates and is not the case for juvenile wood of quite a few species. Since, correctly, no check is made thatrhoMin
<rhoMax
the code should probably follow 3-the PGmix manual in usingrho0
andrho1
.water_runoff_polled
iswhich looks like a typo for
water_runoff_pooled
as polling and pooling are definitely different things. It's also unclear where this calculation comes from as it's not commented in the code and doesn't yield search hits in either the 3-PGpjs or 3-PGmix manuals.It's unclear what, other than numerical error, might cause the check
to be true since
mort_manag
andmanagementInputs(,3:5,)
are both fractions.md_decl_const.f95
The array
daysInMonth
is currently declared asand then used in light, transpiration, and δ13C GPP calculations. This is a good zero order approximation but, as February is hard coded to 28 days and there are no leap year checks, an off by one day error is introduced every leap February. From a look through the code the result is overall estimates of incoming PAR and evapotranspiration are biased low by approximately 0.07% across any given four year period (both 3-PGpjs and 3-PGmix ends assuming there's no solar radiation or ET on February 29th).
This seems likely to be a minor enough structural error to be unimportant to overall model accuracy and I presume fixing this would be problematic to backwards compatibility. However, it still seems worth capturing as an issue.
i_write_out.h
gets named npp in R rather than npp_f. npp and npp_f differ by biom_debt_foliage so the difference between the two is often substantial in deciduous leaf on months. (Otherwise, they appear to be identical.)
tends to return a time invariant value. Since biom_tree_max is calculated from the (usually) time varying stems_n_ha this is somewhat suspicious.
Third, this probably isn't really an i_write_out.h issue but, for want of a better header to put it under, the data frame returned by
run_3PG()
contains numerous unnamed variables whose names mostly appear to align with four dimensional array columns not assigned in i_write_out.h. These includevar_1_11
tovar_1_15
var_2_14
andvar_2_15
(present twice)var_3_12
tovar_3_15
(14 and 15 present twice)var_4_12
tovar_4_15
var_5_13
tovar_4_15
var_6_14
andvar_6_15
var_7_14
andvar_7_15
(present twice)var_8_5
tovar_8_15
(but notvar_8_6
),var_9_5
tovar_9_15
var_10_15
(present twice)I haven't checked if any of these contain actual data but it seems probably worth excluding unused variables from the data frame and giving informative names to any which do include data.
prepare_climate.R
In prepare_climate.R there is the check
There are a couple issues with this.
prepare_climate()
recreates and fills the column a few lines later but, strictly speaking, it should be acceptable for tmp_ave to contain imputable NAs. The fix here is modify the check and then usemutate(tmp_ave = replace_na(tmp_ave, ...))
.settings$calculate_d13c = FALSE
is passed torun_3PG()
). The workaround is again to drop the d13catm column if it's not needed.s_sizeDist_correct()
declares but does not initializeCVdbhDistribution
,CVwsDistribution
,wsWeibullScale
,wsWeibullShape
, andwsWeibullLocation
. It also does not zero them in these two branches:From a very brief look, it appears this may result in these five outputs containing whatever was in uninitialized memory when they're returned to R. This is most obvious when bias correction is disabled.
prepare_input.R
Description of
vpd_day
is that it's frost days per month.prepare_sizeDist.R
In prepare_sizeDist.R this should be
stop( 'First column name of the size_dist table must correspond to: parameter' )
.r3PG_VBA_compare.R
I haven't done a comprehensive analysis of this but I've noticed the values output from the current Fortran code diverge somewhat from the reference values in r3PG_input.xls. Due to the lack of comments in the test code it's unclear if this is expected behavior but, given the importance assigned to backwards compatibility, this may merit closer attention even though the changes are, presumably, intentional. The current test code generates a .png for visual comparison but it may be worth considering implementing automated checks. It looks like r3PG_input.xls may be excised from longer runs as it shows starting biom_foliage_debt which doesn't quite match the initial biom_foliage values specified.
63 repros in mixtures_eu with European beech volume_cum dropping from 138.53876 to 138.19902 in October 2003. Testing therefore could have detected #63 prior to the 0.1.3 release but, since the code necessary to do so didn't exist and wasn't added when #63 was fixed, there remains a test hole here.
In general, values aren't sanity checked for range. So overly negative, overly positive, infinite, and NA outputs will not be detected, both internally (due to the lack of assertions) and in the output returned to R (due to the necessary R test code not being present).
examples
The Weibull fitting examples for 3-PGmix bias correction are missing from r3PG. A partial workaround is to start from the ones distributed with 3-PGmix but it seems likely most people will find the code there difficult to adapt to their datasets.