Closed: mluerig closed this issue 6 years ago
Moritz,
A few things to try:
Bob
On Apr 20, 2018, at 11:24 AM, Moritz Lürig wrote:
I am using 15-min-interval environmental time series data to model GPP and ER over several consecutive days (a formatted sample is attached). However, I am getting some really weird results. There are these extreme jumps in ER from negative to positive (ER can't be positive, no?), and the predictions for O2-sat also seem off - or is this just a labelling error (see attached plot - 10-11%)?
mod.out.txt: https://github.com/USGS-R/streamMetabolizer/files/1933003/mod.out.txt
mod.data.txt: https://github.com/USGS-R/streamMetabolizer/files/1932967/mod.data.txt
I am not quite sure where to start. I have only been playing around with "mle" models, and I am not even sure if my data is fine the way it is. E.g. light - does it have to be smoothed more (I have been using raw values)? If you have time to take a look or just point me in the right direction, I would really appreciate it.
[do_pred]: https://user-images.githubusercontent.com/15648068/39064767-e896c5b2-44cf-11e8-8e2c-4fc070bbdd52.png
Hi Bob,
thanks for your reply.
1)/4) I really should have said that these data do not come from nature but from a dozen experimental ecosystems/mesocosms (1000 L cattle tanks). For each tank I have five chunks that are each a week long, collected between July and October. These tanks had macrophytes and sometimes a lot of phytoplankton in them. So, for most of the summer they were quite productive and often supersaturated with oxygen. I don't really have any expectations for K600 - it should be more similar to a lake, right?
2)/3) I did use wrong values for DO.sat. However, fixing that did not really change the diel DO changes - I would think that they are just that low (the data I provided is from October).
I had a look at your L&O paper - I am guessing that the relevant equation for my case is eqn 5? Is it somehow integrated into streamMetabolizer? I also thought more about what I should define as a "day" to calculate GPP and ER from, as so far I have only used the standard values (4-28). Would it make sense to align the start/end times with when O2-sat starts to drop and then rise again (e.g., in the case of that sample data, ~8:00)? Of course I would have to adjust those times throughout the season...
I'm not that experienced with calculating ecosystem metabolism, so your thoughts are highly appreciated - thanks very much.
Moritz
Hi Moritz,
Following up on 2)/3): could you share/check your new DO prediction plot against the following? You should be seeing values in that second panel much closer to 100 now that you've updated your DO.sat inputs, and it would help me to see the final data, too.
[image]: https://user-images.githubusercontent.com/12039957/39216352-218f72d0-47d0-11e8-86b2-e16cb3517c2b.png
streamMetabolizer does not have a second term for KO_over (eqn 5 from Bob's L&O paper) because we'd need additional information to tease apart ER from KOover. If the nighttime oversaturation is real, you may simply need to interpret the ER values as being ER minus the rate of O2 pushing its way into the water from bubbles. I'd like Bob's take on the comparability of bubbles from rapids and bubbles from gas getting trapped under macrophytes/algae, too - I can't decide whether they would function the same way in the model or not.
I doubt that changing the time window would have a strong effect on your results, but it shouldn't be too costly to run the model with a few different start times to find out for sure. I'd encourage you to try a few different windows for all the dates before trying to customize the window on each date.
Unsmoothed light shouldn't be a problem, though when I plotted your data, I did notice that the light peaks are occurring ~2 hrs after noon, suggesting to me that your solar.time stamps aren't precisely local solar time. This might be part of why you're needing to think about modifying the time windows?
[image]: https://user-images.githubusercontent.com/12039957/39217346-6b29d85a-47d4-11e8-90c2-09004b2fb5c3.png
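[Editor's note: one way to check that timestamp offset is to compute the clock hour of each day's light maximum; with true solar time these should cluster near 12:00. A sketch, assuming a data frame mod.data with the solar.time and light columns of the attached mod.data.txt:]

```r
# Sketch: clock hour of each day's light peak; with true solar time
# these should sit near 12. Assumes mod.data as in the attached file.
library(dplyr)
mod.data %>%
  mutate(date = as.Date(solar.time),
         hour = as.numeric(format(solar.time, "%H")) +
                as.numeric(format(solar.time, "%M")) / 60) %>%
  group_by(date) %>%
  summarize(peak.hour = hour[which.max(light)])
```

A consistent peak.hour near 14 would match the ~2 hr offset Alison describes.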
Alison
Moritz,
Given that you are in cattle tanks, you can ignore the O_over argument in my L&O paper. It is easy to see why supersaturation occurs in a tank: NEP > G, where G is the gas exchange flux K*(Osat - O).
But working in cattle tanks you have special problems not found in streams. The big one is that unless you are physically mixing the tanks, the O2 will not be uniformly distributed in the tank. Thus it may be difficult to estimate metabolism for the entire tank if this is occurring. Streams are nice for metabolism because we assume they are mixed through any given cross-sectional area. That may not be the case in a tank.
Bob
Hi Alison and Bob,
Thanks for your replies. Here is the updated model input and output:
@aappling-usgs: This is what the updated predictions for the example look like (I previously had entered DO.sat as % and not mg/L). O2 sat looks right now.
I played around with different starting times: in my loop that runs through all the data chunks, I now select the hour of the day before the first sunlight is detected. In most cases this greatly improved the results, especially for ER. However, sometimes I get just NAs as output, and when I then change the start time a little in one direction or the other, it works. I guess I won't get around to specifying start dates manually.
Independent of the start time, sometimes the predicted values are not great (e.g., on days 1, 2 and 4 of this example, the peaks are "chopped off") and I would like to try to improve the fit. Do you think specifying different initial values would help (as you show in the gpp_er_eqs tutorial), or even using a different model?
About solar.time: why is it important to use solar time instead of local time? Shouldn't time really depend on what you define as a day, i.e. the starting and end times?
Another thing: I noticed that the last day in my time series is never modeled, I assume because that day is "not complete" (the hours after midnight needed to complete the window are missing). Is there any way to solve this?
@robohall: I was afraid you were going to say this. I guess there is no way to really account for this? We put the sondes that measured water parameters in the middle of the tanks (most of the bubbles were on the walls). I always assumed that because it is a relatively small volume, local differences might exist but would be negligible compared to the strong diurnal patterns across all tanks.
Cheers, Moritz
Oh, and is there a way to extract a goodness-of-fit measure? This might be a way (for me and in general) to narrow down the daily window selection: one could just pick a range of windows and then decide by the GOF which suits best.
That's interesting that the start times are affecting your results strongly - thanks for sharing that result. I'm curious about the causes of the NAs - you can probably find explanations in the final columns of predict_metab(mm) or get_params(mm) (I assume you're still using MLE observation-error models?).
Peaks being "chopped off" can sometimes be a result of a mismatch between timestamps of the DO and light data. I can vaguely imagine that this phenomenon could also be explained by a bad estimate of K600, or maybe something about the incomplete mixing or very slow gas exchange in the cattle tank, but my thinking on that is pretty hazy right now. Maybe @robohall can do better?
About solar.time - for you, because you have your own light data, it's only important because it determines where in the daily DO cycle the date windows sit. It's more important if people are modeling rather than measuring light, because then light is a function of time. As noted above, the temporal match between the DO and light curves is important to get GPP right.
Yes, the last day in your time series is likely not being modeled because that day is not complete. A 24-hour period is required to model each day. One option, if you're OK using some hours of data twice, is to use an earlier time window for the final day such that you can fit a full 24 hours in before your data run out.
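[Editor's note: the reuse-some-hours option might look like this. A sketch, assuming an MLE model and the attached mod.data table; the specific window values are illustrative, not from this thread:]

```r
# Sketch: fit most days with the usual window, then refit with a window
# shifted 12 h earlier so the final calendar day still spans 24 h.
# Window values are illustrative.
library(streamMetabolizer)
fit_main  <- metab(specs(mm_name('mle'), day_start = 4,  day_end = 28), data = mod.data)
fit_final <- metab(specs(mm_name('mle'), day_start = -8, day_end = 16), data = mod.data)
# combine: keep fit_main's estimates except use fit_final's for the last date
```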
Goodness of fit is not a foolproof approach because of equifinality in metabolism models, but you could compute RMSE of the DO predictions like so:
library(dplyr)
predict_DO(mm) %>%
mutate(err_sq = (DO.mod - DO.obs)^2) %>%
group_by(date) %>%
summarize(RMSE = sqrt(mean(err_sq)))
But remember that for process-error and state-space models, a close match between observed and predicted oxygen concentrations is not the [only] objective; those models also try to match the [DO] rates of change and have different definitions of DO.mod.
Other ways of assessing model output:
- the diagnostic columns returned by the nlm function - see ?nlm for definitions of these columns
- get_fit(mm) for the full suite of metrics

Another thing to consider is priors on K600. I am not sure how you can get negative estimates of K600 with a lognormal prior. For sure set a lognormal prior, but because you are in tanks, K600 should be small.
Here is how I would do it:
Estimate the k600 (m/d) intercept when wind = 0 from all of the windy-lake k literature. Divide by tank depth to get K600. Take the ln of this value as the mean for the lognormal prior. Set the scale large enough to allow a broad range of values, say 0.7 or 1.
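[Editor's note: as a sketch, the arithmetic of this recipe with the numbers that come up later in this thread (k600 intercept ~1 cm/h at zero wind, 1 m tank depth):]

```r
# Sketch of the recipe above, using values discussed later in this thread.
k600    <- 0.01 * 24   # 1 cm/h at zero wind -> 0.24 m/d
K600    <- k600 / 1    # divide by tank depth (1 m) -> 0.24 d^-1
meanlog <- log(K600)   # ~ -1.43: mean of the lognormal prior
sdlog   <- 1           # broad scale, per the suggestion above
```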
Also consider pooling gas exchange among all tanks, regardless of treatment. That would give hundreds of estimates of K. Because GPP and ER are not pooled, they will remain independent and reflect the treatments.
Bob
So, this has been a while - sorry about that.
@robohall I had a look at some literature, and I would guess that K600 is around 1 cm/h for windspeed = 0. The tanks are 1 m deep, so my init.K600.daily would be 1 - is something like that what you had in mind? Pooling K600 - at what point would I do that: after estimating for all the ponds, or also before, as a model option? I also realized that I am working with constant air pressure (950 hPa); I forgot that I put it in as a placeholder and meant to replace it at some point - does pressure affect the models a lot?
@aappling-usgs I focused on the starting times because they had such a big effect on the outcomes. I used a range of different starting times for each week-long chunk: from 1.5 hrs before to 1.5 hrs after the first light measurement, in 0.25 h steps. Thus, for each chunk there are 13 models, from which I then took the output of the three models with the best RMSE and averaged them. This gave me good consistency, but I still don't really trust the values: respiration is always much higher than productivity, which is hard to believe given the trend of overall increasing oxygen saturation throughout the experiment. I have attached predictions and model estimates for one chunk in mid summer that really should be net productive. Out of 13 possible models, 9 ran without producing just NAs. So, in both plots, each facet shows a different starting time for the same data, and the numbers above are the hours before or after sunrise (the first measured light).
Input data: mod.data.txt (https://github.com/USGS-R/streamMetabolizer/files/2080881/mod.data.txt); full model output: mod.out.txt (https://github.com/USGS-R/streamMetabolizer/files/2080885/mod.out.txt)
[stream_metab_do_pred]: https://user-images.githubusercontent.com/15648068/41106646-bb26f86a-6a70-11e8-8f41-3c8017a0cfab.png
[stream_metab_est]: https://user-images.githubusercontent.com/15648068/41106645-bb0be746-6a70-11e8-875d-4c7393255d88.png
Moritz,
Don’t forget to correct for units: 1 cm/h is 0.24 m/d; divide by depth (1 m) and get 0.24 d^-1. log(0.24) = -1.4. So I would pick a lognormal prior (-1.4, 1).
View the prior distribution with:
x <- seq(0.05, 2, 0.05)
plot(x, dlnorm(x, -1.4, 1))
You could safely broaden that prior to dlnorm(0, 1), which might be what I would do to guard against thinking that you have that much resolution to define and measure gas exchange when it is that low.
As for pooling, you could try to analyze all metabolism from all ponds using one model with K pooled across all.
ER >> GPP is weird unless you are adding external C or the tanks have high ER from productivity that occurred before you measure metabolism. But ER is difficult to measure.
Bob
@robohall, Hi Bob,
Thanks for your suggestions. I gave Bayesian models a try and found that they seem to work better with my data (see model specs below). The P:R ratio is now flickering around 1 throughout the experiment, which seems more meaningful than what I had before.
The problem that remains is the "chopped-off peaks" (see plot below, same data as above/before). As @aappling-usgs suggested, this may be due to bad estimates of K600. I used pool_K600='normal', but that's not what you had in mind with "analyze all metabolism from all ponds using one model with K pooled across all", is it? How would I do that?
I am also still struggling with specifying a lognormal prior, which, as I understand it, for Bayesian models works through K600_daily_meanlog_meanlog rather than init.K600.daily, right? So how would I specify your suggested prior of dlnorm(0, 1) for a Bayesian model?
Thanks very much again!
[plot_do_preds]: https://user-images.githubusercontent.com/15648068/42340445-4f695114-8090-11e8-971c-e96b91e015cf.png
mod_name <- mm_name(type='bayes', pool_K600='normal')
mod_specs <- specs(mod_name, day_start=t, day_end=24+t)
mod_fit1 <- metab(mod_specs, data=mod.data)
mod_fit1
metab_model of type metab_bayes
streamMetabolizer version 0.10.9
Specifications:
model_name b_Kn_oipi_tr_plrckm.stan
engine stan
split_dates FALSE
keep_mcmcs TRUE
keep_mcmc_data TRUE
day_start 7
day_end 31
day_tests full_day, even_timesteps, complete_data, pos_discharge
required_timestep NA
GPP_daily_mu 3.1
GPP_daily_lower -Inf
GPP_daily_sigma 6
ER_daily_mu -7.1
ER_daily_upper Inf
ER_daily_sigma 7.1
K600_daily_meanlog_meanlog 2.484906649788
K600_daily_meanlog_sdlog 1.32
K600_daily_sdlog_sigma 0.05
err_obs_iid_sigma_scale 0.03
err_proc_iid_sigma_scale 5
params_in GPP_daily_mu, GPP_daily_lower, GPP_daily_sigma, ER_daily_mu, ER_daily_upper, ER_daily_sigma, K600_daily_meanlog_meanlog, K600_daily_meanlog_sd...
params_out GPP, ER, GPP_daily, ER_daily, K600_daily, K600_daily_predlog, K600_daily_sdlog, err_obs_iid_sigma, err_obs_iid, err_proc_iid_sigma, err_proc_iid
n_chains 4
n_cores 4
burnin_steps 500
saved_steps 500
thin_steps 1
verbose FALSE
model_path E:/R/01_my_library/streamMetabolizer/models/b_Kn_oipi_tr_plrckm.stan
Fitting time: 88.46 secs elapsed
Parameters (6 dates):
date GPP.daily GPP.daily.lower GPP.daily.upper ER.daily ER.daily.lower ER.daily.upper K600.daily K600.daily.lower K600.daily.upper msgs.fit
1 2015-09-26 0.1448576 0.1112568 0.1770925 -0.1057234 -0.1690940 -0.02520984 0.06866319 0.03919514 0.1446102 w
2 2015-09-27 0.2062770 0.1599451 0.2515621 -0.2950681 -0.3624439 -0.22003165 0.06875709 0.04028494 0.1438024 w
3 2015-09-28 0.3768529 0.3368254 0.4129556 -0.2729357 -0.3335530 -0.20351164 0.06869980 0.04081866 0.1428849 w
4 2015-09-29 0.2438324 0.2003844 0.2792674 -0.1572114 -0.2468089 -0.08244864 0.06838458 0.04089358 0.1479235 w
5 2015-09-30 0.2380278 0.2009854 0.2891546 -0.1878116 -0.2560476 -0.12553222 0.06856512 0.04044155 0.1455956 w
6 2015-10-01 NA NA NA NA NA NA NA NA NA w E
Fitting warnings:
There were 8 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
There were 8 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
http://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
There were 4 chains where the estimated Bayesian Fraction of Missing Information was low. See
http://mc-stan.org/misc/warnings.html#bfmi-low
Examine the pairs() plot to diagnose sampling problems
6 dates: overall warnings
Fitting errors:
1 date: data don't end when expected
1 date: NAs in light
Predictions (6 dates):
# A tibble: 6 x 9
date GPP GPP.lower GPP.upper ER ER.lower ER.upper msgs.fit msgs.pred
<date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 2015-09-26 0.145 0.111 0.177 -0.106 -0.169 -0.0252 "w " " "
2 2015-09-27 0.206 0.160 0.252 -0.295 -0.362 -0.220 "w " " "
3 2015-09-28 0.377 0.337 0.413 -0.273 -0.334 -0.204 "w " " "
4 2015-09-29 0.244 0.200 0.279 -0.157 -0.247 -0.0824 "w " " "
5 2015-09-30 0.238 0.201 0.289 -0.188 -0.256 -0.126 "w " " "
6 2015-10-01 NA NA NA NA NA NA w E " "
Moritz,
Yes, Kn is the pooling option I would use.
I now see that it may not be easy to do all tanks at once, because although sM does not include time in its models (today does not depend on yesterday), it does use time to decide what a ‘day’ is. You could be sneaky and give each tank its proper day and month but make up a year to distinguish tanks, i.e. tank 6 could be 2026. That way you pool across all tanks, which should be fine. I would do this, especially if individual tanks give very different values of K.
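[Editor's note: the year trick could be sketched like this; all_tanks and tank_id are hypothetical names, not from this thread:]

```r
# Sketch: distinguish tanks by fake year so one pooled model treats
# every tank-day as a distinct date. all_tanks/tank_id are hypothetical.
library(lubridate)
all_tanks$solar.time <- all_tanks$solar.time + years(all_tanks$tank_id)
# e.g. tank 6 data collected in 2015 now carries year-2021 timestamps
```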
For the prior, set:
K600_daily_meanlog_meanlog = 0
K600_daily_meanlog_sdlog = 1
Don’t worry about inits unless the MCMC chains get lost.
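[Editor's note: in specs() that might look like the following sketch, reusing the model setup Moritz posted above:]

```r
# Sketch: pass Bob's dlnorm(0, 1) prior through specs(), following the
# model setup shown earlier in this thread.
mod_name  <- mm_name(type = 'bayes', pool_K600 = 'normal')
mod_specs <- specs(mod_name,
                   K600_daily_meanlog_meanlog = 0,
                   K600_daily_meanlog_sdlog   = 1)
mod_fit   <- metab(mod_specs, data = mod.data)
```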
I see, based on the blue graph below, that you have lots of process error. There may be more processes going on in the tanks (which are probably poorly mixed) than the model is considering. Check to make sure the process error you get does not exceed the prior. It should be fine, since I think we used a half-Cauchy prior distribution for that.
Double and triple check your light data to make sure its time is correct and that this time matches that of the DO sensors. A typical problem is when folks have light in one time zone and oxygen in another due to accounting (or not) for summer-time changes.
Bob
Hi Bob,
Pooling: I tried the pooling approach by pond = year (nice idea, btw), but it didn't really affect my results for GPP and ER (K600 was always between 0.5 and 1). I think that is reasonable compared to the numbers in the literature. I was wondering whether it would make sense to fix K600 completely. Some colleagues from the aquatic physics department here at Eawag have found that there is virtually no mixing in artificial ponds. The ones they measured were much bigger (15000 L) but still remained completely stratified. Is it possible to fix K600 or make it linearly dependent on temperature? Or can I constrain the model somehow to a range of small values? Either way, I think any model returning very small K600 values should reflect the (poor) mixing dynamics in these tanks well.
Process error: I found that this depends on i) the start of my day and ii) how much I smooth my data.
If I wanted to quantify the process error (how well does the blue curve fit the dots), how might I do that?
I'm gonna close this. Thanks SO MUCH for your feedback - this was very helpful and also fun, getting the models to work. I'll share the paper when it's out. Given that more and more people do experiments in mesocosms (and I do think metabolism is somewhat under-researched here), it might be interesting and useful to have a setting to tweak the models towards experimental ponds with little or no mixing.