Issues from this morning

bpbond commented 1 year ago

[x] Better comment lines 59-66 (explain where 'magic numbers' come from)
[ ] Does AP_P get used at all?
[x] Put in a comment what you explained so brilliantly: why only 13C is used for k0 estimation

bpbond commented 1 year ago

Hi @kendalynnm I took another hard look at the code and what it's doing, opened a PR improving a few things, and have some thoughts.

Predicted concentrations are way off

> incdat_out
# A tibble: 5 × 12
# Groups:   id [1]
  id    round   vol time_days cal12CH4ml cal13CH…¹ AP_obs     mt     nt AP_pred    Pt    Ct
  <chr> <chr> <dbl>     <dbl>      <dbl>     <dbl>  <dbl>  <dbl>  <dbl>   <dbl> <dbl> <dbl>
1 52    T0       10    0            12.1     0.170   1.39   12.3  0.170    1.39  0     NaN 
2 52    T1       10    0.0382       38.6     0.447   1.15   63.0  0.742    1.18  3.34 -614.
3 52    T2       10    0.0653       49.5     0.559   1.12  187.   2.11     1.13  2.37 -318.
4 52    T3       10    0.0944       59.0     0.659   1.11  595.   6.51     1.09  2.55 -240.
5 52    T4       10    0.122        64.0     0.711   1.10 1733.  18.5      1.07  2.37 -102.

Note how the predictions (mt and nt) are very different, except at t=0, from the observations. As a result, of course the predicted production (Pt) and consumption (Ct) numbers are screwy (as we looked at yesterday).

Why are we optimizing against AP?

~I guess there's an infinite number of combinations of P and k that will match some set of observed AP values.~ This isn't true, because here P doesn't produce any labeled methane.

Why aren't we constraining the model against cal12CH4ml and cal13CH4ml? I don't remember...was this basic idea taken from the Brewer code?

From paragraph 22:

We calculate P recursively to find the best simultaneous fits of equations (5) and (11) to the measures of the amount and isotopic composition of methane during the incubation.

Ah ha! Yes, they're optimizing against both AP and total methane.

`P` doesn't match up with the `mt`+`nt` time series

Just to expand on the point above, we get this for sample 52:

# A tibble: 1 × 6
  id        P     k     k0 convergence message                                        
  <chr> <dbl> <dbl>  <dbl>       <int> <chr>                                          
1 52     87.5 -39.4 -0.001           0 CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH

Okay. We would expect that over the first tilmestep (t0->t1) this should mean production of 0.0382 days * 87.5 ml/day = 3.3425 ml total CH4, right? But in that period observed total CH4 jumps from 12.5 to 63.7, i.e. by 51.3. Way bigger. 😬

bpbond commented 1 year ago

Ah, I knew there was something I was forgetting.

Should k be positive?

The classic formulation of a first-order rate constant is something like exp(-kt), wth k positive. Should it be here, too? Note that a number of the equations follow this form:

    mt <- (P/k - (P/k - m0) * exp(-k * time))

...and k values in Table 1 of VFH2002 are positive!

bpbond commented 1 year ago

There is production of labeled methane in sample 52

cal13CH4ml increases over time in sample 52, the only sample this happens in.

Conversely, "there is no production of labeled methane during incubation" (paragraph 15 in VFH2002).

If this is correct we can't use the simplified equation 9, but rather need to include a production term there, too (see preceding paragraph 15).

bpbond commented 1 year ago

Where things are with the code currently pushed to #15 branch:

We now optimize against both m (total methane) and n (labeled methane)
k0 estimation fixed and handles both positive and negative slopes
k is (always) positive and nonzero
Labeled methane now has a production term (versus vFH2002 Eq. 8 which is assumes no 13CH4 production)

This code now produces excellent fits for total methane: m_pred

For AP (atom fraction of 13C), however, about 25% of the samples don't get good results: ap_pred

Next step: figure out what distinguishes those poor-fit samples. Progress!

bpbond commented 1 year ago

ap_fits

bpbond commented 1 year ago

Most of this has been resolved, so closing. Will open a new issue for graph above.

COMPASS-DOE / grossMethane