GLEON / LakeMetabolizer

Collection of Lake Metabolism Functions
16 stars 10 forks source link

I Botched observation vs process error, yeah? #104

Closed rBatt closed 9 years ago

rBatt commented 9 years ago

I think I screwed up folks.

Gordon, thankfully (luckily) pointed this out in his review

Say your estimate is S, your observation is Y, and your true value is X.

Simple Example

Say the true process is
X[t] = X[t-1] + eta[t]

The observations are
Y[t] = X[t] + epsilon[t]

To make things simple, say that for the first time step,
S[1] = X[1] = 0
i.e., our estimate is equal to the true value which is equal to 0.

Observation Error only NLL Function

If you fit a model like this:

S[t] = S[t-1] # loop through this
-sum(dnorm(S-Y, 0, sigma)) # this would be the NLL

It is a process-only model, and sigma is the standard deviation of epsilon. The intuition here is that in a world where there are only observation errors, our process will be exactly correct, so the only reason S would differ from Y (what's in dnorm() there) is due to observation error. Note that in metab.mle alpha is on both sides of the equation (note that this piece is actually implemented in C here).

Process Error Only NLL Function

If you fit a model like this:

S[t] = Y[t-1] # loop through
-sum(dnorm(S-Y, 0, sigma)) # nll

It is a process-only model, and sigma is the standard deviation of eta. In this scenario, we are assuming observations are perfect, so really Y == X! Therefore, if we get S[t] from Y[t-1], and things don't match up perfectly, it's not because our observation wasn't good, it's because our process connecting S to Y isn't good.

So ... what does this mean for the package?

The process error only model is not what metab.mle does. It's actually what metab.ols does. Making a simple assumption about the covariance structure of the model terms, MLE and OLS are identical (if the mean and the residuals aren't correlated). That being said, MLE does offer some flexibility over OLS.

So we can choose to program a second metab.mle where we switch out alpha for the observation, and call it the PE-only version (maybe we just make this an argument to pass to metab.mle), or we can just say we're actually presenting an OE-only model. Which, as Gordon points out, I don't think anyone has done.

So, to summarize, we have a few options:

  1. Keep all code as-is, and say
    a. metab.ols is process-error-only
    b. metab.mle is observation-error-only
    c. we'd also say that this is not the MLE from Hanson or Solomon
  2. Change metab.mle to PE-only
    a. Switch the alpha on the right-side of the equation to the observation
    b. adjust manuscript text to say that we are using observation error models
  3. Add option to metab.mle
    a. Add argument like error.type, match.arg(error.type, choices=c("OE","PE")
    b. for error.type="PE" we make change in 2
    c. for error.type="OE" we would use the current code
    d. we'd probably have 2 different NLL functions

Thoughts? Can anyone independently verify that I'm right or wrong about any of this?

rBatt commented 9 years ago

In particular, @lawinslow @pchanson @jzwart might have thoughts or preferences

lawinslow commented 9 years ago

Yeah, we botched it. Any of us could have caught it.

Due to the already-existing use of the package, I'm temped to go with option 3.

pchanson commented 9 years ago

Luke

Not sure I have the history on this. Anything important I can help with?

Paul

Sent from my U.S. Cellular® Smartphone

-------- Original message -------- From: Luke Winslow notifications@github.com Date: 08/31/2015 7:14 AM (GMT-07:00) To: GLEON/LakeMetabolizer LakeMetabolizer@noreply.github.com Cc: Paul Hanson pchanson@wisc.edu Subject: Re: [LakeMetabolizer] I Botched observation vs process error, yeah? (#104)

Yeah, we botched it. Any of us could have caught it.

Due to the already-existing use of the package, I'm temped to go with option 3.

Reply to this email directly or view it on GitHubhttps://github.com/GLEON/LakeMetabolizer/issues/104#issuecomment-136367277.

lawinslow commented 9 years ago

@pchanson History: Reviewer caught a mismatch in our code/description. We called our MLE model "pure process error", but our implementation is actually "pure observation error"

I think to address this, we just implement both OE and PE models. It's an easy way to address that review. Thoughts? I'm not aware of there being a preferred version of that model (process vs. observation error).

pchanson commented 9 years ago

interesting and important detail that probably would not change the outcome much. I think most implementations are a hybrid, with a, e.g., daily reset to observations.

Sent from my U.S. Cellular® Smartphone

-------- Original message -------- From: Luke Winslow notifications@github.com Date: 08/31/2015 7:25 AM (GMT-07:00) To: GLEON/LakeMetabolizer LakeMetabolizer@noreply.github.com Cc: Paul Hanson pchanson@wisc.edu Subject: Re: [LakeMetabolizer] I Botched observation vs process error, yeah? (#104)

@pchansonhttps://github.com/pchanson History: Reviewer caught a mismatch in our code/description. We called our MLE model "pure process error", but our implementation is actually "pure observation error"

I think to address this, we just implement both OE and PE models. It's an easy way to address that review. Thoughts? I'm not aware of there being a preferred version of that model (process vs. observation error).

Reply to this email directly or view it on GitHubhttps://github.com/GLEON/LakeMetabolizer/issues/104#issuecomment-136369991.

pchanson commented 9 years ago

and... I'm unaware of any pure PE implementations in metab, though I've paid close attention

Sent from my U.S. Cellular® Smartphone

-------- Original message -------- From: Luke Winslow notifications@github.com Date: 08/31/2015 7:25 AM (GMT-07:00) To: GLEON/LakeMetabolizer LakeMetabolizer@noreply.github.com Cc: Paul Hanson pchanson@wisc.edu Subject: Re: [LakeMetabolizer] I Botched observation vs process error, yeah? (#104)

@pchansonhttps://github.com/pchanson History: Reviewer caught a mismatch in our code/description. We called our MLE model "pure process error", but our implementation is actually "pure observation error"

I think to address this, we just implement both OE and PE models. It's an easy way to address that review. Thoughts? I'm not aware of there being a preferred version of that model (process vs. observation error).

Reply to this email directly or view it on GitHubhttps://github.com/GLEON/LakeMetabolizer/issues/104#issuecomment-136369991.

jzwart commented 9 years ago

Yeah this was a good catch. I agree that option 3 is the best, and then 1 and then 2 for the reason Luke gives. May want to keep OE as default if we go with option 3? just because others have been using it this way already..

rBatt commented 9 years ago

I agree w/ what Jake just said.

On Mon, Aug 31, 2015 at 5:38 PM, Jake Zwart notifications@github.com wrote:

Yeah this was a good catch. I agree that option 3 is the best, and then 1 and then 2 for the reason Luke gives. May want to keep OE as default if we go with option 3? just because others have been using it this way already..

— Reply to this email directly or view it on GitHub https://github.com/GLEON/LakeMetabolizer/issues/104#issuecomment-136508854 .