Found one problem. Previously, we had `delay_daily = floor(delay)`, but this is different from what we actually observe, `stime_daily - ptime_daily`. For example, if `ptime = 1.9` and `stime = 2`, then the true delay is 0.1 and the daily delay is `2 - 1 = 1`, whereas `floor(delay) = 0`. Fixing this might allow some of the discrete methods to work better.
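A quick standalone illustration of the discrepancy (plain R; `ptime_daily` and `stime_daily` are the floored event times, as in the linelist):

```r
ptime <- 1.9  # primary event time
stime <- 2.0  # secondary event time
delay <- stime - ptime       # true delay: 0.1

ptime_daily <- floor(ptime)  # 1
stime_daily <- floor(stime)  # 2

floor(delay)                 # 0: not what we observe
stime_daily - ptime_daily    # 1: the daily delay we actually observe
```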
`stime_daily - ptime_daily` is breaking something now... need to fix... figured it out!
Previously, we had
```r
truncated_linelist <- linelist |>
  data.table::copy() |>
  # Update observation time by when we are looking
  DT(, obs_at := obs_time) |>
  DT(, obs_time := obs_time - ptime) |>
  # I've assumed truncation in the middle of the censoring window.
  # For discussion.
  DT(, censored_obs_time := obs_time - (ptime_daily + 0.5)) |>
  DT(, censored := "interval") |>
  DT(stime <= obs_time)
```
But `obs_time` in the definition of `censored_obs_time` was picking up the per-row `obs_time` recomputed as `obs_time - ptime` two lines above, rather than the function argument. So I changed it to `DT(, censored_obs_time := obs_at - (ptime_daily + 0.5))`. Running again now.
Changing `delay = stime_daily - ptime_daily` and `DT(, censored_obs_time := obs_at - (ptime_daily + 0.5))` fixed something, because we're now getting good estimates during the decay phase, when there should be barely any truncation. But it looks like the other methods aren't working (all giving the same estimates?). Two conclusions:
Fixed it. `DT(stime <= obs_time)` was problematic for the same reason as above; changing it to `DT(stime <= obs_at)` fixed it.
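Putting the fixes together, the corrected pipeline should look roughly like this (a sketch consolidating the changes described above, not copied from the repo):

```r
truncated_linelist <- linelist |>
  data.table::copy() |>
  # Keep the observation time passed in as the function argument
  DT(, obs_at := obs_time) |>
  # Per-row time from the primary event to observation
  DT(, obs_time := obs_time - ptime) |>
  # Use obs_at, not the per-row obs_time recomputed above; truncation is
  # assumed to fall in the middle of the censoring window
  DT(, censored_obs_time := obs_at - (ptime_daily + 0.5)) |>
  DT(, censored := "interval") |>
  # Filter on the fixed observation time, again via obs_at
  DT(stime <= obs_at)
```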
The `r = 0` scenario still looks buggy. It turned out the filtering on `stime` was causing the problem; I fixed it to filter based on `ptime`, since we want early cohorts that are not subject to truncation biases.
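For illustration, the change amounts to something like the following (`cohort_cutoff` is a hypothetical name, not the exact repo code):

```r
# Before: selecting the cohort by secondary event time conditions on the
# delay having completed early, which reintroduces a truncation-like bias
early_cohort <- linelist |> DT(stime <= cohort_cutoff)

# After: selecting by primary event time depends only on when cases
# started, not on their delays, so it avoids that bias
early_cohort <- linelist |> DT(ptime <= cohort_cutoff)
```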
New results with new parameters.
Ran a new simulation with a larger sample size. Also changed the labels on the x-axis and put the parameter names in the title.
Nice work. This is all looking good.
Compare the sdlog estimates for the naive vs. censoring-adjusted fits in the stable scenario: accounting for censoring gives better sdlog estimates. This might explain why the naive truncation model gives bad estimates for sdlog.
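A standalone sketch of the effect (plain R, not the package models; parameter values are made up): delays are lognormal but observed rounded down to whole days, and an interval-censored likelihood should recover sdlog better than treating the rounded values as exact.

```r
set.seed(1)
sdlog_true <- 0.5
delay <- rlnorm(5000, meanlog = 0.5, sdlog = sdlog_true)
obs <- floor(delay)  # daily-censored observation

# Naive fit: treat rounded delays as exact (midpoint offset avoids log(0))
naive_sdlog <- sd(log(obs + 0.5))

# Interval-censored fit: each delay lies in [obs, obs + 1)
nll <- function(par) {
  p <- plnorm(obs + 1, par[1], exp(par[2])) - plnorm(obs, par[1], exp(par[2]))
  -sum(log(pmax(p, 1e-12)))
}
cens_sdlog <- exp(optim(c(0, 0), nll)$par[2])

c(truth = sdlog_true, naive = naive_sdlog, censored = cens_sdlog)
```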
I think this is plausible, though on the face of it I would have also expected a bias in the meanlog.
The latent model looks like it slightly underestimates the sdlog - plausibly the issue with a uniform prior showing up?
> The latent model looks like it slightly underestimates the sdlog - plausibly the issue with a uniform prior showing up?
I was wondering about this as well. I think the problem is that we can't tell from this one particular simulation: is this just an unlucky sample (95% intervals should contain the truth 95% of the time, so are we just in the 1-in-20 case), or is the method actually doing something wrong?
Either way, I've run the full pipeline on more samples etc. here: https://github.com/parksw3/dynamicaltruncation/pull/20
It has some issues with the filtering models, as they can end up with no samples and therefore throw errors.
As I said elsewhere, we might want to do some more formal simulation-based calibration to get a handle on this, but since SBC adds quite a lot of overhead, it seems like a good idea to reuse what we are running anyway if we can.
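For reference, a minimal standalone sketch of the SBC idea (the lognormal toy model and the normal approximation to the posterior are placeholders, not the package models): draw the parameter from its prior, simulate data, fit, and record the rank of the truth among posterior draws; across replicates the ranks should be roughly uniform if the method is calibrated.

```r
set.seed(1)
sbc_ranks <- vapply(seq_len(200), function(i) {
  sdlog_true <- rexp(1, rate = 2)                    # stand-in prior
  x <- rlnorm(100, meanlog = 1, sdlog = sdlog_true)  # simulated delays
  # Placeholder posterior: normal approximation around the MLE of sdlog;
  # in practice this would be the Stan fit's posterior draws
  mle <- sd(log(x))
  draws <- rnorm(1000, mle, mle / sqrt(2 * length(x)))
  sum(draws < sdlog_true)                            # rank of the truth
}, numeric(1))
hist(sbc_ranks, breaks = 20)  # roughly flat if calibrated
```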
Going to close this for now. Don't think we need to look at these again.