cschwarz-stat-sfu-ca / BTSPAS

Bayesian Time Stratified Petersen Analysis System
1 stars 3 forks source link

Fit does not go to 0 at end of series #31

Closed cschwarz-stat-sfu-ca closed 2 years ago

cschwarz-stat-sfu-ca commented 2 years ago

This was moved from Issue #30 to create new issue.

However, I am running into another issue with the data set I previously shared and was curious if you have any insight.

In short, the model is having issues converging on a "believable" estimate when I specify maiden catch (u2) as "bad" for 1-2 periods near the end of the time series (e.g., bad.u2 <-c(10,11)). For context, I want to identify these periods as bad because streamflow got so low that the trap essentially stopped spinning and yearling migrants can then avoid the trap.

When I run the model without specifying periods 10 and/or 11 as bad, the model estimates abundance declining towards zero in the final periods, which is what we would expect. However, when I specify u2 in periods 10 and/or 11 as bad, the model estimates that abundance increases quite dramatically towards the tail end of the time series (and the precision blows up). I've run this model with up to 500K iterations/250 burn-in per chain (4 chains) and still have issues. Looking at the model outputs, there are clearly some issues (e.g., estimates of logitP >5000 and < -5000 - agh!!).

This specific time series is pretty short (only 13 periods) and the 1-2 bad periods are right after peak outmigration both of which perhaps are contributing to the estimation problems. To troubleshoot, I toyed around with adding additional "fake periods" of zero catch with the idea of helping the model realize catch should decline to zero but this doesn't seem to help.

Here's the problematic dataset: table_Run5_Final_Catch_Summary.csv table_Run5_Final_Catch_Summary.csv

cschwarz-stat-sfu-ca commented 2 years ago

The problem is that there is no evidence of a decline in run when the heavy rains start. So the spline does not "know" that it needs to dip down at the end.

Easiest solution is the the "1" trick in the last vignette where you add a fake release with a recovery of 1 fish that is marked and no unmarked fish so that BTSPAS thinks that the P(capture) is high with 0 unmarked seen which implies that the run must also be slow.

Try the code below:

# BTSPAS problem

library(BTSPAS)
library(readxl)

testdata <- read.csv("table_Run5_Final_Catch_Summary.csv", header=TRUE, as.is=TRUE, strip.white=TRUE)

n1 <- testdata$Marks
m.col <- names(testdata)[grepl("Period_Plus",names(testdata))]
m2 <- as.matrix(testdata[,m.col])
u2 <- testdata$Maidens
Period_model <- testdata$Final_Period

n.chain<-4
n.iter<-200000
n.burnin<-100000
n.sims<-500
(n.thin<-(n.iter - n.burnin)/n.sims)
(n.draws<-n.sims*n.chain)
bad.n1 <- NULL # Denote marks by "Final_Periods" as "bad"
bad.m2 <- c(10) # Denote recaps by "Final_Periods" as "bad"
bad.m2 <- NULL
bad.u2 <- c(10) # Denote maidens by "Final_Periods" as "bad"
bad.u2 <- NULL
# include_flow_covar<-c("No") # Enter "Yes" or "No"
# include_panel_covar<-c("No") # Enter "Yes" or "No"

start.time<-Sys.time()
BTSPAS_Res <-
try(TimeStratPetersenNonDiagError_fit(
title= "BTSPAS_NonDiag"
, prefix= "BTSPAS problem"
, time= Period_model
, n1= n1
, m2= m2
, u2= u2
#, sampfrac= sampfrac
#, jump.after= jump.after
, bad.n1= bad.n1
, bad.m2= bad.m2
, bad.u2= bad.u2
#, logitP.cov = logitP.cov
, n.chains = n.chain
, n.iter = n.iter
, n.burnin = n.burnin
, n.sims = n.sims
, tauU.alpha = 1
, tauU.beta = 0.05
, taueU.alpha = 1
, taueU.beta = 0.05
# , mu_xiP = logit(sum(m2, na.rm = TRUE)/sum(n1, na.rm = TRUE))
# , tau_xiP = 1/var(logit((m2 + 0.5)/(n1 + 1)), na.rm = TRUE)
, tauP.alpha = 0.001
, tauP.beta = 0.001
, run.prob = seq(0, 1, 0.1)
, debug = FALSE
, debug2 = FALSE
#, InitialSeed= ceiling(runif(1,min=0, max=if(engine=="jags"){1000000}else{14}))
))

# the fit has abundance increasing without end at the end of the study
BTSPAS_Res$plots$fit.plot

# the logit(P) show p estimated at 0 or 1, not good
BTSPAS_Res$plots$logitP.plot

rownames(BTSPAS_Res$summary)
BTSPAS_Res$summary[ grepl("logitP",rownames(BTSPAS_Res$summary)) |
                    grepl("Ntot"  ,rownames(BTSPAS_Res$summary)),]

#---------------------------------------------------
# Set up bad counts at sample times 10 and 11 because water flow was so large

bad.u2 <- c(10,11) # Denote maidens by "Final_Periods" as "bad"

BTSPAS_Res2 <-
  try(TimeStratPetersenNonDiagError_fit(
    title= "BTSPAS_NonDiag"
    , prefix= "BTSPAS problem2"
    , time= Period_model
    , n1= n1
    , m2= m2
    , u2= u2
    #, sampfrac= sampfrac
    #, jump.after= jump.after
    , bad.n1= bad.n1
    , bad.m2= bad.m2
    , bad.u2= bad.u2
    #, logitP.cov = logitP.cov
    , n.chains = n.chain
    , n.iter = n.iter
    , n.burnin = n.burnin
    , n.sims = n.sims
    , tauU.alpha = 1
    , tauU.beta = 0.05
    , taueU.alpha = 1
    , taueU.beta = 0.05
    # , mu_xiP = logit(sum(m2, na.rm = TRUE)/sum(n1, na.rm = TRUE))
    # , tau_xiP = 1/var(logit((m2 + 0.5)/(n1 + 1)), na.rm = TRUE)
    , tauP.alpha = 0.001
    , tauP.beta = 0.001
    , run.prob = seq(0, 1, 0.1)
    , debug = FALSE
    , debug2 = FALSE
    #, InitialSeed= ceiling(runif(1,min=0, max=if(engine=="jags"){1000000}else{14}))
  ))

# the fit has abundance increasing without end at the end of the study
BTSPAS_Res2$plots$fit.plot

# the logit(P) show p estimated at 0 or 1, not good
BTSPAS_Res2$plots$logitP.plot

rownames(BTSPAS_Res2$summary)
BTSPAS_Res2$summary[ grepl("logitP",rownames(BTSPAS_Res2$summary)) |
                     grepl("Ntot"  ,rownames(BTSPAS_Res2$summary)),]

#---------------------------------------------------
# Set up bad counts at sample times 10 and 11 because water flow was so large
# force curve to go to zero

n1[12] <- 1
m2[12,1] <- 1
u2[12] <- 0

BTSPAS_Res3 <-
  try(TimeStratPetersenNonDiagError_fit(
    title= "BTSPAS_NonDiag"
    , prefix= "BTSPAS problem2"
    , time= Period_model
    , n1= n1
    , m2= m2
    , u2= u2
    #, sampfrac= sampfrac
    #, jump.after= jump.after
    , bad.n1= bad.n1
    , bad.m2= bad.m2
    , bad.u2= bad.u2
    #, logitP.cov = logitP.cov
    , n.chains = n.chain
    , n.iter = n.iter
    , n.burnin = n.burnin
    , n.sims = n.sims
    , tauU.alpha = 1
    , tauU.beta = 0.05
    , taueU.alpha = 1
    , taueU.beta = 0.05
    # , mu_xiP = logit(sum(m2, na.rm = TRUE)/sum(n1, na.rm = TRUE))
    # , tau_xiP = 1/var(logit((m2 + 0.5)/(n1 + 1)), na.rm = TRUE)
    , tauP.alpha = 0.001
    , tauP.beta = 0.001
    , run.prob = seq(0, 1, 0.1)
    , debug = FALSE
    , debug2 = FALSE
    #, InitialSeed= ceiling(runif(1,min=0, max=if(engine=="jags"){1000000}else{14}))
  ))

# the fit has abundance increasing without end at the end of the study
BTSPAS_Res3$plots$fit.plot

# the logit(P) show p estimated at 0 or 1, not good
BTSPAS_Res3$plots$logitP.plot

rownames(BTSPAS_Res2$summary)
BTSPAS_Res3$summary[ grepl("logitP",rownames(BTSPAS_Res3$summary)) |
                     grepl("Ntot"  ,rownames(BTSPAS_Res3$summary)),]

#---------------------------------------------------
# Set up bad counts at sample times 10 and 11 because water flow was so large
# force curve to go to zer0
# remove the bad u2 values

bad.u2 <- NULL

n1[12] <- 1
m2[12,1] <- 1
u2[12] <- 0

BTSPAS_Res4 <-
  try(TimeStratPetersenNonDiagError_fit(
    title= "BTSPAS_NonDiag"
    , prefix= "BTSPAS problem2"
    , time= Period_model
    , n1= n1
    , m2= m2
    , u2= u2
    #, sampfrac= sampfrac
    #, jump.after= jump.after
    , bad.n1= bad.n1
    , bad.m2= bad.m2
    , bad.u2= bad.u2
    #, logitP.cov = logitP.cov
    , n.chains = n.chain
    , n.iter = n.iter
    , n.burnin = n.burnin
    , n.sims = n.sims
    , tauU.alpha = 1
    , tauU.beta = 0.05
    , taueU.alpha = 1
    , taueU.beta = 0.05
    # , mu_xiP = logit(sum(m2, na.rm = TRUE)/sum(n1, na.rm = TRUE))
    # , tau_xiP = 1/var(logit((m2 + 0.5)/(n1 + 1)), na.rm = TRUE)
    , tauP.alpha = 0.001
    , tauP.beta = 0.001
    , run.prob = seq(0, 1, 0.1)
    , debug = FALSE
    , debug2 = FALSE
    #, InitialSeed= ceiling(runif(1,min=0, max=if(engine=="jags"){1000000}else{14}))
  ))

# the fit has abundance increasing without end at the end of the study
BTSPAS_Res4$plots$fit.plot

# the logit(P) show p estimated at 0 or 1, not good
BTSPAS_Res4$plots$logitP.plot

rownames(BTSPAS_Res2$summary)
BTSPAS_Res4$summary[ grepl("logitP",rownames(BTSPAS_Res4$summary)) |
                     grepl("Ntot"  ,rownames(BTSPAS_Res4$summary)),]
kalebentley commented 2 years ago

Hi Carl,

I looked through your response a couple of weeks ago but have been slow to respond as I am still having issues with the data set I shared with you before (reattached here - table_Run5_Final_Catch_Summary.csv)

In short, I've tried the "1" trick you suggested but I am still getting wonky results with the non-diagonal model (i.e., the estimates of abundance and capture probability will not converge). Specifically, at least one of the chains ultimately does not mix for despite trying up to 1 M iterations with 900K burn-in per chain and I end up with <50 effective draws. I've never run into this issue before.

Interestingly, the diagonal model converges but there are definitely some delayed recaptures and the total estimate of abundance is quite a bit different between the two models (assuming the non-diagonal total estimate is close to "truth").

One obvious thing about this data set is that the sample sizes for non-diagonal recaptures are pretty small and limited to just a few periods. While this pattern isn't atypical at our traps, and I've never had this sort of issue before, out of curiosity I manually edited the data set to combine off-diagonal recaptures from the "plus 2" and "plus 3" periods. Unfortunately, this did not help. I tried other tricks like this trying to find the "weak link" but no luck.

Again, I've never run into this issue before with the non-diagonal model and so it's a bit puzzling. Any additional thoughts or suggestions would be greatly appreciated.

Kale

cschwarz-stat-sfu-ca commented 2 years ago

Hi Kale... just cleaning up before the new year... I think I got this work using the non-diagonal case previously, but you may have missed it in the threads..... try the following code..

On Mon, Nov 29, 2021 at 11:48 AM Kale Bentley @.***> wrote:

Hi Carl,

I looked through your response a couple of weeks ago but have been slow to respond as I am still having issues with the data set I shared with you before (reattached here - table_Run5_Final_Catch_Summary.csv https://github.com/cschwarz-stat-sfu-ca/BTSPAS/files/7620699/table_Run5_Final_Catch_Summary.csv )

In short, I've tried the "1" trick you suggested but I am still getting wonky results with the non-diagonal model (i.e., the estimates of abundance and capture probability will not converge). Specifically, at least one of the chains ultimately does not mix for despite trying up to 1 M iterations with 900K burn-in per chain and I end up with <50 effective draws. I've never run into this issue before.

Interestingly, the diagonal model converges but there are definitely some delayed recaptures and the total estimate of abundance is quite a bit different between the two models (assuming the non-diagonal total estimate is close to "truth").

One obvious thing about this data set is that the sample sizes for non-diagonal recaptures are pretty small and limited to just a few periods. While this pattern isn't atypical at our traps, and I've never had this sort of issue before, out of curiosity I manually edited the data set to combine off-diagonal recaptures from the "plus 2" and "plus 3" periods. Unfortunately, this did not help. I tried other tricks like this trying to find the "weak link" but no luck.

Again, I've never run into this issue before with the non-diagonal model and so it's a bit puzzling. Any additional thoughts or suggestions would be greatly appreciated.

Kale

— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub https://github.com/cschwarz-stat-sfu-ca/BTSPAS/issues/31#issuecomment-981961041, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABXIXRQ2G3ZBDLIQWS26GV3UOPKKVANCNFSM5H5SMY3Q . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.