cschwarz-stat-sfu-ca / BTSPAS

Bayesian Time Stratified Petersen Analysis System
1 stars 3 forks source link

Issue with PredictivePosteriorPlot.TSPNDE when bad.u2 is not null #28

Open kalebentley opened 3 years ago

kalebentley commented 3 years ago

Hi Carl,

I work for the Washington Department of Fish and Wildlife and a group of us have been using your BTSPAS package (version "2014.0901") for approximately the past five years.

I recently tried to use the non-diagonal function (_TimeStratPetersenNonDiagErrorfit) in the most up-to-date version of BTSPAS (version "2020.9.1") and I am having an issue when I specify maidens (u2) as NA for any period (i.e., bad.u2 is not null). Specifically, the JAGS model will run/finish and outputs will start to be generated but eventually, an error will occur saying "Error in if (sum(temp) > 0 ) [ missing value where TRUE/FALSE needed }.

To debug this issue, I opened the non-diagonal vignette and reproduced the error with the demo dataset (Conne River 1987 Atlantic Salmon Smolts). Again, the model runs and produces all output files unless "bad.u2" is not null. It doesn't matter what u2 period(s) is/are specified as bad (NA) but I arbitrarily chose jday 135 to reproduce the error/issue.

To isolate the issue, I "loaded" all necessary functions and ran the _TimeStratPetersenNonDiagErrorfit function line-by-line. It appear that the issue is occurring in the PredictivePosteriorPlot.TSPNDE function and specifically on line 23 (if(sum(temp)>0){cat(sum(temp), " infinite discrepancy measures set to NA\n")}). By adding a ", na.rm = TRUE inside the sum() argument, the error goes away and the plotting function works. However, this doesn't solve the actual problem. The issue appears to be with the calculation of "discrep" in the PredictivePosterior.TSPNDE function and specifically NA are generated for "d2.u2.o" and thus also "d2.o" (which is d2.m2.o + d2.u2.o).

This same issue (model runs but script ultimately fails during the Predictive Posterior part of the code) if recaptures are specified as NA. Here the error occurs in the dmultinom function on line 82 and/or line 95 of the PredictivePosterior.TSPNDE function. I noticed in the updated version of the model, recaptures are no longer specified as NA if bad/missing but rather 0 (same with marks -- left at zero instead of switching to 1). Once I noticed this change in the non-diagonal "fit" function, I changed my data summarization code so bad/missing recaptures were left as 0 as opposed to NA and this issue seems to have gone away.

Let me know if there is any other information I can provide. Unless the issue is fixed (or figure out what I'm doing wrong), we won't be able to use the updated BTSPAS package as almost every trapping location experienced outages due to COVID-related shutdowns in the spring of 2020.

I appreciate your help. Take care.

Kale

cschwarz-stat-sfu-ca commented 3 years ago

Thanks Kate...I'll have a look in the next bit but can't guarantee a solution before Xmas...Thanks for delving deep into the code.

Carl

cschwarz-stat-sfu-ca commented 3 years ago

This was caused by an error in how the discrepancy statistic was computed for the non-diagnonal case. If u2 is missing, then the u2- E[u2 | model parameters] is also missing and so the sum of the statistic over all time periods is also missing. This caused the error as noted. This is now fixed. The diagonal case routines already had the fix done.

Carl Schwarz

cschwarz-stat-sfu-ca commented 3 years ago

Thanks Kale...I think I've pushed a fix to GitHub, so if you could download the latest version from GitHub and try it out that would be helpful.

I went through all of the routines and standardized how to deal with bad data. n1 and m2. Set to 0. JAGS can now deal properly with a binomial/multinomial with n=0 and x=0 regardless of the value of the probability. u2 set to missing. Don't set u2 to 0 because this then is treated as 0 captured which is not correct.

So if you are stratifying by week and for some reason have no releases in week 2, set n1[2] to 0 and of course m2[2,...] = 0 as well in the data. If u2 is also missing in week2, set it to missing (not to zero).

In theory, it is possible to have n1 >0 and m2 = missing (e.g. you released fish but didn't check for marks). I think setting n1=xxx and m2=NA should also work (don't use the bad.m2 argument). I don't remember how extensively we have tested this case...

Carl.

On Mon, Dec 14, 2020 at 5:49 PM Kale Bentley notifications@github.com wrote:

Hi Carl,

I work for the Washington Department of Fish and Wildlife and a group of us have been using your BTSPAS package (version "2014.0901") for approximately the past five years.

I recently tried to use the non-diagonal function ( TimeStratPetersenNonDiagError_fit) in the most up-to-date version of BTSPAS (version "2020.9.1") and I am having an issue when I specify maidens (u2) as NA for any period (i.e., bad.u2 is not null). Specifically, the JAGS model will run/finish and outputs will start to be generated but eventually, an error will occur saying "Error in if (sum(temp) > 0 ) [ missing value where TRUE/FALSE needed }.

To debug this issue, I opened the non-diagonal vignette and reproduced the error with the demo dataset (Conne River 1987 Atlantic Salmon Smolts). Again, the model runs and produces all output files unless "bad.u2" is not null. It doesn't matter what u2 period(s) is/are specified as bad (NA) but I arbitrarily chose jday 135 to reproduce the error/issue.

To isolate the issue, I "loaded" all necessary functions and ran the TimeStratPetersenNonDiagError_fit function line-by-line. It appear that the issue is occurring in the PredictivePosteriorPlot.TSPNDE function and specifically on line 23 (if(sum(temp)>0){cat(sum(temp), " infinite discrepancy measures set to NA\n")}). By adding a ", na.rm = TRUE inside the sum() argument, the error goes away and the plotting function works. However, this doesn't solve the actual problem. The issue appears to be with the calculation of "discrep" in the PredictivePosterior.TSPNDE function and specifically NA are generated for "d2.u2.o" and thus also "d2.o" (which is d2.m2.o + d2.u2.o).

This same issue (model runs but script ultimately fails during the Predictive Posterior part of the code) if recaptures are specified as NA. Here the error occurs in the dmultinom function on line 82 and/or line 95 of the PredictivePosterior.TSPNDE function. I noticed in the updated version of the model, recaptures are no longer specified as NA if bad/missing but rather 0 (same with marks -- left at zero instead of switching to 1). Once I noticed this change in the non-diagonal "fit" function, I changed my data summarization code so bad/missing recaptures were left as 0 as opposed to NA and this issue seems to have gone away.

Let me know if there is any other information I can provide. Unless the issue is fixed (or figure out what I'm doing wrong), we won't be able to use the updated BTSPAS package as almost every trapping location experienced outages due to COVID-related shutdowns in the spring of 2020.

I appreciate your help. Take care.

Kale

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/cschwarz-stat-sfu-ca/BTSPAS/issues/28, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABXIXRTRUSNTXSES6BXO5IDSU26AXANCNFSM4U3TJYBA .

kalebentley commented 3 years ago

Hi @cschwarz-stat-sfu-ca,

Great! Thanks for looking into this issue and fixing it at record speed. I will test out the updated version of the package in the next couple of days and report back any issues.

Kale

kalebentley commented 3 years ago

Hi @cschwarz-stat-sfu-ca,

I finally had some time to explore the BTSPAS functions after you made several updates last week. The change you made to the discrepancy statistic calculation for the non-diagonal now allows for missing maiden data (i.e., u2 = NA), which is great.

Upon re-running my various juvenile data sets, I came across a couple of other issues that I wanted to bring to your attention:

  1. As I mentioned before, manually amending individual periods of marks and recaptures in a dataset as "bad" by setting marks to 1 and recaptures as NA still seems to cause an issue with the non-diagonal Posterior Predictive function. This isn't an issue now that I know to leave marks and recaptures as 0, 0 but figured I'd highlight this nonetheless.

  2. The way u2copy is specified, the value can go negative. Interestingly, and I don't totally understand why, but it appears as though the diagonal model will run just fine with a negative u2copy value but the non-diagonal model will not. I ran into this issue with one of my data sets where I was trying to interpolate missed catch at the end of the run by adding several periods of NAs for u2 (similar to your vignette here). The error I got said, "Error in node epsilon[XX] Node inconsistent with parents" and the model would not run. I tried to duplicate the error with your demo data sets but couldn't. However, I may have figured out a potential solution to the perhaps rare issue. Your code creates a copy of u2 using the following code:

    • u2copy <- exp(spline(x = 1:length(u2), y = log(u2+1), xout = 1:length(u2))$y)-1
    • u2copy <- round(u2copy) If the spline generates a value less than roughly -0.7, the above code will result in a u2copy of -1, which again, seems to cause an issue with the non-diagonal model. I suggest changing the u2copy code to:
    • u2copy <- ceiling(exp(spline(x = 1:length(u2), y = log(u2+1), xout = 1:length(u2))$y))-1
  3. The last issue I ran into may be related to the second item listed above but, unfortunately, I haven't totally figured this one out. In short, when I go to add "extra" NAs at the end of the run, I sometimes get an error. For example, in one case, I added five NA u2 periods at the end of the dataset and the model ran five. I then added a sixth NA u2 period and the following error: image This error seems to be occurring somewhere in the "_fit" function after the model has finished and all of the plots have been generated. I tried to comb through the remaining code/functions but couldn't locate the exact location of the error (I don't see the function "grouped.data" being used). Anyhow, this isn't a huge deal but again figured I'd mention it if it potentially made any sense to you.

Thanks again for helping fix the previous issue with "bad" u2 data! I was able to run all of my 2020 abundance estimates using the updated BTSPAS package I downloaded off of GitHub.

Happy Holidays!

Kale

kalebentley commented 3 years ago

Hi @cschwarz-stat-sfu-ca, I wanted to check back in and see if you have had any time to read through the update I sent you on this issue #25 back in late December. If nothing else, I was wondering if you may consider updating how u2copy is calculated using my suggestion. If it would be helpful, I could submit a pull request.
Hope you are doing well. Thanks for your time and help. Kale

cschwarz-stat-sfu-ca commented 3 years ago

Hi Kale. sorry, I've been working 24/7 on a number of contracts that are due at the end of the Cdn fiscal year (31 March)... I haven't forgotten about it... Is this a critical issue, i.e. stopping you from proceeding?

Carl.

On Fri, Jan 29, 2021 at 11:58 AM Kale Bentley notifications@github.com wrote:

Hi @cschwarz-stat-sfu-ca https://github.com/cschwarz-stat-sfu-ca, I wanted to check back in and see if you have had any time to read through the update I sent you on this issue #25 https://github.com/cschwarz-stat-sfu-ca/BTSPAS/pull/25 back in late December. If nothing else, I was wondering if you may consider updating how u2copy is calculated using my suggestion. If it would be helpful, I could submit a pull request. Hope you are doing well. Thanks for your time and help. Kale

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/cschwarz-stat-sfu-ca/BTSPAS/issues/28#issuecomment-770017857, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABXIXRVNJH5QBPNUMIC343DS4MHPJANCNFSM4U3TJYBA .

kalebentley commented 3 years ago

Hi @cschwarz-stat-sfu-ca, Thanks for the reply and absolutely no worries. This is not a critical issue but rather one that would be convenient to have fixed at some point in the future. In other words, not a huge rush. We were able to finalize all of our 2020 estimates late last year, and wouldn't you know it, 2021 trapping has already begun. Take care. Kale

kalebentley commented 2 years ago

Hi @cschwarz-stat-sfu-ca, I wanted to quickly check in again to see if you may have time to revisit the follow-up "issue" I submitted on Dec. 23, 2020. Our juvenile monitoring teams are starting to analyze the data collected this year and it would be helpful, if nothing else, to update how u2copy is calculated (using my suggestion) so that the initial values cannot be negative, which leads to the model not initiating.
Thanks so much.
Kale

cschwarz-stat-sfu-ca commented 2 years ago

Hi Kale.. thanks for the nudge...

I think I've patched BTSPAS to fix the problems you identified..

Upon re-running my various juvenile data sets, I came across a couple of other issues that I wanted to bring to your attention:

1.

As I mentioned before, manually amending individual periods of marks and recaptures in a dataset as "bad" by setting marks to 1 and recaptures as NA still seems to cause an issue with the non-diagonal Posterior Predictive function. This isn't an issue now that I know to leave marks and recaptures as 0, 0 but figured I'd highlight this nonetheless.

This has been patched. Try the attached script.

1.

The way u2copy is specified, the value can go negative. Interestingly, and I don't totally understand why, but it appears as though the diagonal model will run just fine with a negative u2copy value but the non-diagonal model will not. I ran into this issue with one of my data sets where I was trying to interpolate missed catch at the end of the run by adding several periods of NAs for u2 (similar to your vignette here https://cran.r-project.org/web/packages/BTSPAS/vignettes/f-Interpolating-run-earlier-and-later.html). The error I got said, "Error in node epsilon[XX] Node inconsistent with parents" and the model would not run. I tried to duplicate the error with your demo data sets but couldn't. However, I may have figured out a potential solution to the perhaps rare issue. Your code creates a copy of u2 using the following code:

This has been patched. I set the v2copy u2copy <- pmax(0, round(u2copy)) So you should not get negative values of u2copy anymore.

  1. The last issue I ran into may be related to the second item listed above but, unfortunately, I haven't totally figured this one out. In short, when I go to add "extra" NAs at the end of the run, I sometimes get an error. For example, in one case, I added five NA u2 periods at the end of the dataset and the model ran five. I then added a sixth NA u2 period and the following error: [image: image] https://user-images.githubusercontent.com/40397762/103031927-aa5f4e80-4513-11eb-8828-b0b431303500.png This error seems to be occurring somewhere in the "_fit" function after the runs and all of the plots have been generated. I tried to comb through the remaining code/functions but couldn't locate the exact location of the error (I don't see the function "grouped.data" being used). Anyhow, this isn't a huge deal but again figured I'd mention it if it potentially made any sense to you.

Sorry, but I was unable to re-create this problem. Do you have sample file where this happens? The grouped.data() function is used in the RunTiming routine. I suspect that some estimates of U are NA which upsets the function, but can't be sure and can't generate a case where this happens - sorry.

You can access the revised BTSPAS at the GitHub site. The version number is 2021.11.1 (the target date for uploading to CRAN).

I'm making some other (minor) cosmetic changes so will push the BTSPAS to CRAN later this month.

Please let me know if you run into any other problems.

Thanks for your continuing interest in BTSPAS

On Tue, Sep 21, 2021 at 3:13 PM Kale Bentley @.***> wrote:

Hi @cschwarz-stat-sfu-ca https://github.com/cschwarz-stat-sfu-ca, I wanted to quickly check in again to see if you may have time to revisit the follow-up "issue" I submitted on Dec. 23, 2020. Our juvenile monitoring teams are starting to analyze the data collected this year and it would be helpful, if nothing else, to update how u2copy is calculated (using my suggestion) so that the initial values cannot be negative, which leads to the model not initiating. Thanks so much. Kale

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/cschwarz-stat-sfu-ca/BTSPAS/issues/28#issuecomment-924426677, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABXIXRVSVTZS4XSQLECSZCTUDD7RJANCNFSM4U3TJYBA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

kalebentley commented 2 years ago

Excellent. I will give the latest version of BTSPAS a try in the very near future.

As for the last issue I highlighted re: adding extra NAs to a dataset. In case I wasn't clear, the I am trying to implement is effectively what you showcase in your vignette "Interpolating run early and late". Enough time has passed since I ran into the problem that I don't exactly recall how I was specifying maidens, marks, and recaps in the "fake data". Nonetheless, I'll revisit and respond if I can duplicate the issue.

Thank you so much, Carl!

Kale