tmcd82070 / CAMP_RST

R code for CAMP rotary screw trap platform
Mokelumne River CAMP.mdb producing new, previously unseen error message #47

Closed jasmyace closed 9 years ago

jasmyace commented 9 years ago

Issue #2: the Mokelumne River CAMP.mdb is producing the error message provided in the screenshot below; Connie and I have not seen this error message before. To avoid the problem identified in Issue #1, and before attempting to run a report that would result in Issue 2, I imported the temp tables into the Mokelumne River CAMP.mdb.

Using the same criteria for the same site and with a 2004/2005 field season I get no error, but with a 2005/2006 field season I get the error below. The error also appears with all the field seasons after 2005/2006 for the Golf site on the Mokelumne River. Connie has checked the post-2004/2005 data and she has not been able to find a data issue that would appear to need attention. Please ID the issue causing this kind of abort. If it appears to be a data-related issue, call Connie to discuss any remedy she may need to deal with. If it is an R issue, please rectify it.

[image] [image]

jasmyace commented 9 years ago

run_R_Moke error.txt

tmcd82070 commented 9 years ago

It seems that the R code is expecting a fork length during some visit, but did not find one. I base this on:

    Error in $<-.data.frame(*tmp*, "forkLength", value = NA) :
      replacement has 1 row, data has 0
    Calls: F.run.passage -> F.summarize.fish.visit -> $<- -> $<-.data.frame

Here, R is attempting to assign fork length, but there is no fork length to assign.
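A minimal sketch of how this kind of message can arise, assuming a toy catch table in which no row matches the "Unassigned" filter. The data are made up, and only the column names Unassd and forkLength are taken from this thread.

```r
# Hypothetical stand-in for the catch table (values invented for illustration).
catch <- data.frame(
  Unassd     = c("Fry", "Fry", "Fry"),
  forkLength = c(NA, 30, 35)
)

# No row has Unassd == "Unassigned", so the selected subset has zero rows.
sel <- catch$Unassd == "Unassigned"
nrow(catch[sel, ])

# Assigning a length-one value into a column of that zero-row subset fails;
# tryCatch captures the error message instead of halting the script.
msg <- tryCatch({
  catch[sel, ]$forkLength <- NA
  "no error"
}, error = function(e) conditionMessage(e))
msg
```

The point of the sketch is that the failure depends on whether any rows match the filter, not on the NA values themselves, which would be consistent with some seasons running cleanly and others aborting.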

I note that several NAs appear in the forkLength column of the catch table:

     8 Fall Fry NA Yes Fry
    19 Fall Fry 30 Yes Fry
    21 Fall Fry 35 Yes Fry
    22 Fall Fry NA Yes Fry
    25 Fall Fry 35 Yes Fry
    27 Fall Fry NA Yes Fry
    28 Fall Fry 35 Yes Fry

Tasks:

tmcd82070 commented 9 years ago

Connie's email from 09 Sep 15

In this dataset the life stage and run are consistently assigned; however, the fork lengths are sometimes missing. This will happen in other watersheds as well. The 2006 test I ran on the Golf trap included three dates when only one Chinook was caught. In each case the fish was assigned a run and life stage but not measured. I don’t think this is the problem, since 2005 and 2012 also have similar records and were run successfully, unless some, but not all, records of this type cannot be resolved by the code.

Jason found an unidentified error while running tests. His comment contains key words found in the original error “fork length” and “NA”. I don’t understand the comment but he handled it here in the summarize_fish_visit.r file line 20.

    # Jason add - first pass assumed that unassigned fork length is NA. but...not NA for one day in run I’m investigating. force this.
    catch[catch$Unassd == 'Unassigned',]$forkLength <- NA

When I comment out line 20 in this file, the analysis seems to run smoothly until the end, when it tries to summarize passage. The new error is:

    Error in F.est.passage(catch.df.ls, release.df, "year", out.fn.root, ci) :
      Issue with summation of assignedCatch, unassignedCatch, inflatedCatch, imputedCatch, and/or totalCatch. Investigate est_passage.R, around line 176.
    Calls: F.run.passage -> F.est.passage
    Execution halted

In regards to this second error, is it possible that the code considers these fish as “assigned” in one area (both run and life stage are assigned) and “unassigned” (no fork length present) in another and so the tally comes out wrong causing the code to fail?
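To make the suspected double count concrete, here is a small sketch. It assumes, purely for illustration, that the tally should satisfy totalCatch == assignedCatch + unassignedCatch; the actual check near line 176 of est_passage.R also involves inflatedCatch and imputedCatch and is not reproduced here. The data and the two classification rules are hypothetical.

```r
# Three hypothetical visits with one Chinook each; two fish were not measured.
fish <- data.frame(
  FinalRun   = c("Fall", "Fall", "Fall"),
  lifeStage  = c("Fry", "Fry", "Fry"),
  forkLength = c(NA, 35, NA)
)

# Rule used in one place (assumed): run and life stage are both known.
assigned_by_run <- fish$FinalRun != "Unassigned" & fish$lifeStage != "Unassigned"

# Rule used in another place (assumed): no fork length means unassigned.
unassigned_by_fl <- is.na(fish$forkLength)

totalCatch      <- nrow(fish)
assignedCatch   <- sum(assigned_by_run)
unassignedCatch <- sum(unassigned_by_fl)

# The unmeasured fish satisfy both rules, so the parts exceed the total.
balanced <- (assignedCatch + unassignedCatch) == totalCatch
balanced
```

Under these assumptions the two unmeasured fish are counted once as assigned and once as unassigned, so the parts sum to 5 against a total of 3.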

I put the Mokelumne database on the Camp_Admin site in For Jason folder. They do some other types of trapping so just run locations with the “RST” in the site name. Golf is the one that has both successes and failures and it only has one subsite.

I also put the most recent version of the QC application in there. The reports view all Chinook and Chinook crosstab will help you figure out what the differences might be. I didn’t see anything unusual. The first time you open the QC application you will need to re-establish a link to the CAMP.mdb. A window should come up prompting you to do this. If it doesn’t just use the Check or Change Links button at the bottom of the main form.

jasmyace commented 9 years ago

Connie previously noted that this is tied to fish whose lifeStage and Run are assigned, but for which measurement is not obtained. I feel like this is a rare occurrence; however, given the volume of fish we've historically processed, I find it strange that this particular bug has not arisen before.

I first made the condition that checks for Unassigned fish smarter: it now applies only if there are data records with a value of "Unassigned," based on the criterion in lines 20-22 of program summarize.fish.visit. This got rid of the reported error, which was fine until the accounting error message appeared, indicating erroneous tabulation of (in this case) assigned versus unassigned fish.
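A guard of this kind might look like the sketch below. The helper name blank_unassigned_fl is hypothetical and this is not the code in summarize.fish.visit; it only illustrates applying the assignment when matching records exist.

```r
# Hypothetical helper, not a function from the CAMP_RST code base.
blank_unassigned_fl <- function(catch) {
  # Logical index of "Unassigned" records; NA labels are treated as no match.
  sel <- !is.na(catch$Unassd) & catch$Unassd == "Unassigned"
  if (any(sel)) {
    # Index the column vector directly, which avoids the zero-row subset.
    catch$forkLength[sel] <- NA
  }
  catch
}

# Case 1: no "Unassigned" records, so the table comes back unchanged.
catch <- data.frame(
  Unassd     = c("Fry", "Fry", "Fry"),
  forkLength = c(NA, 30, 35)
)
no_unassigned <- blank_unassigned_fl(catch)

# Case 2: one "Unassigned" record whose fork length gets forced to NA.
catch2 <- rbind(catch, data.frame(Unassd = "Unassigned", forkLength = 42))
with_unassigned <- blank_unassigned_fl(catch2)
```

Indexing the column vector with a logical index is already safe when nothing matches, so the explicit any() check is mostly there to document the intent.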

I then updated line 117/118 of program summarize.fish.visit to deal with this new issue. In this step, the unassigned counts are parsed out following processing, i.e., counting, getting their mean, etc. The criterion for this was to pull those records for which the calculated mean was NaN; in this case, this identifies unassigned fish, based on the processing in this program. However, for the fish encountered here, both the lifeStage and Run are always something other than "Unassigned," so these are actually assigned fish. This means that this part of the program tabulates these fish as unassigned here and as assigned a little bit later. This double counting is what triggered the fish-accounting error. So, in addition to ID-ing fish based on this NaN value, I also updated the criterion to keep only fish where either (or both) of FinalRun and lifeStage are == "Unassigned." This ensures that these assigned fish don't get tallied as unassigned, and it allowed the run.passage program to complete as expected (from within R).
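The difference between the old and new criteria can be sketched as follows. The summary table and the column names n and mean.fl are hypothetical; FinalRun and lifeStage are the names used above. This is an illustration of the logic described, not the actual lines from summarize.fish.visit.

```r
# The mean of fork lengths that are all missing is NaN once NAs are dropped.
mean(c(NA_real_, NA_real_), na.rm = TRUE)

# Hypothetical per-visit summary: row 2 is an assigned fish never measured.
visit <- data.frame(
  FinalRun  = c("Fall", "Fall", "Unassigned", "Fall"),
  lifeStage = c("Fry", "Fry", "Unassigned", "Unassigned"),
  n         = c(12, 1, 5, 2),
  mean.fl   = c(34.5, NaN, NaN, NaN)
)

# Old criterion (as described): NaN mean alone marks a record as unassigned.
old_unassigned <- is.nan(visit$mean.fl)

# New criterion (as described): NaN mean AND run and/or life stage "Unassigned".
new_unassigned <- is.nan(visit$mean.fl) &
  (visit$FinalRun == "Unassigned" | visit$lifeStage == "Unassigned")

which(old_unassigned)
which(new_unassigned)
```

In this toy table the old criterion flags rows 2, 3, and 4, while the new one flags only rows 3 and 4, leaving the unmeasured Fall Fry record to be tallied as assigned only.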

While this update fixed this particular issue, we will need to be very careful about monitoring its effect on other rivers/runs. While I don't expect this change to have adverse effects elsewhere, we will need to keep this Issue open until after we do the Big Testing Run again, just to be sure.

jasmyace commented 9 years ago

mokelumne_testing_2005-12-01_2006-07-30fall_catch mokelumne_testing_2005-12-01_2006-07-30fall_eff Mokelumne_testing.xlsx

tmcd82070 commented 9 years ago

What is the difference between the red line, and the open circles in the first graph? Jason and Trent need to jog Trent's memory here.

jasmyace commented 9 years ago

I hope to address Trent's question from 6 days ago shortly.

In other news... How far in time does the Mokelumne database go? I set up the automatic big-run loop to process a 12/1/2012-7/30/2013 run and a 12/1/2013-7/30/2014 run (in addition to several other runs starting with 2005), but I got an error in the R program that says there are no trap visits. Not quite sure where to look for this in the raw database tables, so thought I would check.

If these are not legit dates, then it looks like my fix (of my previous fix, which turned out to not work in all cases) may be complete.

jasmyace commented 9 years ago

I currently make the super smoother (the function that makes the red line) iterate through the total catch. This is in line with what it's always been, although long ago, this was simply called catch. I was able to run, for Fall, these data, using the set of code that was in place before I started tinkering with it, and obtained the exact same total catch, via cell-by-cell comparison of the "totalCatch" column (what we currently call it), with that of just "catch," (or what we used to call it). Practically, this means the placement of the dots coincides with both versions.

The difference in the smoother is due to my not including totalCatch data points for days on which no fish were measured. I believe this is an artifact from when we were tinkering around with what the plot shows. At one point, we were considering plotting only measured fish, so as to ensure the y-axis descriptor matches what is actually plotted. I suspect this measured-day restriction was part of that upgrade (which was subsequently rescinded). Removing this restriction (which Trent never had) should make the two graphs (and their super-smoothed lines) coincide. That is the reason for the discrepancy in the super-smoothed lines.
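As a rough illustration of why the restriction moves the red line, the sketch below fits a super smoother to all daily totals and then to only the days flagged as having measured fish. It assumes the line comes from something like stats::supsmu, which this thread does not confirm; the simulated catch series and the measured flag are invented.

```r
set.seed(1)

# Simulated daily total catch over a 60-day season (invented data).
day        <- 1:60
totalCatch <- round(50 * exp(-((day - 30) / 12)^2)) + rpois(60, 2)

# Hypothetical flag: TRUE on days with at least one measured fish.
measured <- day %% 5 != 0

# Smoother through every totalCatch point.
fit_all <- supsmu(day, totalCatch)

# Smoother through only the measured days, i.e. the restricted version.
fit_measured <- supsmu(day[measured], totalCatch[measured])

# The restricted fit sees fewer points, so its curve can differ.
length(fit_all$x)
length(fit_measured$x)
```

The plotted dots are the same in both cases; only the set of points handed to the smoother changes, which matches the description of coinciding dots and differing lines.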

However, the super-smoothed line does go through the total catch of fish. Do you want the super-smoother line to trend through a different set of data, e.g., the imputed points instead?

passage_golf_2015-10-12_16-08-09_fall_catch

ConnieShannon commented 9 years ago

Jason. Regarding "how far back do the Mokelumne data go." I recommend using the QC application to check which trapping locations are being used as well as for the presence of efficiency tests. I put a copy in your ForJason folder in the Admin_ftp a while ago.

The application will link to any CAMP backend with any name or location. Use the button Check or Change Links at the bottom of the main form to link to the database you want to see.

No criteria are required in the main form. For your purposes you would leave these fields null and then open either of the following reports. Use the Trap Visit Crosstab and Efficiency Test Crosstab reports to find data to run.

As with many of the RST programs, the location of the traps in the Mokelumne has changed over the years, so looking at the data in the QC application can be very helpful. The most recent data in the Mokelumne database are from June 2012.

There are also other types of trapping in the Mokelumne data. Maybe just run Golf from fall 2004 through July 2012.

jasmyace commented 9 years ago

Okay thanks. I will take a look.

So far, outside of the negative minutes recorded in a different Issue, the Big Loop program has been running without error. So, with luck, we'll be able to release the update tomorrow.

jasmyace commented 9 years ago

I've run the Big Loop enough, without error, to feel comfortable that this Issue is resolved.