pfmc-assessments / PacFIN.Utilities

R code to manipulate data from the PacFIN database for assessments
http://pfmc-assessments.github.io/PacFIN.Utilities

Missing age3 - changed to pacfin doesn't have all double read data #35

Closed chantelwetzel-noaa closed 1 year ago

chantelwetzel-noaa commented 3 years ago

There used to be an age3 column for a third otolith read. This column is not in the new bds pull. Does it no longer exist? I suspect this column rarely held any information. If it is no longer provided, then we need to update the cleanPacFIN function so that it no longer looks for reads in the age3 column.

kellijohnson-NOAA commented 3 years ago

It is only generated if there are three ages of a fish. The code allows for a dynamic generation of the age columns, where age5 would be allowed if one fish were aged 5 times. Thanks for the heads up about the downstream effects of this. I haven't looked into them yet. I will modify cleanPacFIN() as suggested.
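One way downstream code could cope with a variable number of age columns is to find them by name pattern rather than hard-coding `age3`. The sketch below is only an illustration and is not the actual cleanPacFIN() code; the data frame `bds` and the column `n_reads` are hypothetical names made up for the example.

```r
# Hypothetical wide-format data with however many age columns the pull produced.
bds <- data.frame(
  FISH_ID = 1:3,
  age1 = c(7, 4, 10),
  age2 = c(8, NA, 11)
)

# Find the age columns by pattern instead of naming age1, age2, age3 directly.
age_cols <- grep("^age[0-9]+$", names(bds), value = TRUE)

# Count how many reads are available for each fish.
bds$n_reads <- rowSums(!is.na(bds[, age_cols, drop = FALSE]))
```

Because the columns are matched by pattern, the same code keeps working whether the pull returns two age columns or six.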

mhaltuch commented 3 years ago

Hello again, I think that we should probably still be pulling the age3 column. This was used by WDFW to denote petrale samples that had two surface reads and one break and burn read (age3). Perhaps this workaround for the PacFIN data structure has been corrected in recent years. I'm not sure if this column was used for other species. In my experience, the data here were only from WDFW. It would be worth going back to Teresa Tsou to ask.

Cheers, Melissa

kellijohnson-NOAA commented 3 years ago

Thanks @mhaltuch for the comments. For clarification, there has never been an age3 column in PacFIN to pull. This column is generated by the code that pulls data. A single fish can have multiple rows of data in PacFIN, because it was read more than one time. We create a new column to condense the fish down to a single row of data. The new code only produces the age3 column if there are three reads of at least one fish for the given species of interest.

The benefit of the new code is that it will create as many columns as it needs. So, if there are six reads of a fish, it will produce age1, age2, age3, age4, age5, age6. I thought this would be better for those looking at ageing error. Please correct me if I am wrong and we don't want information on reads beyond the third.
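As a rough illustration of the idea (not the code used in the package), base R's `reshape()` can condense multiple rows per fish into one row with one age column per read. The data frame `reads` and its values are hypothetical; only AGE_SEQUENCE_NUMBER and FISH_ID are names taken from this thread.

```r
# Hypothetical long-format data: one row per age read of a fish.
reads <- data.frame(
  FISH_ID = c(1, 1, 1, 2, 3, 3),
  AGE_SEQUENCE_NUMBER = c(1, 2, 3, 1, 1, 2),
  age = c(7, 8, 7, 4, 10, 11)
)

# Condense to one row per fish. reshape() creates one age column for each
# sequence number that occurs in the data, so age3 only appears because
# at least one fish has three reads.
wide <- reshape(
  reads,
  idvar = "FISH_ID",
  timevar = "AGE_SEQUENCE_NUMBER",
  direction = "wide",
  sep = ""
)
```

Here fish 2 has a single read, so its age2 and age3 values are `NA`. If no fish had a third read, the age3 column would not be created at all.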

mhaltuch commented 3 years ago

This sounds great Kelli, we do want to keep all of the age reads!

m

kellijohnson-NOAA commented 3 years ago

@mhaltuch mentioned that having the lab per read would be good information as well!

kellijohnson-NOAA commented 3 years ago

All information by age read is now separated into columns whose names end in a "." separator followed by the number that was in AGE_SEQUENCE_NUMBER. This is done by SAMPLE_YEAR and FISH_ID. See the notes in PullBDS.PacFIN and the code within that function. In theory, we no longer have to ask for age reads because the information is contained in the bds pull.
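A minimal sketch of that naming scheme, assuming base R's `reshape()` rather than whatever PullBDS.PacFIN actually does internally. The columns `age_read` and `lab` are hypothetical stand-ins for the per-read fields, and the values are made up.

```r
# Hypothetical long-format data: one row per read, several fields per read.
reads <- data.frame(
  SAMPLE_YEAR = c(2019, 2019, 2020, 2020, 2020),
  FISH_ID = c(1, 1, 1, 2, 2),
  AGE_SEQUENCE_NUMBER = c(1, 2, 1, 1, 2),
  age_read = c(5, 6, 9, 3, 3),
  lab = c("A", "B", "A", "A", "B")
)

# One row per SAMPLE_YEAR and FISH_ID; every per-read field gets a
# ".<sequence number>" suffix, e.g. age_read.1, lab.1, age_read.2, lab.2.
wide <- reshape(
  reads,
  idvar = c("SAMPLE_YEAR", "FISH_ID"),
  timevar = "AGE_SEQUENCE_NUMBER",
  direction = "wide",
  sep = "."
)
```

Keying on both SAMPLE_YEAR and FISH_ID keeps fish 1 from 2019 separate from fish 1 from 2020 in this toy data.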

Patrick noted that he doesn't send all double reads to the state-specific databases. I will work on getting these to PacFIN. kellijohnson-NOAA sent reminder email on 2022-05-04.