Closed katcatalano closed 4 years ago
Leaving this here so that these tables can be accessed if necessary.
Explanation of the issue (also posted on Slack): "My code uses fish_indiv to link observations, and it assumes that a fish that has been successfully genotyped at some point will have a gen_id rather than NA on its observations (based on the metadata, I thought that was the correct assumption). So the code only considers observations of fish that have gen_ids, meaning they have been genotyped. But suppose we observed and tagged a fish in 2015 as a juvenile and that tissue sample wasn't successfully genotyped, and then we saw it again as a parent in 2016 and that sample was successfully genotyped. The current fish_obs table links those two observations by fish_indiv, but the first observation still says 'this fish was never genotyped', when in fact it was, just after that first observation. That doesn't match my reading of the metadata, which is why I think it would be helpful to clarify this if we decide we don't want to link the fish by gen_id.

The extra fish were a separate problem. They were added to a later fish_obs table, after I had used the fish_obs table for parentage. So from that we had two problems: 1) the fish_indiv numbers were reassigned for ~100 fish, so their size/sex/ext were not correctly matched because the fish_indiv numbers were inconsistent with the previous version; 2) additional fish were added in newer iterations of the fish_obs table who were never included in the parentage analysis but should have been."
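To make the linking problem concrete, here is a minimal Python sketch of the idea, not the code used in the parentage repo. Only the names fish_indiv and gen_id come from the discussion above; the `year` field, the sample rows, and the `propagate_gen_id` helper are hypothetical. It also assumes each fish_indiv maps to at most one gen_id, which may not hold in the real table.

```python
from collections import defaultdict

# Hypothetical rows; only fish_indiv and gen_id are names from the discussion.
observations = [
    {"fish_indiv": 1, "year": 2015, "gen_id": None},  # tissue sample failed genotyping
    {"fish_indiv": 1, "year": 2016, "gen_id": 501},   # same fish, genotyped on recapture
    {"fish_indiv": 2, "year": 2015, "gen_id": None},  # never genotyped
    {"fish_indiv": 3, "year": 2016, "gen_id": 502},
]


def propagate_gen_id(rows):
    """Return copies of rows in which every observation of a fish_indiv
    carries that fish's known gen_id, if it has one."""
    # Collect the gen_ids seen for each fish_indiv across all observations.
    known = defaultdict(set)
    for row in rows:
        if row["gen_id"] is not None:
            known[row["fish_indiv"]].add(row["gen_id"])

    linked = []
    for row in rows:
        ids = known.get(row["fish_indiv"], set())
        # Assumption: one gen_id per fish_indiv; stop rather than guess otherwise.
        if len(ids) > 1:
            raise ValueError(f"fish_indiv {row['fish_indiv']} has several gen_ids: {sorted(ids)}")
        new_row = dict(row)  # copy so the input rows are left untouched
        if ids:
            new_row["gen_id"] = next(iter(ids))
        linked.append(new_row)
    return linked


linked = propagate_gen_id(observations)

# Fish whose earlier, ungenotyped observations gained a gen_id from a later one.
affected = sorted(
    {
        before["fish_indiv"]
        for before, after in zip(observations, linked)
        if before["gen_id"] is None and after["gen_id"] is not None
    }
)
```

In this toy data, fish 1 is the kind of case described above: filtering the original rows on a non-NA gen_id would drop its 2015 observation, while filtering the linked rows keeps both.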
The fish_obs table metadata has now been changed to clarify this. To see the 50 fish affected and get a table with their observations completely linked by gen_id, see katcatalano/parentage@4c92561
I did this in my parentage repo here: https://github.com/katcatalano/parentage/commit/4c92561e6241a77d1129acca3d890e183da420cf
I am figuring out the best way to get the new data into this repo, and will then close this issue.