pfmc-assessments / PacFIN.Utilities

R code to manipulate data from the PacFIN database for assessments
http://pfmc-assessments.github.io/PacFIN.Utilities
Other
7 stars 1 forks source link

Add flag for years with large proportions of unsexed fish #50

Open chantelwetzel-noaa opened 3 years ago

chantelwetzel-noaa commented 3 years ago

I think there should be a flag somewhere if a large proportion of samples from a single year are of unsexed fish. Users may want to treat those data in a different way. I encountered an issue where samples from early years were all for unsexed fish and I was not paying attention. I ran the getComps function and then doSexRatio which split the data to a sex. I spent a whole afternoon trying to sort out why these select years had composition data that looked so strange.

I think a flag would have helped me identify the issue easier. Additionally, it would be nice to have the functionality to not assign a sex via doSexRatio and have the writeComps function spit out unsexed composition data.

kellijohnson-NOAA commented 1 month ago

@chantelwetzel-noaa Thanks for the idea, sorry I moved it continually from milestone to milestone 🤕. I am wondering if we should get rid of doSexRatio() all together. Or at least that is the feeling that I have. I think that unsexed fish should remain unsexed and be put into the model that way rather than making up data.

brianlangseth-NOAA commented 1 month ago

Some species have few unsexed fish compared to sexed, and vice versa. Although dropping the unsexed fish samples is an option, I dont particularly like that, so Ive used the option to group the unsexed fish into sexed fish knowing that the choice isn't overly impactful on the model and it simplifies the data structure. Thus I am in favor of not getting rid of doSexRatio() but allowing functionality to not assign a sex.

iantaylor-NOAA commented 1 month ago

There's additional discussion about this on this flowchart: https://docs.google.com/presentation/d/1RnfNzP6Nlyp_b4W2yZjbmBrvm5jUe0L8OJxO633dH1k/edit?pli=1#slide=id.g12ab1bd588c_1_5.

I'm convinced that keeping unsexed fish separate or ignoring them (depending on the proportion) are the better ways to go, but I support Brian's wish to give the author the option to partition them into the sexed comps. I don't see that partitioning being fundamentally different than the other expansions that we do.

chantelwetzel-noaa commented 1 month ago

After a couple of years of thinking about this, I am supportive of eliminating sex ratio functionality. However, as evidenced by @brianlangseth-NOAA comment above not everyone is on the same page. I think the other option would be to create a universal sex ratio function that users outside of the composition could call functions for both survey and commercial data but with the default functionality of the composition functions to return composition data for all sexes in the data passed to the function. If we go this route I don't think we would need to have a flag so this specific issue could be dismissed.

iantaylor-NOAA commented 1 month ago

@chantelwetzel-noaa, I like your solution, which would also get rid of this problem you described in a comment on the flowchart: "In the survey data, there is an additional level of complexity here. There are 2 option of how to apply the sex ratio 1) apply the sex ratio observed in the same tow to assign sexes or 2) apply the sex ratio across all years observed in the length bin."

However, if we think that partitioning the unsexed fish is going to be phased out over time, is it worth investing the time in building a new function? I'm guessing it would only be worthwhile if doSexRatio() and the equivalent in {nwfscSurvey} require additional maintenance to keep around.

kellijohnson-NOAA commented 1 month ago

I think that the best option via the discussion above is to remove anything related to assigning something a sex if it is unsexed from within PacFIN.Utilities and allow users to assign unsexed fish a sex if they want ahead of time. This could be via some potential future function or user-specific code. The code will always remain in the git history that way if we want to look at it later, it will still be available. I would suggest that a new function, if made, is placed in PEP tools rather than within nwfscSurvey or PacFIN.Utilities.

I plan on HARD deprecating this function doSexRatio() next week. Please let me know as soon as possible with a thumbs down to this comment if this does not work for you.

brianlangseth-NOAA commented 1 month ago

Im struggling to understand all the implication of this. Im going to assume that if doSexRatio() is deprecated then write_comps will be modified to output unsexed fish as that remains the second of the two elements suggested in this issue originally. Thus the main effect would be that there is no current functionality to combine unsexed fish into sexed groups. Based on this discussion, that doesn't happen much, but is it a lot of work to move doSexRatio() into PEPtools as opposed to fully deprecate so that the functionality is maintained?

kellijohnson-NOAA commented 1 month ago

Unsexed fish will be written as unsexed fish. It is not a lot of work to move doSexRatio() to PEPtools but whoever does so would need to fix it to work with datasets other than PacFIN and agree to maintain it there.

iantaylor-NOAA commented 1 month ago

I took a look at all 16 instances of doSexRatio( on github: https://github.com/search?q=doSexRatio%28&type=code.

The three instances where it was not commented out are

There are surely many older assessments not on github that used the function, but if someone wants to reproduce them they would likely have to install an older version of PacFIN.Utilities anyway.

Since @kellijohnson-NOAA does 99% of the work developing and maintaining PacFIN.Utilities, I'm OK going along with whatever she wants. I think that "hard deprecating" means that it won't work. But if someone wants to use it, they can copy the code from PacFIN.Utilities (from git history if necessary) into the code for their assessment and use it or modify it however they wish.

kellijohnson-NOAA commented 1 month ago

Thanks @iantaylor-NOAA for doing that background work. And, yes you are right, I plan on essentially removing the function from the package and giving the user an error saying that it is no longer available. But, yes anyone can always copy the code from the git history.

iantaylor-NOAA commented 1 month ago

To make things extra easy, here's the permanent link to the function prior to deprecation: https://github.com/pfmc-assessments/PacFIN.Utilities/blob/19209f0e8e8aa259d513a0d944df703b3ac4cef6/R/doSexRatio.R. It might no longer work with future revisions to PacFIN.Utilities, but it's not a long or complex function so probably could be fixed without too much work. At this point we've surely collectively spent more time thinking about unsexed fish than the impact of those fish on the assessments warrant.