thefaylab / sseep-analysis

Turn stratified mean & variance calculations in the sdmTMB folder into functions for easy re-use. #8

Open gavinfay opened 1 year ago

gavinfay commented 1 year ago

For easier reuse across stocks and model fits, take the stratified mean & variance calculations from the simulation application and create a function that does the same thing.

AngeliaMiller commented 1 year ago

@gavinfay I had a go at this, see commit 4b26453; I am not sure it is quite as efficient/clean as we would like. It also does not add the calculation type at the moment (i.e., "With Wind Included", "With Wind Precluded") because I am not sure how that condition will work with the simulated datasets currently, but I wanted to get your eyes on this for the time being.

AngeliaMiller commented 1 year ago

Revisit to make the column references more flexible, so the function can be used regardless of which columns the data frame contains.

AngeliaMiller commented 7 months ago

@gavinfay,

I adjusted the stratified mean function to address the same issue you noted in the sseep-sim issue #14. It can be found with commit cf4ff1c.

I also attempted to add a function to calculate the mean relative percent difference, since that is used quite a bit throughout the repo starting in sseep-analysis/tree/analysis-testing/retro-analysis/03-strat-mu-diff.R; the addition can be found in the same commit above.

However, when I couple these function changes and apply them to the historical data to consolidate the scripts found in sseep-analysis/tree/analysis-testing/retro-analysis (efforts for consolidation found in retro-analysis-script.R), the magnitude of change in indices is quite large for some combinations of species and season. For example, the magnitude of change for spiny dogfish in the spring is on the order of 10^5 percent, much greater than a 100 percent change.

Would you mind checking my mean.diff() in StratMeanFXs_v2.R when you get a chance? I've loaded the spatially-filtered datasets that are called into retro-analysis-script.R.

Also to note, some of the estimates of abundance are a little over double when we make the change in the stratified mean function (see below).

When we use the full survey area in the stratified mean calculation:

When we use the total area of only the strata constituting the 95% cumulative biomass in the stratified mean calculation:

gavinfay commented 7 months ago

@AngeliaMiller Wrong commit # reference I think.

For both of these, I suggest following the advice from coursework: design tests for your function(s) using small toy datasets where you know what the outputs should be.
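As a rough illustration of that kind of test, here is a minimal sketch in Python (not the repo's R code). It assumes the standard area-weighted stratified mean, with the variance of the mean taken as the sum over strata of W_h^2 * s_h^2 / n_h and no finite population correction. The function name, the `stratum_key` / `catch_key` arguments, and the toy values are all hypothetical; the key arguments are only there to show how column names can be passed in rather than hard-coded.

```python
from collections import defaultdict
from statistics import mean, variance


def stratified_mean_var(tows, strata_area, stratum_key="stratum", catch_key="catch"):
    """Area-weighted stratified mean catch per tow and the variance of that mean.

    tows: list of dicts, one per tow.
    strata_area: dict mapping stratum id -> stratum area.
    The key arguments name the fields to read, so nothing is hard-coded.
    """
    catches = defaultdict(list)
    for tow in tows:
        catches[tow[stratum_key]].append(tow[catch_key])

    # Assumption: the total area is the summed area of the strata that
    # actually appear in `tows`, so the weights always sum to 1.
    total_area = sum(strata_area[h] for h in catches)

    strat_mean = 0.0
    strat_var = 0.0
    for h, values in catches.items():
        weight = strata_area[h] / total_area
        n = len(values)
        strat_mean += weight * mean(values)
        # Sample variance needs at least two tows; single-tow strata add 0.
        s2 = variance(values) if n > 1 else 0.0
        strat_var += weight ** 2 * s2 / n
    return strat_mean, strat_var


# Toy dataset small enough to work through by hand.
toy_tows = [
    {"stratum": "A", "catch": 2.0},
    {"stratum": "A", "catch": 4.0},
    {"stratum": "B", "catch": 0.0},
    {"stratum": "B", "catch": 0.0},
    {"stratum": "B", "catch": 6.0},
]
toy_area = {"A": 100.0, "B": 300.0}

mu, var = stratified_mean_var(toy_tows, toy_area)

# Hand calculation: weights are 0.25 (A) and 0.75 (B).
# Stratum means are 3 and 2, so the mean is 0.25*3 + 0.75*2 = 2.25.
# Stratum variances are 2 and 12, so the variance of the mean is
# 0.25^2 * 2/2 + 0.75^2 * 12/3 = 0.0625 + 2.25 = 2.3125.
assert abs(mu - 2.25) < 1e-9
assert abs(var - 2.3125) < 1e-9
```

The same pattern carries over to R: build a handful of rows where the answer can be worked out on paper, then check the function against it before running it on the survey data.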

When you say

some of the estimates of abundance are a little over double when we make the change in the stratified mean function (see below).

I assume you are referring to the magnitude of the mean catch rate? I see very small differences between the plots other than the y axis scale. This (increase in the mean) is not unexpected given you are eliminating large numbers of tows with a catch of zero in the revised calculations.
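To make that concrete, here is a small sketch with made-up numbers (not the survey data). It uses a plain unweighted mean for brevity, which is a simplification of the stratified calculation, and the "core" / "outer" labels are hypothetical stand-ins for strata inside and outside the 95% cumulative biomass set.

```python
from statistics import mean

# Hypothetical tows as (stratum group, catch). The "outer" tows are all
# zeros, standing in for strata where the species is rarely caught.
tows = [
    ("core", 4.0), ("core", 6.0), ("core", 0.0), ("core", 2.0),
    ("outer", 0.0), ("outer", 0.0), ("outer", 0.0),
    ("outer", 0.0), ("outer", 0.0),
]

# Mean over every tow versus mean over the core tows only.
full_mean = mean(catch for _, catch in tows)
core_mean = mean(catch for group, catch in tows if group == "core")

# Same total catch, fewer tows in the denominator, so the mean goes up.
ratio = core_mean / full_mean
print(f"full: {full_mean:.3f}  core only: {core_mean:.3f}  ratio: {ratio:.2f}")
```

Here the total catch is unchanged (12) but the number of tows drops from 9 to 4, so the mean rises from about 1.33 to 3.0, a little over double, with no change in the underlying pattern.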

gavinfay commented 7 months ago

For the dogfish example you mention, I assume this is due to a very low (close to zero) mean catch rate in one year. Even a small absolute change becomes a very large relative difference when the reference value is small: a small number divided by a very, very small number is going to be large.
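A quick sketch of the arithmetic, assuming the relative percent difference is defined as 100 * (alternative - reference) / reference. That definition is an assumption here and may differ from what mean.diff() actually computes; the function name and the values are made up for illustration.

```python
def relative_percent_diff(reference, alternative):
    """Percent change of `alternative` relative to `reference` (assumed form)."""
    if reference == 0:
        raise ValueError("relative difference is undefined for a zero reference")
    return 100.0 * (alternative - reference) / reference


# Same absolute change of 0.5 in both cases; only the reference differs.
typical_year = relative_percent_diff(10.0, 10.5)       # about 5 percent
near_zero_year = relative_percent_diff(0.0001, 0.5001)  # about 5 x 10^5 percent

print(f"typical year:   {typical_year:.1f}%")
print(f"near-zero year: {near_zero_year:.0f}%")
```

If a single year like this is averaged in with the rest, it can dominate the mean relative percent difference, so it is worth checking the per-year values before looking at the mean.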