pacific-hake / hake-assessment

:zap: :fish: Build the assessment document using latex and knitr
MIT License
13 stars 6 forks source link

In Table 23, why is female spawning biomass not 1/2 of Total spawning biomass? #820

Closed cgrandin closed 2 years ago

cgrandin commented 3 years ago

Total Spawning biomass is from base.model$extra.mcmc$timeseries$Bio_all object

Female SSB is from the posteriors.sso file, SSB_XX where XX is the year and is divided by 2,000,000 (base.model$mcmc)

Age 2+ is coming from base.model$extra.mcmc$timeseries$Bio_smry object

I think that we need to investigate this table and the following one (CI of these values)

cgrandin commented 3 years ago

Also recruitment is not the same, i.e base.model$extra.mcmc$timeseries[-c(1,2),]$Recruit_0 / 1e6 does not even come close to base.model$mcmccalcs$rmed

aaronmberger-nwfsc commented 3 years ago

See Table 23 caption that says what Total biomass is (it is not limited to spawners). I think the reason Age-2+ biomass (males and females) is nearly double the female spawning biomass is b/c the age at 50% maturity is between 2 and 3. However, it is interesting that the female SSB in 2021 is 981 and total biomass is 1789 because we assume sex ratio is 50:50 so female SSB *2 (to include males) should be < total biomass (which is age-0 and up) but it is less??? It does seem weird.

aaronmberger-nwfsc commented 3 years ago

Regarding recruitment - are they in the same unit space? Is one numbers and one biomass?

cgrandin commented 3 years ago

The SS manual just says "Recruit-0 is the recruitment of age-o fish in this year" and rmed is in millions of fish

kellijohnson-NOAA commented 3 years ago

sorry, but where is mcmccalcs coming from?

cgrandin commented 3 years ago

mcmccalcs object is built in calc.mcmc() function in load-models.R Obviously wrote it years ago because I don't use apply() anymore

cgrandin commented 3 years ago

I know some extra mcmc posteriors are missing - look in the folder and see that the report files don't start until 22. This could be it but it still seems to be too much off for just that

aaronmberger-nwfsc commented 3 years ago

Recruitment as we report it in the executive summary is numbers. In Table 23 we query the TIME_SERIES section of SS report files (RECRUIT_0) and this is a biomass summary table I believe.

I know those time series summary tables can be tricky because they often include mid year and end yr values. Kelli, do you know about the reporting in there? It's almost like the TIME_SERIES section reports differently (like year-1 or year +1; kind of an end year versus beginning of year thing??

kellijohnson-NOAA commented 3 years ago

Recruitment is reported only for the season in which spawning takes place.

On Fri, Feb 19, 2021 at 1:00 PM Aaron Berger notifications@github.com wrote:

Recruitment as we report it in the executive summary is numbers. In Table 23 we query the TIME_SERIES section of SS report files (RECRUIT_0) and this is a biomass summary table I believe.

I know those time series summary tables can be tricky because they often include mid year and end yr values. Kelli, do you know about the reporting in there? It's almost like the TIME_SERIES section reports differently (like year-1 or year +1; kind of an end year versus beginning of year thing??

— You are receiving this because you were assigned. Reply to this email directly, view it on GitHub https://github.com/pacific-hake/hake-assessment/issues/820#issuecomment-782348134, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA7LCFDOUWBQ2R63Z4SBDC3S73GQNANCNFSM4X44XAQQ .

-- Kelli Faye Johnson, PhD Research Fish Biologist Northwest Fisheries Science Center National Marine Fisheries Service 2725 Montlake Boulevard East Seattle, Washington 98112 (206) 860-3490 kelli.johnson@noaa.gov

kellijohnson-NOAA commented 3 years ago

I am in a call with Ian and Rick if you want me to ask a specific question.

aaronmberger-nwfsc commented 3 years ago

I'm wondering if the TIME_SERIES section doesn't report biomass in the middle of the year, not the end/beginning of the year.

aaronmberger-nwfsc commented 3 years ago

I don't think it is a TIME_SERIES issue, but it does seem that the Table 23 is reporting everything related to females (so dividing all biomass stuff by 2), because values look somewhat reasonable for female SSB versus female total biomass versus female age-2+ biomass. It must be that because otherwise there just not enough biomass for males (i.e., in a no sex model 2x the female spawning biomass would be the male+female SSB, but there's not enough Total Biomass to even account for that part.

kellijohnson-NOAA commented 3 years ago

It says in the table Seas 1

On Fri, Feb 19, 2021 at 2:58 PM Aaron Berger notifications@github.com wrote:

I don't think it is a TIME_SERIES issue, but it does seem that the Table 23 is reporting everything related to females (so dividing all biomass stuff by 2), because values look somewhat reasonable for female SSB versus female total biomass versus female age-2+ biomass. It must be that because otherwise there just not enough biomass for males (i.e., in a no sex model 2x the female spawning biomass would be the male+female SSB, but there's not enough Total Biomass to even account for that part.

— You are receiving this because you were assigned. Reply to this email directly, view it on GitHub https://github.com/pacific-hake/hake-assessment/issues/820#issuecomment-782431736, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA7LCFF45MLP5ES42LB2QM3S73UKLANCNFSM4X44XAQQ .

-- Kelli Faye Johnson, PhD Research Fish Biologist Northwest Fisheries Science Center National Marine Fisheries Service 2725 Montlake Boulevard East Seattle, Washington 98112 (206) 860-3490 kelli.johnson@noaa.gov

iantaylor-NOAA commented 3 years ago

@kellijohnson-NOAA pointed me to this discussion. I can't speak to the comparison of MCMC vs MLE, but I think some of the confusion on units and scaling by factor or 2 or not is related to a shift in how SS and r4ss deal with single-sex models which is discussed here: https://github.com/r4ss/r4ss/issues/248.

Here's the short version: in the past, any single sex model reported spawning biomass as the sum of fecundity * numbers at age for all sexes combined and the r4ss functions divided the SpawnBio outputs by 2 to represent female spawning biomass. More recent versions of SS have included the option for a Number of sexes = -1 where the output gets multiplied by the input sex ratio. With that option in place, the default in most r4ss (I think) is now to NOT apply a multiplier on the SSB values under the assumption that if users want everything divided by 2, they will set Nsexes = -1. image

For an empirical weight-at-age model like that used for hake, the spawning biomass be just the values in wtatage.ss associated with fleet = -2 multiplied by the numbers at age at the start of the year. And the total biomass (Bio_all) should be the fleet = 0 (beginning of the year) values from wtatage.ss multiplied by numbers at age, summed across all ages. Summary biomass should be those values summed for ages 2+ (or whichever age is specified in "min age for calc of summary biomass" in starter.ss).

iantaylor-NOAA commented 3 years ago

Also, this bit from the User Manual may or may not help. Recruits aren't listed, but fall into the same category as Numbers (values in Report.sso are in 1000s) image

andrew-edwards commented 3 years ago

Something changed, because the equivalent Table 25 in 2020 assessment seems correct:

2020 females spawning biomass = 1,196; total biomass = 2,916.

2021 assessment gives 2020 values as: 2020 females spawning biomass = 1,300; total biomass = 2,094.

which is obviously wrong if assuming a 50:50 sex ratio and a maturity ogive.

It's not a MLE v MCMC thing because medians in both assessments.

andrew-edwards commented 3 years ago

I do think we should fix this and maybe publish an Errata? We could just save the two corrected tables and get them added to the website. The numbers were correct in 2020, so it's due to a change since last year, probably what Ian said to do with SS and r4ss. Although I don't think we discuss total biomass anywhere in the assessment, we do mention their large role as predator and prey, and I think others may take the total biomass value and plug it into bigger ecosystem models. Seems best to get the correct numbers out there (and easier to do it soonish, rather than in six months time when we're working on other things).

Not sure exactly where to start. Luckily the rest of the JTC have been already assigned to this issue, ha ha.

Okay, a quick look.... Chris said that total spawning biomass comes from Bio_all in this object, where I'm just showing 2021, and the incorrect value in Table 23 is 1789110:

 base.model$extra.mcmc$timeseries[base.model$extra.mcmc$timeseries$Yr == 2021, ]
Area      Yr    Era  Seas Bio_all Bio_smry SpawnBio Recruit_0 SpawnBio_GP:1
    1  2021     FORE    1 1789110  1610230  1324450   1505100       1324450

SmryBio_SX:1_GP:1 SmryNum_SX:1_GP:1 sel(B):_1 dead(B):_1 retain(B):_1
    1610230           3263480          391681     391681       391681

sel(N):_1 dead(N):_1 retain(N):_1 obs_cat:_1        F:_1 SSB_vir_LH
   769679     769679       769679         NA    0.398117    1432830

ABC_buffer Bio_all.0.025 Bio_all.median Bio_all.0.975 Bio_smry.0.025
         1       1049650        2553530        6987787       929696.5

Bio_smry.median Bio_smry.0.975
        2284420        6265040

Given the true number should be over twice the female spawning biomass of 981,000, then either Bio_all.median or Bio_smy.median could be correct, but I don't want to guess. Not sure what Bio_all is, given there is also Bio_all.median.

Also note that SpawnBio here is 1,324,450, which is neither what we report as female spawning biomass (981,000 in the table) or a simple doubling of it. The table gets the spawning biomass from base.model$mcmccalcs$smed.

iantaylor-NOAA commented 3 years ago

One simple but time-consuming test you could do would be to change 1 #_Ngenders to -1 #_Ngenders on line 12 of hake_data.ss, then re-run the -mceval and extra mcmc calculations to see what changes in the output.

cgrandin commented 3 years ago

@aaronmberger-nwfsc or @iantaylor-NOAA

Is female spawning biomass (SSB_2021 in posteriors.sso) females age 2+ or females age 0+ ?

iantaylor-NOAA commented 3 years ago

As configured currently, SSB_2021 in posteriors.sso should be the product of the wtatage.sso vector associated with Yr = 2021 and fleet = -2 and the estimated numbers at age for that year. In contrast, the total biomass should be the product of the wtatage.sso vector associated with Yr = 2021 and fleet = 0. In both cases, I don't think there's any accounting for sex ratios (switching to nsexes = -1 would multiply the SSB_2021 value by 0.5), so any difference should be entirely due to the differences in the wtatage.ss vectors.

On Thu, Mar 11, 2021 at 1:24 PM Chris Grandin @.***> wrote:

@aaronmberger-nwfsc https://github.com/aaronmberger-nwfsc or @iantaylor-NOAA https://github.com/iantaylor-NOAA

Is female spawning biomass (SSB_2021 in posteriors.sso) females age 2+ or females age 0+ ?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/pacific-hake/hake-assessment/issues/820#issuecomment-797058862, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABGC7FWFEX42D2P455TO6ETTDERHTANCNFSM4X44XAQQ .

cgrandin commented 3 years ago

Thanks @iantaylor-NOAA I am going to follow your suggestion above to try Ngenders = -1 and see if I can break it down.

andrew-edwards commented 3 years ago

Chris - I thought on the call the other day you implied it might have been the way you changed the table construction? i.e. in the table construction function, not necessarily an SS or r4SS change? Maybe I'm misremembering

cgrandin commented 3 years ago

@andrew-edwards I checked with code from a revision (commit) from last year, and the table code is exactly the same for extracting the values.

cgrandin commented 3 years ago

I've fixed the values in the R code. It turns out there was a silly line of code which made these two columns (Total biomass and Age 2+ biomass in table 23) be the MLE values. They are now correctly the medians calculated from the extra mcmc runs. While I was at it I cleaned up the functions and the documentation for them.

Due to the above commit, the assessment will no longer build unless you run build_rds('2021.00.04_base_v1') first, which takes about 30 minutes

The Total biomass values are still not double the Female spawning biomass values, probably due to the other SS issues above. The values are actually further away from being double than they were before!

kellijohnson-NOAA commented 3 years ago

The total biomass (all ages) or summary biomass (age 2+) values will never be 2x the female spawning biomass because not all fish are mature to spawn, noted by both Ian and Aaron above. Weight-at-age for fleet 0 vs -2 can show you how much they would differ.

cgrandin commented 3 years ago

@kellijohnson-NOAA for some reason I figured total biomass meant total spawning biomass. But it all makes sense when non-spawners are included. I may see if those numbers work out.

andrew-edwards commented 3 years ago

@cgrandin - I did build_rds('2021.00.04_base_v1') but get

Error: There is a discrepancy between what you have set for the catch.levels names 
 and what appears in the forecasts directory 'C:/andy18/github/hake-assessment/models/2021.00.04_base_v1/forecasts/forecast-year-2021'. 
 Check the names in both and try again.

I think you uploaded the .rds anyway, but thought I'd mention this. I guess the iterated catch levels aren't exact between our computers.

kellijohnson-NOAA commented 2 years ago

Reopening because we are going to change the data file to use -1 for "Ngenders" or the number of sexes in the model. Note that the SS3 team is working on renaming.

aaronmberger-nwfsc commented 2 years ago

Comments on March 13 indicate this is fixed and we know have a modeling workflow that confirms -1 for "Ngenders" will result in SSB being just mature female biomass as intended.

I'll leave this open for now. @cgrandin can close once the /2 have all been removed from our hake document build code.

andrew-edwards commented 2 years ago

See also discussion in #878 and 29600a9. Plan is to fix the /2 throughout to new SS format, but then tweak values last.yr.base.model (2021 base model) after they are loaded in, so that bridging figures and Table 25 work okay.

andrew-edwards commented 2 years ago

So the mcmccalcs are being saved wrong, but they were correct last year:

> base.model$mcmccalcs$smed[c(1966, "unfished")]
    1966 unfished
0.429577 0.903665
> last.yr.base.model$mcmccalcs$smed[c(1966, "unfished")]
    1966 unfished
 0.83164  1.65773

Yet the $mcmc values are correct this year, but were double last year:

> median(base.model$mcmc$`SSB_Initial`) / 1e6
[1] 1.80733
> median(last.yr.base.model$mcmc$`SSB_Initial`) / 1e6
[1] 3.31546
andrew-edwards commented 2 years ago

Am fixing the old extra /2 values, and also fixing build_rds(). We'll have to rebuild the .rds files at some point - no rush as may have to anyway.

andrew-edwards commented 2 years ago

See 9bfca8eb and 2a49a8c1 - I referenced the wrong issue in commit message.

andrew-edwards commented 2 years ago

Need to rebuild all .rds files at some point, except last.yr.base.model:

This should automatically fix mcmccals and thus (check by comparing to last year's):

Above commits have fixed Table 25 (except now need to tweak 2021 base model results) and Tables 33-36; B0 and B_40% were all wrong.

andrew-edwards commented 2 years ago

This is fixed with new .rds files #887. And will be good for the future, as there's a /2 fudge that will only apply to a model with the name of the 2021 base model.