nmfs-ost / ss3-source-code

The source code for Stock Synthesis (SS3).
https://nmfs-ost.github.io/ss3-website/
Creative Commons Zero v1.0 Universal
38 stars 16 forks source link

[Bug]: rho parameters not impacting semi-parametric (2D-AR) selectivity when multiple sigma values are used #615

Closed iantaylor-NOAA closed 2 months ago

iantaylor-NOAA commented 3 months ago

Describe the bug

This issue was found in discussion with @rosana-ourens who successfully set up a model with semi-parametric selectivity where there were multiple sigma values to allow more flexible deviations for the youngest ages. However, the rho parameters did not seem to have any impact on the results.

I replicated the issue in the attached models, modified from https://github.com/nmfs-ost/ss3-test-models/tree/main/models/Hake_2019_semi-parametric_selex, which have a similar set up where sigma_amax = 3 and rho_year and rho_AGE are both fixed at either 0 or 0.5: sigma_3_rho_0.zip sigma_3_rho_0.5.zip

The likelihoods differ between models but the difference is all due to the Parm_devs component as shown in the comparison below. image

Changing sigma_amax from 3 to 0 and reducing from 3 to 1 sigma_sel parameters, allows the rho parameter to have a modest impact on the selectivity parameters and all other aspects of the dynamics.

This may be due to my misunderstanding of how the rho parameters work, but I was assuming that they apply across all ages and years of deviations regardless of the how the sigmas are configured.

To Reproduce

Run the models attached above.

Expected behavior

Different rho values impact the model results beyond just the parameter deviations.

Screenshots

No response

Which OS are you seeing the problem on?

No response

Which version of SS3 are you seeing the problem on?

3.30.22.1

Additional Context

No response

Rick-Methot-NOAA commented 3 months ago

I see in echoinput that the determinant for 2D_AR cor: 1 is: 1.839e-30

So, we may need to consult @Haikun on this.

The code is here: https://github.com/nmfs-ost/ss3-source-code/blob/f2df4f49c177e718ad4ac8aa13aabe4adb26102a/SS_prelim.tpl#L1020

I added some extra output and the individual calcs seem correct: fleet: 1 2D_AR rho in prelim for time and age/size 0.5 0.5

i 1991 j 1 m 1991 n 1  j-n 0 pow(a) 1 i-m 0 pow(y) 1
i 1991 j 1 m 1991 n 2  j-n 1 pow(a) 0.5 i-m 0 pow(y) 1
i 1991 j 1 m 1991 n 3  j-n 2 pow(a) 0.25 i-m 0 pow(y) 1
i 1991 j 1 m 1991 n 4  j-n 3 pow(a) 0.125 i-m 0 pow(y) 1
i 1991 j 1 m 1991 n 5  j-n 4 pow(a) 0.0625 i-m 0 pow(y) 1
i 1991 j 1 m 1992 n 1  j-n 0 pow(a) 1 i-m 1 pow(y) 0.5
i 1991 j 1 m 1992 n 2  j-n 1 pow(a) 0.5 i-m 1 pow(y) 0.5
i 1991 j 1 m 1992 n 3  j-n 2 pow(a) 0.25 i-m 1 pow(y) 0.5
i 1991 j 1 m 1992 n 4  j-n 3 pow(a) 0.125 i-m 1 pow(y) 0.5
i 1991 j 1 m 1992 n 5  j-n 4 pow(a) 0.0625 i-m 1 pow(y) 0.5
i 1991 j 1 m 1993 n 1  j-n 0 pow(a) 1 i-m 2 pow(y) 0.25
i 1991 j 1 m 1993 n 2  j-n 1 pow(a) 0.5 i-m 2 pow(y) 0.25
i 1991 j 1 m 1993 n 3  j-n 2 pow(a) 0.25 i-m 2 pow(y) 0.25
i 1991 j 1 m 1993 n 4  j-n 3 pow(a) 0.125 i-m 2 pow(y) 0.25
i 1991 j 1 m 1993 n 5  j-n 4 pow(a) 0.0625 i-m 2 pow(y) 0.25
i 1991 j 1 m 1994 n 1  j-n 0 pow(a) 1 i-m 3 pow(y) 0.125
i 1991 j 1 m 1994 n 2  j-n 1 pow(a) 0.5 i-m 3 pow(y) 0.125
i 1991 j 1 m 1994 n 3  j-n 2 pow(a) 0.25 i-m 3 pow(y) 0.125
i 1991 j 1 m 1994 n 4  j-n 3 pow(a) 0.125 i-m 3 pow(y) 0.125
i 1991 j 1 m 1994 n 5  j-n 4 pow(a) 0.0625 i-m 3 pow(y) 0.125
iantaylor-NOAA commented 3 months ago

@HaikunXu, let us know if you have time to look at the code in https://github.com/nmfs-ost/ss3-source-code/blob/3191eb82b6aabb27abd4e500744af4b69cc2b821/SS_prelim.tpl#L1020 and let us know if you see anything wrong related to the implementation of rho parameters when there are multiple sigma values? This isn't time sensitive, but would be good to resolve eventually.

HaikunXu commented 3 months ago

Hi Ian and Rick,

Sorry for the late reply. I have not tested a case with multiple sigma values before. I don't see any bug in that part of the code (specify correlation matrix) so I guess there may be a bug in where the associated likelihood component is computed. Both the correlation matrix and sigma are used to compute the associated likelihood so I would appreciate if you can point to me where that section is so I can take a look at it.

iantaylor-NOAA commented 3 months ago

@HaikunXu, thanks for taking a look at the code. The likelihood seems to be calculated here: https://github.com/nmfs-ost/ss3-source-code/blob/main/SS_objfunc.tpl#L967-L1013

HaikunXu commented 3 months ago

Thank you @iantaylor-NOAA. I understand why rho has no impact. When using age-specific signamsel, the likelihood is computed based on rho = 0 regardless of rho input (see how rho impacts the likelihood in lines 977 and 982 when there is one sigmasel). Two scenarios (rho = 0 or not) need to be coded for multiple sigmasels between line 988 and 1012. Also, I think there may be a bug in the computation of likelihood for multiple sigmasels: line 999 should be identical to line 980/985

Rick-Methot-NOAA commented 3 months ago

Thank you very much Haikun for looking at that portion of the code. I will make the fix when I return next week unless one of you beats me to it. Now that I look at the objfun code I see that it says: // ignore rho for now; need indexing for inv_cor()

So using rho with age-specific sigmasel may be significant work. Perhaps the indexing problem is also what is causing the determinant for 2D_AR cor: 1 is: 1.839e-30

Rick-Methot-NOAA commented 2 months ago

The age-specific sigmasel code seems at least incomplete and probably incorrect. It was done in ~2018 by me after Haikun and Jim developed the original 2dAR without age-specific sigmasel. Ian tried 2dar for hake but perhaps not in age-specific option, so I cannot find evidence that age-specific ever worked correctly. At this time, I think we should deprecate it. @haikunxu @iantaylor-noaa

HaikunXu commented 2 months ago

For the near term, the most we can do is testing the age-specific option with no rhos (we can specify several identical sigma_sels and compare the likelihood with that from the same model with only one sigma_sels and no rhos; if the code works then the two likelihoods should be identical). So we should deprecate at least the age-specific option with none-0 rhos because the likelihood portion is not coded for none-0 rhos.

I am not 100% sure (I have forgotten most of the linear algebra knowledge I learned in college) but we may need to code the covariance matrix to make it work. I see for one sigma_sel case the likelihood for selex devs is based on inv_cor (inverse of the correlation matrix for selex devs) and signa_sel, which may not work for age-specific sigma_sels case and we may need to directly compute the inverse of the covariance matrix for selex devs. This may not be easy to do in the near term.

Rick-Methot-NOAA commented 2 months ago

Thanks Haikun. I am 100% in agreement with the test that could be done now, and with the likely difficulty in redoing the code.

HaikunXu commented 2 months ago

Based on Ian's hake example, I tested the age-specific sigma_sel option with several identical sigma_sels and found that it provides the same results and likelihoods as using the single sigma_sel option. They can still use age-specific sigma_sel, but for now we need to deprecate using user-specified rhos for this option because the associated likelihood component has not been coded and tested.

Rick-Methot-NOAA commented 2 months ago

I have created a branch and will working on the code and also a revision to the manual. @HaikunXu Since you have the example ready, can you try letting SS3 estimate the age-specific sigmasel parameters to give us more assurance that it is working correctly.

HaikunXu commented 2 months ago

@Rick-Methot-NOAA Sure. The model in which multiple sigma_sels are estimated converged with a positive definite Hessian, so no sign it is not working correctly,

Rick-Methot-NOAA commented 2 months ago

I did some additional experimentation and find that the model's ability to estimate sigmasel is very limited. The estimated value of sigmasel is severely biased towards zero. For a long time this feature was labelled as "experimental". That label was removed without completing and testing the code. Hence, I think it best to: