brianlangseth-NOAA / Spatial-Workshop-SPASAM

Spatially stratified simulation-estimation framework incorporating multistage stock-recruit relationships and incorporating larval IBM outputs

Lack of a plus group showing up in the age data #52

Closed brianlangseth-NOAA closed 1 year ago

brianlangseth-NOAA commented 1 year ago

Jeremy McKenzie asked during our presentation whether there may be a data-reading error, since there does not appear to be a plus group in our age comps. Something to look into.

brianlangseth-NOAA commented 1 year ago

@JDeroba and @AmySchueller-NOAA I've been looking into this error and have a solution, but I'm not confident about it. I'd appreciate your thoughts.

Our age-length key (ALK) is constrained to ages 1-28, so the probabilities that a fish in a given length bin is a certain age only exist for ages 1-28. My workaround is to build the ALK (see issue #8) out to older ages, generate the probabilities, and then sum the probabilities of ages 28 and above to get the probability of the plus group.
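The collapsing step of this workaround can be sketched as follows. This is a minimal illustration in Python rather than the project's R code; the function name `collapse_plus_group`, the row and column layout, and the toy probabilities are all assumptions made for the example.

```python
def collapse_plus_group(alk, ages, plus_age):
    """Sum the columns for ages >= plus_age into a single plus-group column.

    alk  : list of rows, one per length bin; row[j] = P(age == ages[j] | length bin)
    ages : ages matching the columns of alk (hypothetical layout)
    """
    keep = [j for j, a in enumerate(ages) if a < plus_age]
    pool = [j for j, a in enumerate(ages) if a >= plus_age]
    collapsed = []
    for row in alk:
        # Younger ages are kept as-is; older ages are pooled into the last column.
        collapsed.append([row[j] for j in keep] + [sum(row[j] for j in pool)])
    return collapsed


# Toy ALK: 2 length bins, ages 26..30, each row sums to 1 (illustrative numbers).
ages = [26, 27, 28, 29, 30]
alk = [
    [0.50, 0.30, 0.10, 0.06, 0.04],
    [0.10, 0.20, 0.30, 0.25, 0.15],
]
collapsed = collapse_plus_group(alk, ages, plus_age=28)
for row in collapsed:
    print([round(p, 2) for p in row])
```

Because pooling only adds columns together, each row still sums to 1 after the collapse.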

My problem is that the final probability of our plus group depends on the number of ages I extend the ALK out to. Our vonB curve runs to age 28, so I assume the same length at age for ages 28+. This gives a constant probability for an age across our length bins for any age above 28 (see below).

image

As I extend the ages for the ALK out, our plus group gets more and more weight. The figure below shows the change when extending the ages in the ALK by 1, 5, 10, and 20. As the ALK gets extended, the probability that a length along the x-axis gets assigned to age 28+ grows.

image

I feel like I'm missing something obvious here (like multiplying the plus group by some limit) but can't figure it out. I plan to go with expanding the ages in the ALK out by 10 ages, because that is around the range of ages with probabilities for the upper lengths. Do you have another solution?
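A small arithmetic sketch suggests why the plus group has no natural stopping point under the constant-length assumption. It assumes each row of the ALK is normalized across ages with equal weight per age, which the thread does not confirm; `plus_group_share` and the weights 0.9 and 0.1 are hypothetical.

```python
def plus_group_share(younger_weight, age28_weight, n_extra_ages):
    """Share of one length bin assigned to the 28+ group after row normalization.

    Assumes every extended age has the same mean length as age 28, so each
    extra age contributes an identical column with weight `age28_weight`.
    """
    plus = (n_extra_ages + 1) * age28_weight
    return plus / (younger_weight + plus)


# Hypothetical weights for a single length bin (illustrative numbers only):
# ages 1-27 together carry 0.9, age 28 carries 0.1.
younger, age28 = 0.9, 0.1
shares = {k: plus_group_share(younger, age28, k) for k in (0, 1, 5, 10, 20)}
for k, share in shares.items():
    print(f"extend by {k:>2} ages: plus-group share = {share:.3f}")
```

Under these assumptions the share rises with every added age and approaches 1, so any cutoff (1, 5, 10, or 20 ages) is a choice rather than a limit the calculation converges to.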

JDeroba commented 1 year ago

@brianlangseth-NOAA @AmySchueller-NOAA Brian, this took some thought and sleuthing, but I think I solved the problem. The issue is your assumption that all ages >=28 have the same length. If you allow the fish to grow even 0.5 cm between pseudo-ages, the problem resolves itself. Assuming the same length for ages >=28 also seems wrong, looking at the graph and the parameters provided for YFT. Unfortunately, I have not been able to replicate the variant of the vonB they provide in the user guidance (I think the equation is wrong), but we can probably use some other variant to get us close enough and allow length at age to vary more appropriately for those older ages.
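A rough sketch of this suggestion, comparing constant length beyond age 28 against 0.5 cm of growth per pseudo-age. Everything numeric here is an assumption for illustration: the standard von Bertalanffy form stands in for the user-guidance variant, the parameters `L_INF`, `K`, `T0`, and `SD` are hypothetical rather than the YFT values, length error is taken as normal with a constant SD, and the ALK rows are assumed to be normalized with equal weight per age.

```python
import math

# Hypothetical growth parameters; not the YFT values from the user guidance.
L_INF, K, T0 = 135.0, 0.2, 0.0
SD = 6.0           # assumed constant SD of length at age (cm)
LAST_AGE = 28      # last age covered by the growth curve


def mean_length(age, increment):
    """Standard von Bertalanffy up to LAST_AGE, then `increment` cm per extra age."""
    base_age = min(age, LAST_AGE)
    base = L_INF * (1.0 - math.exp(-K * (base_age - T0)))
    return base + increment * max(age - LAST_AGE, 0)


def plus_share(length, n_extra, increment):
    """Row-normalized probability that a fish of `length` is aged LAST_AGE or older."""
    ages = range(1, LAST_AGE + n_extra + 1)
    # Unnormalized normal density; the constant cancels because SD is shared.
    w = {a: math.exp(-0.5 * ((length - mean_length(a, increment)) / SD) ** 2)
         for a in ages}
    return sum(v for a, v in w.items() if a >= LAST_AGE) / sum(w.values())


for n_extra in (0, 1, 5, 10, 20, 40, 80):
    flat = plus_share(130.0, n_extra, increment=0.0)
    grow = plus_share(130.0, n_extra, increment=0.5)
    print(f"+{n_extra:>2} ages  constant length: {flat:.3f}   0.5 cm/age: {grow:.3f}")
```

With these made-up parameters, the constant-length column keeps climbing as ages are added, while the growing-length column levels off once the added ages have mean lengths far above the length being evaluated.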

brianlangseth-NOAA commented 1 year ago

My initial 1-area, 7-fleet model ran in 4 hrs with my changes to the plus group. I've named it "YFT_1area_PlusGroup_test" (for my reference only - not pushed). This differs from YFT_1area_7fleets_12 only in terms of survey_prop and catch_prop.

Comps show that we are indeed getting some plus groups, though maybe too much (see below). I will try the suggestion above from @JDeroba to see if we get something that isn't dominated by the plus group.

image

brianlangseth-NOAA commented 1 year ago

@JDeroba I set up the model with normal error. The issue more or less resolves itself, with the exception of the 140 length bin: the probability of lengths there increases as the number of ages beyond 28 increases. The labels show the final age bin, so they correspond to extending the ages out by 0, 1, 5, 10, and 20 ages, respectively.

image

Lengths in that bin make up at most 40% of the lengths in the comp data, but average 9% among those with a positive entry.

image

Altogether, I'm just going to pick a value to extend the ages by, and that value will be 10. I only add this comment here to document the decision.
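One possible reading of the 140 bin behavior, not confirmed anywhere in this thread, is that it acts as an open-ended top bin collecting all lengths at or above 140 cm. The sketch below compares such a bin with an interior bin under normal error as mean length grows past age 28. The mean length at age 28, the SD, the 0.5 cm increment, and the 5 cm bin width are all hypothetical values chosen for illustration.

```python
import math

SD = 6.0            # assumed constant SD of length at age (cm)
MEAN_AT_28 = 134.5  # hypothetical mean length at age 28 (cm)
INCREMENT = 0.5     # assumed cm of growth per pseudo-age beyond 28


def normal_cdf(x, mean, sd):
    """Normal cumulative distribution function via the complementary error function."""
    return 0.5 * math.erfc(-(x - mean) / (sd * math.sqrt(2.0)))


def bin_prob(lower, upper, mean, sd):
    """P(lower <= length < upper); upper=None means an open-ended top bin."""
    hi = 1.0 if upper is None else normal_cdf(upper, mean, sd)
    return hi - normal_cdf(lower, mean, sd)


extra_ages = (0, 20, 40, 80)
interior = [bin_prob(130.0, 135.0, MEAN_AT_28 + INCREMENT * j, SD) for j in extra_ages]
top = [bin_prob(140.0, None, MEAN_AT_28 + INCREMENT * j, SD) for j in extra_ages]
for j, p_in, p_top in zip(extra_ages, interior, top):
    print(f"age 28+{j:<2}  P(130-135 cm) = {p_in:.4f}   P(>=140 cm) = {p_top:.4f}")
```

If the top bin really is open-ended, each added age contributes almost nothing to interior bins but close to its full probability to the top bin, which would be consistent with only that bin continuing to change as ages are added.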

brianlangseth-NOAA commented 1 year ago

Switching to a normal error distribution around lengths resolves this issue. I added code in commit e9daf11 that adds the normal distribution (and keeps in, but comments out, the lognormal assumption lines). The aggregate comps based on the normal distribution show plus groups, whereas those based on the lognormal distribution don't. Thus a switch to the normal distribution inherently adds plus group "dynamics". Because of this, I don't plan on adding extra ages to create and populate the plus group.

Commit e9daf11 also adds a test alk.R file that provides a secondary way to expand the ages that the ALK is based on, and then combine ages after the final age into a plus group. As I show above, for a normal distribution this only affects the probabilities in the 140 cm length bin, but for lognormal it has a greater effect. This file is added here so we have a record in case we keep lognormal error.