Open jmcnamee opened 1 year ago
For some CAL data points, the sample size is 0. Will this be a problem?
This shouldn't be possible, but it suggests that the length data are imputed somehow. Easiest thing for us right now is to do a filter on the data frame before writing to the dat file to remove rows where there were actually no samples, but I can check what is being referenced in the composition data. Thank you for flagging this!
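A minimal sketch of that filter, written in Python with plain dictionaries rather than the project's own code. The field names (`year`, `fleet`, `length_bin`, `nsamp`) and the rows are made up for illustration and are not the columns in the actual data frame:

```python
# Hypothetical conditional age-at-length (CAL) rows; field names and values
# are illustrative only, not taken from the repository.
cal_rows = [
    {"year": 2019, "fleet": 1, "length_bin": 20, "nsamp": 12},
    {"year": 2019, "fleet": 1, "length_bin": 22, "nsamp": 0},
    {"year": 2020, "fleet": 1, "length_bin": 20, "nsamp": 7},
]


def drop_unsampled(rows, size_key="nsamp"):
    """Keep only the rows whose sample size is strictly positive."""
    return [row for row in rows if row[size_key] > 0]


kept = drop_unsampled(cal_rows)
dropped = len(cal_rows) - len(kept)
print(f"kept {len(kept)} rows, dropped {dropped} with zero samples")
```

Reporting the number of dropped rows alongside the filter makes it easier to notice if the count of zero-sample rows changes after the composition data are checked.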
On Thu, May 4, 2023 at 2:06 AM Lisa Chong @.***> wrote:
For some CAL data points, the sample size is 0. Will this be a problem?
--
Gavin Fay, Ph.D. Fisheries & Ecosystem management, Ecological modeling, Stock assessment @.*** @gavin_fay www.gavinfay.com
I uploaded the sample sizes for the length comps and updated the data file but there were a few issues I encountered:
I will loop back on the other bullets; I need to look into those a bit more. On your first and last bullets, "LIS" is the CT Long Island Sound trawl survey (a.k.a. fleets 17 and 18), so I think this knocks out two of those questions (bullets 1 and 4).
At some point during the sample size calculations, the recreational discard length compositions were removed from the processing output. @lidach any idea what happened here?
Also, I notice that many of the sample sizes for length comps seem unreasonably large (in the 1000s).
OK, noticing a few errors in the length comp processing code (I think as a result of joining sample sizes to the comps), @gavinfay fixing these.
Still struggling to find the recreational data sample size (or rather, what the column 'SampleSize' represents).
For example, the `rec.ab1.len$SampleSize` values are by cm. Are these the number of fish sampled for each length (not likely, because in some cases they are greater than the N_AB1, which seems infeasible), or are they the number of trips that were sampled that caught fish of these lengths?
If it is the latter (number of samples where fish of these lengths were measured), summing over the lengths within a year/season/area will not give the total number of trips sampled, because individual trips could (and probably did) measure fish of more than one length, i.e. the sum counts those trips more than once.
If it is the former (number of fish measured), then we still don't have the sample size information.
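To make the double-counting concern concrete, here is a small Python sketch with invented trips and lengths (none of these numbers come from the recreational data). It assumes the "latter" interpretation, where the per-length value is the number of trips that measured at least one fish of that length:

```python
from collections import Counter

# Hypothetical trips in a single year/season/area stratum, each with the
# lengths (cm) of the fish measured on that trip. Invented for illustration.
trips = {
    "trip_1": [30, 30, 32],
    "trip_2": [30, 34],
    "trip_3": [32, 34, 36],
}

# Number of trips that measured at least one fish of each length.
trips_per_length = Counter()
for lengths in trips.values():
    trips_per_length.update(set(lengths))  # a trip counts once per length

summed_over_lengths = sum(trips_per_length.values())
distinct_trips = len(trips)
fish_measured = sum(len(lengths) for lengths in trips.values())

print(summed_over_lengths)  # 7: trips counted once for every length they measured
print(distinct_trips)       # 3: the actual number of trips sampled
print(fish_measured)        # 8: the number of fish measured
```

In this toy case the sum over lengths (7) matches neither the number of trips (3) nor the number of fish (8), which is why a per-stratum count of trips would need to come from the trip-level records rather than from the per-length column.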
@jmcnamee & @lidach can you provide some illumination?
Alternatively, is there an easy way of downloading just the sample size information? (I.e., it doesn't have to be connected to the data; we just need the sample sizes by year/season/area.)
@gavinfay I'm not sure how far you got, but it seems I made some summing mistakes for the length comps. I'm also not sure about `rec.ab1.len`; I can't find any information on it in the data GitHub repository.
Hi @gavinfay and @lidach. On this topic (rec length sample sizes), I am communicating with Sam, and I think the solution is to get the number of intercepts (the closest analog to trips); the value is currently the number of fish sampled. I'll follow up once I get that data.
@jmcnamee @lidach Was this ever resolved?
Asking because there are a lot of garbage sample size values (mainly for discards) of less than 1, though in some cases the data look quite good.
Additionally, there are a lot of comp data (again the discards) that have no sample size information (and the placeholder N of 25 is being applied).
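A rough Python sketch of the kind of audit described here: list the comps whose sample size is below 1, list those with no sample size at all, and show where the placeholder N of 25 would be applied. The keys and values are hypothetical, and `None` is used as an assumed marker for "no sample size information":

```python
PLACEHOLDER_N = 25  # placeholder sample size mentioned in this thread

# Hypothetical (year, fleet) -> sample size; None means no information.
sample_sizes = {
    (2018, "rec_discard"): 0.4,
    (2019, "rec_discard"): None,
    (2020, "rec_discard"): 38.0,
}


def audit(sizes, placeholder=PLACEHOLDER_N):
    """Return keys with N < 1, keys with no N, and sizes with gaps filled."""
    below_one = [k for k, n in sizes.items() if n is not None and n < 1]
    missing = [k for k, n in sizes.items() if n is None]
    filled = {k: (placeholder if n is None else n) for k, n in sizes.items()}
    return below_one, missing, filled


below_one, missing, filled = audit(sample_sizes)
print("N < 1:", below_one)
print("no N, placeholder applied:", missing)
```

Keeping the two lists separate distinguishes values that are present but implausible from values that were never provided, since those two cases probably need different fixes.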
The sample sizes are being calculated from a variable called `SampleSizeTrip` in the objects `rec.b2.len` and `rec.ab1.len`, but there is no obvious way that I can find of knowing what that variable represents.
I am going through these because today I discovered a very large error in the commercial discard length comp calculations and want to check everything else.
OK, for the recreational length comp sample sizes it's pretty straightforward: