StoXProject / RstoxFDA

Fisheries Dependent Analysis with Rstox
https://stoxproject.github.io/RstoxFDA
GNU Lesser General Public License v3.0

ReportFDASampling: Sampling of fisheries changes with number of grouping variables #97

Closed BergenCalling closed 10 months ago

BergenCalling commented 10 months ago

The number of vessels and catches increases with an increasing number of GroupingVariables. LengthMeasurements and AgeReadings are constant.

edvinf commented 10 months ago

This behaviour is expected for vessels, but not for catches.

ReportFDASampling tabulates the number of unique vessels sampled in each cell (combination of grouping variables). When more grouping variables are introduced, the same vessel may be counted in several different cells. For example, the same vessel may be sampled in several Quarters, and is then counted once per Quarter when Quarter is introduced as a grouping variable.
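To illustrate the counting described above, here is a minimal sketch in plain Python. It is not RstoxFDA code: the records, the vessel names, and the helper `vessels_per_cell` are all hypothetical, and only the idea of counting unique vessels per cell is taken from the explanation.

```python
from collections import defaultdict

# Hypothetical sampling records as (vessel, quarter) pairs.
# These are illustrative only and not RstoxFDA data structures.
samples = [
    ("vessel_A", "Q1"),
    ("vessel_A", "Q2"),
    ("vessel_B", "Q1"),
    ("vessel_B", "Q1"),
]


def vessels_per_cell(records, key):
    """Count unique vessels in each cell; key(record) names the cell."""
    cells = defaultdict(set)
    for record in records:
        cells[key(record)].add(record[0])
    return {cell: len(vessels) for cell, vessels in cells.items()}


# No grouping variables: every record falls in one cell.
no_grouping = vessels_per_cell(samples, key=lambda r: "all")
# Quarter as grouping variable: vessel_A now appears in two cells.
by_quarter = vessels_per_cell(samples, key=lambda r: r[1])

print(sum(no_grouping.values()))  # 2
print(sum(by_quarter.values()))   # 3
```

The per-cell counts are each correct, but their sum grows because vessel_A is unique within Q1 and within Q2 separately.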

In principle, the same may apply to catches if, for instance, Usage is used as a grouping variable: the same catch may be destined for both human consumption and industrial usage. But for the common grouping variables, the total sum of catches should be invariant. The increase you are seeing happens because the current RstoxFDA implementation counts the number of unique StoxBiotic-Stations as the number of catches. It should rather count the number of hauls.
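The difference between counting stations and counting hauls can be sketched with a toy example. This is an illustration under assumptions, not the RstoxFDA implementation: the field names, the `gear` grouping variable, and the helper `total_count` are hypothetical, and haul identifiers are assumed to be globally unique.

```python
# Hypothetical haul records. Station S1 has two hauls with different gear.
hauls = [
    {"station": "S1", "haul": "H1", "gear": "trawl"},
    {"station": "S1", "haul": "H2", "gear": "gillnet"},
    {"station": "S2", "haul": "H3", "gear": "trawl"},
]


def total_count(records, id_field, grouping):
    """Sum over cells of the number of unique ids (id_field) in each cell."""
    seen = set()
    for record in records:
        cell = tuple(record[g] for g in grouping)
        seen.add((cell, record[id_field]))
    # Each distinct (cell, id) pair contributes one to the total.
    return len(seen)


# Counting stations: the total changes when a haul-level variable is added.
print(total_count(hauls, "station", []))        # 2
print(total_count(hauls, "station", ["gear"]))  # 3

# Counting hauls: the total is the same with and without the grouping.
print(total_count(hauls, "haul", []))           # 3
print(total_count(hauls, "haul", ["gear"]))     # 3
```

In this sketch the station total grows only because S1 holds hauls that fall in different cells; the haul total cannot grow as long as each haul belongs to exactly one cell.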

In order to make sure that there is not something more going on, it would be nice if you could confirm that your project contains some hauls that are attributed to the same station (look for warnings like this in baseline: 'There are more than one 'serialnumber' (HaulKey in StoxBioticData) for x out of y 'station'(StationKey in StoxBioticData) in the NMDBiotic data'), or that you are not using 'Usage' or any added column as grouping variable.
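Besides looking for the baseline warning, the check could be done directly on a table of station and haul keys. The following is a minimal sketch in plain Python, assuming the keys have been exported as (station, haul) pairs; the data and the helper `stations_with_multiple_hauls` are hypothetical and not part of RstoxFDA.

```python
from collections import defaultdict

# Hypothetical (StationKey, HaulKey) pairs exported from a project.
haul_table = [
    ("S1", "H1"),
    ("S1", "H2"),
    ("S2", "H3"),
]


def stations_with_multiple_hauls(rows):
    """Return the stations that have more than one distinct haul."""
    hauls_by_station = defaultdict(set)
    for station, haul in rows:
        hauls_by_station[station].add(haul)
    return sorted(s for s, h in hauls_by_station.items() if len(h) > 1)


print(stations_with_multiple_hauls(haul_table))  # ['S1']
```

A non-empty result would be consistent with the station-based counting explaining the changing number of catches.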

I consider this a bug, since a catch should be identified with a haul rather than a station, and I will change the implementation. I will also try to clarify the documentation, making explicit that unique catches and vessels are counted, and rephrase the text about SamplingVariables so that double counting does not appear to happen only when SamplingVariables are added.

BergenCalling commented 10 months ago

I confirm that my project contains some hauls that are attributed to the same station.

From: Edvin Fuglebakk. Sent: Wednesday 23 August 2023 20:31. To: StoXProject/RstoxFDA. Cc: Seim, Silje Elisabeth; Author. Subject: Re: [StoXProject/RstoxFDA] ReportFDASampling: Sampling of fisheries changes with number of grouping variables (Issue #97)
