ices-tools-dev / RDBES

The public repository of the RDBES development.

Sampling BV's across hauls within trip #21

Open KirstenBirchHaakansson opened 4 years ago

KirstenBirchHaakansson commented 4 years ago

A quite common sampling strategy for biological variables, e.g. age, is to sample a specific number of fish per length class encountered throughout the trip. This amounts to a length-stratified sampling strategy per trip, e.g. filling up an ALK for a trip. The approach is often used in the international trawl surveys as well. Example: on the first haul, e.g. 5 fish per length class are collected for aging. On the following hauls, fish are collected only for length classes with <5 fish; this continues for each length class encountered during the trip until 5 fish have been collected within the trip across hauls.
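The trip-level quota described above can be sketched in Python as follows. This is only an illustration: the function name `select_for_aging`, the data layout (a list of hauls, each a list of fish length classes) and the quota used in the example are assumptions, not part of the RDBES.

```python
from collections import defaultdict


def select_for_aging(hauls, quota=5):
    """Pick fish for aging haul by haul until each length class has `quota` fish.

    `hauls` is a list of hauls; each haul is a list of length classes (one
    entry per fish, in the order the fish are encountered). Returns a list
    with, for each haul, the length classes of the fish taken for aging.
    """
    collected = defaultdict(int)  # fish taken so far in the trip, per length class
    selected_per_haul = []
    for haul in hauls:
        selected = []
        for length_class in haul:
            # Only take the fish if the trip-level quota is not yet filled.
            if collected[length_class] < quota:
                collected[length_class] += 1
                selected.append(length_class)
        selected_per_haul.append(selected)
    return selected_per_haul


# Hypothetical trip with three hauls and a quota of 2 fish per length class.
trip = [[20, 20, 20, 21], [20, 21, 21, 22], [21, 22, 22, 22]]
result = select_for_aging(trip, quota=2)
print(result)  # [[20, 20, 21], [21, 22], [22]]
```

Later hauls contribute fish only for length classes that are still below the quota, so the aged fish from a given haul do not reflect the length composition of that haul.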

Two variants of this exist: 1) The haul each individual fish comes from is recorded and stored, so it is possible to link the individual fish to a specific haul. 2) The haul each individual fish comes from is not recorded and stored, so it is not possible to link the individual fish to a specific haul.

Clear guidance should be developed for these cases.

We already have a FAQ for 2), but it needs to be reviewed

“Question: If in an onboard trip the lengths are collected at haul level and the ages sampled at trip level, how can this information be entered in the RDBES? Answer: In this case, the FO table should contain the rows of individual hauls (with FOaggregationLevel = 'H') and an extra row for the trip (FOaggregationLevel = 'T'). The "H" rows will be connected to a SA row with lower hierarchy B (lengths). The "T" row will be connected to a SA row with lower hierarchy C (ages). In the SA table the weights should only be present in the haul samples.”

For 1), different guidance has been given at WKRDB-POP2:

  1. The first haul is in principle a TRUE case of lower hierarchy A, where weights, numbers and the design variables should be filled. The remaining hauls are a kind of quota sampling under hierarchy A, with a lot of BVnumberSampled = 0, BVSampled = No and BVreasonNotSampled = Quota Reached for all the lengths where there are already enough ages.
  2. Record 2 samples for each haul: 1 with lower hierarchy B, and 1 with lower hierarchy C.
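
Option 1 can be illustrated with a small Python sketch. The names BVnumberSampled, BVSampled and BVreasonNotSampled and the values "No" and "Quota Reached" are taken from the text above; the "Yes" value, the `lengthClass` key, the function `bv_rows_for_haul` and the quota of 2 are assumptions made for illustration and are not the actual RDBES format.

```python
from collections import Counter


def bv_rows_for_haul(haul_lengths, aged_so_far, quota=5):
    """Summarise one haul per length class under a trip-level quota.

    `aged_so_far` maps length class -> fish already taken for aging in
    earlier hauls of the trip; it is updated in place.
    """
    rows = []
    for length_class, n_in_haul in sorted(Counter(haul_lengths).items()):
        remaining = max(quota - aged_so_far.get(length_class, 0), 0)
        n_sampled = min(n_in_haul, remaining)
        rows.append({
            "lengthClass": length_class,  # hypothetical key
            "BVnumberSampled": n_sampled,
            "BVSampled": "Yes" if n_sampled else "No",  # "Yes" is assumed
            "BVreasonNotSampled": None if n_sampled else "Quota Reached",
        })
        aged_so_far[length_class] = aged_so_far.get(length_class, 0) + n_sampled
    return rows


aged = {}
first_haul = bv_rows_for_haul([20, 20, 21], aged, quota=2)
second_haul = bv_rows_for_haul([20, 21, 22], aged, quota=2)
for row in second_haul:
    print(row)
```

In the second haul, length class 20 has already reached the quota, so its row carries BVnumberSampled = 0 and the reason "Quota Reached", while classes 21 and 22 are still sampled.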
KirstenBirchHaakansson commented 4 years ago

Comment from Nuno

I like the two examples you mentioned below as FAQ. They are distinct. I would prefer not to set rules on 0 weights ahead of upload; better to do a find and replace ahead of or during estimation, if we can find a proper identifier of those cases.

a couple of extra thoughts:

nmprista commented 4 years ago

Below is the FAQ edit proposal based on the last Skype discussion.

Question: In an onboard trip, the length frequencies may be collected at haul level while the sampling goals, in terms of the number of ages that observers aim to collect per length class, are set at trip level. How should this information be entered in the RDBES?

Answer: The situation described is one of quota sampling for ages at trip level and requires careful documentation so it can be correctly signaled to those carrying out estimation. Two variants are known to occur:

  1. The information about which hauls the individual fish are coming from is not recorded and stored, so it is not possible to link the individual fish to a specific haul
  2. The information about which hauls the individual fish are coming from is recorded and stored, so it is possible to link the individual fish to a specific haul

In the case of situation 1, the FO table should contain the rows of individual hauls (with FOaggregationLevel = 'H') and an extra row for the trip (FOaggregationLevel = 'T'). The "H" rows will be connected to a SA row with lower hierarchy B (= only length frequency) where the length frequencies collected in each haul can be recorded. The "T" row will be connected to a SA row with lower hierarchy C (= only bio sampling). In the SA table there will be weights in all samples, corresponding to the individuals sampled per haul for length frequency (hierarchy B) and in the trip for biological sampling (hierarchy C). It is important that in the SA table the selection method used for the biological samples (hierarchy C) is "quota sampling" so estimators know that quota sampling took place at trip level.
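The layout proposed for situation 1 might be pictured with the simplified Python sketch below. Only FOaggregationLevel, the hierarchy letters B and C, and the "quota sampling" label come from the text; the keys `FOid`, `SAid`, `lowerHierarchy` and `selectionMethod`, and the helper `check_situation_1`, are hypothetical and do not reproduce the real RDBES tables or code lists.

```python
# Hypothetical, simplified records for one trip with two hauls.
fo_rows = [
    {"FOid": 1, "FOaggregationLevel": "H"},
    {"FOid": 2, "FOaggregationLevel": "H"},
    {"FOid": 3, "FOaggregationLevel": "T"},  # extra row for the whole trip
]
sa_rows = [
    {"SAid": 10, "FOid": 1, "lowerHierarchy": "B"},  # lengths, haul 1
    {"SAid": 11, "FOid": 2, "lowerHierarchy": "B"},  # lengths, haul 2
    {"SAid": 12, "FOid": 3, "lowerHierarchy": "C",   # ages, trip level
     "selectionMethod": "quota sampling"},
]


def check_situation_1(fo_rows, sa_rows):
    """Return a list of problems; an empty list means the layout matches the text."""
    level = {fo["FOid"]: fo["FOaggregationLevel"] for fo in fo_rows}
    problems = []
    for sa in sa_rows:
        # "H" rows carry length samples (B), the "T" row carries the age sample (C).
        expected = "B" if level[sa["FOid"]] == "H" else "C"
        if sa["lowerHierarchy"] != expected:
            problems.append(f"SA {sa['SAid']}: expected lower hierarchy {expected}")
        if expected == "C" and sa.get("selectionMethod") != "quota sampling":
            problems.append(f"SA {sa['SAid']}: selection method should flag quota sampling")
    return problems


print(check_situation_1(fo_rows, sa_rows))  # []
```

A check of this kind is one way an estimator could confirm that the trip-level age sample is identifiable as quota sampling before using it.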

In the case of situation 2, the FO table should only contain the rows of individual hauls (with FOaggregationLevel = 'H'). Those "H" rows will be connected to a SA row with lower hierarchy B (= only length frequency) where the length frequencies collected in each haul are to be recorded. With regard to ages, two things can happen. In both cases hierarchy C is used, the selection method is set to "quota sampling", and the weights correspond to the individuals sampled:

  2a. The specimens sampled were taken from the same sample as the length frequency: one subsample is placed under the length-frequency sample.
  2b. The specimens sampled were taken from an additional sample other than the sample used for length frequency: one additional sample is opened.
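
The difference between 2a and 2b can be sketched in Python as below. The keys `SAid`, `FOid`, `parentSAid`, `lowerHierarchy` and `selectionMethod`, and the function `sa_rows_for_haul`, are hypothetical names chosen for the example; the sketch only shows whether the age sample hangs under the length-frequency sample or stands beside it.

```python
def sa_rows_for_haul(fo_id, variant):
    """Build simplified SA rows for one haul under variant "2a" or "2b"."""
    length_sample = {
        "SAid": f"{fo_id}-LF",
        "FOid": fo_id,
        "parentSAid": None,
        "lowerHierarchy": "B",  # only length frequency
    }
    age_sample = {
        "SAid": f"{fo_id}-AGE",
        "FOid": fo_id,
        "lowerHierarchy": "C",  # only biological sampling
        "selectionMethod": "quota sampling",
    }
    if variant == "2a":
        # Subsample placed under the length-frequency sample.
        age_sample["parentSAid"] = length_sample["SAid"]
    elif variant == "2b":
        # Additional, separate sample taken from the same haul.
        age_sample["parentSAid"] = None
    else:
        raise ValueError(f"unknown variant: {variant!r}")
    return [length_sample, age_sample]


for row in sa_rows_for_haul(1, "2a"):
    print(row)
```

In both variants the age sample stays linked to its haul through `FOid`, which is what distinguishes situation 2 from situation 1.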

HenrikK-N commented 4 years ago

Sampling in this way is not preferred; design-based sampling is suggested if possible. Preferred sampling schemes are described at the following link: https://academic.oup.com/icesjms/article/77/3/859/5770878 The situation described is one of quota sampling for ages at trip level. This type of non-probabilistic sampling requires careful documentation, so it can be correctly signaled to those carrying out estimation. Two variants are known to occur:

nmprista commented 4 years ago

@HenrikK-N @KirstenBirchHaakansson I was assigned this but do not notice any changes relative to what was discussed in the meeting on 19/08. Were there any? Is it ready to incorporate?

HenrikK-N commented 4 years ago

Kirsten raised the issue originally and did not participate on 19/8, when the core group (those present) agreed to the text. At the meeting on 26/8 there were no changes, but Kirsten would like to read and think the text through before it is updated in the documentation. So please contact Kirsten; when both of you agree, please insert the text.

KirstenBirchHaakansson commented 4 years ago

Will read it this week – just end up being side-tracked all the time


KirstenBirchHaakansson commented 4 years ago

I'm ok with the text, but I think we should wait for the selectionMethod work before closing / migrating it to a FAQ.

If the solution is needed for the data call and a solution is urgent, then I think we need to revisit the issue, when the selectionMethod work is done.

KirstenBirchHaakansson commented 3 years ago

This one is already in the documentation as FAQ 20, but at the selection method meeting yesterday we agreed that we don't want any weights for these non-representative samples. This is also what is stated in FAQ 7. Further, the text needs to be updated with the new codes for selection methods.

So an update of FAQ 20 is needed!

KirstenBirchHaakansson commented 2 years ago

Updated FAQ 20 - if you agree, then this issue can be closed.