Philipp-Neubauer / FirstAssessment

Analysing the time to first assessment of fish stocks

QAQC of final dataset #10

Open Philipp-Neubauer opened 7 years ago

Philipp-Neubauer commented 7 years ago

Check that:

mcmelnychuk commented 7 years ago

Phil, let me know what I can do to help with the QAQC.

mcmelnychuk commented 7 years ago

There were a few stocks that have coastwide distributions, most on the east coast and 1 on the west coast (halibut). Because of our 4-region categorization, these were assigned to the best matching region. This was previously done in some files but I just realized not in all files. These region labels "USEC" have been updated to "USEC-NE" and the single "USWC" label has been updated to "USWC-AK" in the Dropbox files.

mcmelnychuk commented 7 years ago

sorry didn't mean to close that

Philipp-Neubauer commented 7 years ago

Mike; could you check if these stocks are assigned correctly in the final dataset that goes into the analysis? I have checked WC halibut but am not sure which ones you are referring to on the EC.

I just pushed a file called dataset.csv for QAQC; let me know if you need any other columns to check...

Phil

Philipp-Neubauer commented 7 years ago

I've noticed that a number of stocks in the final table have an assessed and an un-assessed portion. For example, NE Pollock has an assessed portion in GeBank/GoMaine Atlantic pollock, but it also has an un-assessed portion with landings in Virginia and Maryland. Since the landings in these states aren't assigned to the GeBank/GoMaine stock in the management linkage sheet, they get assigned to a separate, un-assessed stock.

Same happens for sAtl Black Sea Bass, where landings in Florida WestCoast and Louisiana do not get assigned to the sAtl stock, but are relatively substantial (Florida).

I can keep going through to flag situations like that. Do we want to go back and check the assessments to make sure the assessment does not include these portions of the stock? In the Pollock case the landings are so small, they are probably just ignored, but for Black Sea Bass the landings are reasonably high...

Phil

mcmelnychuk commented 7 years ago

hi Phil,

I can confirm that all the region designations in dataset.csv look correct to me. The region labels that I updated earlier today were in the Dropbox files:

- crosref.csv
- crossref.csv
- database linkage - landings management.xlsx

They were for the following stocks:

- USEC Atlantic menhaden
- USEC Atlantic surfclam
- USEC bluefish
- USEC ocean quahog
- USEC spiny dogfish
- USEC striped bass
- USEC tautog
- Pacific halibut (coastwide)

The values that I updated in those files above match the values in dataset.csv for these stocks.

Mike

mcmelnychuk commented 7 years ago

hi Phil,

That was intentional in how we mapped landings-by-state onto stock distribution areas. We (Jeannette) previously went through the assessments and identified the states that would be covered by the areas of distribution (as defined in assessments). All the states that had landings but were not already covered by an area of distribution were then considered unassessed. The pitfall in this approach is that a species could have been caught within its assessment-defined area of distribution, but landed outside of that area, so the "unassessed" landings could in fact originate from an assessed stock.

In other cases, there are 2+ stocks within a region, and although one may be assessed the other(s) may not be yet. This would likely be the case for Gulf of Mexico black sea bass.

I'm pretty confident that Jeannette would have correctly identified the states covered by the assessments. She was pretty flexible in considering the states that could reasonably be covered by the stock areas of distribution. So I'm not sure we need to review the assessments again, however:

1) that was a couple of years ago already, and the most recent assessment could have a different area of distribution defined (probably not too common)

2) there could be cases where there are relatively few landings in states outside the main area of distribution (like the NE pollock example), and those shouldn't really be considered as separate, unassessed stocks.

We could come up with some kind of filter to exclude unassessed "stocks" that aren't really stocks. Maybe something like: if the "unassessed" species:region landings (i.e. sum of unassessed species:state landings within a region) are less than 10% of the assessed species:region landings (i.e. sum of stocks of the same species within the region), then don't consider those as "unassessed stocks" because the landings are trivial. The 10% value is arbitrary, but the filter should exclude cases like NE pollock and keep cases like black sea bass, which is probably what we want. This will increase the % of assessed stocks in Fig. 1b.
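For illustration only, the 10% rule could be sketched in base R roughly as follows. The data frame, its column names (`species`, `region`, `state`, `assessed`, `tonnes`) and all values are invented for the example and are not taken from dataset.csv.

```r
# Invented landings by species:state, flagged by whether the state is
# covered by an assessed stock. All values are hypothetical.
landings <- data.frame(
  species  = c("species_a", "species_a", "species_a", "species_b", "species_b"),
  region   = c("USEC-NE", "USEC-NE", "USEC-NE", "USEC-SE", "USEC-SE"),
  state    = c("s1", "s2", "s3", "s4", "s5"),
  assessed = c(TRUE, FALSE, FALSE, TRUE, FALSE),
  tonnes   = c(5000, 2, 1, 400, 150)
)

# Sum landings per species:region, separately for assessed and unassessed.
totals <- aggregate(tonnes ~ species + region + assessed,
                    data = landings, FUN = sum)

# Put the assessed and unassessed totals side by side.
# Note: this is an inner join, so species with no assessed stock in the
# region drop out here and would simply stay unassessed.
wide <- merge(
  totals[totals$assessed,  c("species", "region", "tonnes")],
  totals[!totals$assessed, c("species", "region", "tonnes")],
  by = c("species", "region"),
  suffixes = c("_assessed", "_unassessed")
)

threshold <- 0.10  # arbitrary cut-off, as discussed above
wide$ratio <- wide$tonnes_unassessed / wide$tonnes_assessed
wide$keep_unassessed <- wide$ratio >= threshold
print(wide)
```

In this toy case the unassessed portion of `species_a` falls below the threshold and would be dropped, while that of `species_b` stays in as an unassessed stock. Changing `threshold` and comparing the resulting lists would be one way to settle on a value.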

Related to this, our assistant Nicole is still going through the list of species:state entities with landings data to confirm that there are not in fact assessments for these "stocks" that we've considered as unassessed. (Ray had her working on something else the last week or so, but she'll be back on to checking this list next week).

Mike

mcmelnychuk commented 7 years ago

In checking over dataset.csv, I noticed the scarcity of assessed stocks with "bathy-" assignments for habitat type.

Phil, I think I remember you mentioning that you used the "habitat_MM" variable instead of the "habitat_FB" variable, so I think I know what happened. In aggregating those FishBase categories, I assigned FB "bathydemersal" to MM "demersal" and FB "bathypelagic" to MM "pelagic". As you added on the habitat category "bathy-", you assigned that for unassessed stocks, but this information was lost for assessed stocks because it was hidden behind either "demersal" or "pelagic" groups in "habitat_MM". Does that make sense?

The aggregation mappings I originally used were:

demersal (MM) = demersal (FB), bathydemersal (FB)

benthopelagic (MM) = benthopelagic (FB), as well as manual re-assignment of the following stocks (original FB category shown):

- GeBank/GoMaine Atlantic pollock: demersal (FB)
- nGeBank/GoMaine silver hake: demersal (FB)
- sGeBank/midAtl silver hake: demersal (FB)
- USEC weakfish: demersal (FB)
- USNE offshore hake: bathydemersal (FB)
- GoMex vermilion snapper: demersal (FB)
- sAtl vermilion snapper: demersal (FB)
- USWC/BC Pacific hake: pelagic-oceanic (FB)

pelagic (MM) = pelagic (FB), pelagic-neritic (FB), pelagic-oceanic (FB), as well as manual re-assignment of:

- GeBank/GoMaine Atlantic herring TRAC: benthopelagic (FB)

reef (MM) = reef-associated (FB)

benthic (MM) = all benthic invertebrates

I'm totally fine with going with something other than the "habitat_MM" aggregations, but if you add on "bathy-" those aggregations should be reverted for bathydemersal and bathypelagic groups.
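As a rough sketch of what reverting the aggregation could look like, here are the mappings above written as base R lookup vectors. The object and function names (`mm_lookup`, `bathy_lookup`, `aggregate_habitat`) are made up for this example, the manual per-stock re-assignments are left out, and the "benthic" group is omitted because it is not derived from an FB label.

```r
# Original MM aggregation of FishBase (FB) habitat labels:
# the two bathy- groups are folded into demersal and pelagic.
mm_lookup <- c(
  "demersal"        = "demersal",
  "bathydemersal"   = "demersal",
  "benthopelagic"   = "benthopelagic",
  "pelagic"         = "pelagic",
  "pelagic-neritic" = "pelagic",
  "pelagic-oceanic" = "pelagic",
  "bathypelagic"    = "pelagic",
  "reef-associated" = "reef"
)

# Variant with a separate "bathy-" group: same lookup, but the two
# bathy- mappings are reverted so the information is not hidden.
bathy_lookup <- mm_lookup
bathy_lookup[c("bathydemersal", "bathypelagic")] <- "bathy-"

# Map a vector of FB labels to aggregated groups; unknown labels give NA.
aggregate_habitat <- function(habitat_fb, lookup) {
  unname(lookup[habitat_fb])
}

fb <- c("demersal", "bathydemersal", "bathypelagic", "pelagic-oceanic")
print(data.frame(
  fb         = fb,
  mm         = aggregate_habitat(fb, mm_lookup),
  with_bathy = aggregate_habitat(fb, bathy_lookup)
))
```

Applying the same lookup to assessed and unassessed stocks alike would avoid the asymmetry described above, where "bathy-" only showed up for unassessed stocks.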

Do you want me to just update dataset.csv, or do you want to code this in somewhere?

Also, the FB categorizations I used are several years old and may have been updated since then - not sure.

Mike

Philipp-Neubauer commented 7 years ago

Thanks Mike -

Yes, that is almost surely what happened. Not sure what the best option is for the habitat - seeing that I pull it straight out of FB, the most straightforward fix would be to change all the references to habitat_MM to habitat_FB. We can then add the manual over-writes that you mention in your email above. Does that sound ok? It would be best to do that in the code rather than in the dataset itself - as that is the product of the code, and I'd have to re-run stuff anyway.

For the species:region combos, I think your suggestion of some threshold for 'minor' stocks seems good. There is another landings based threshold in the data prep script at 10t landed in at least one year. But that comes prior to assigning to stocks, so the likes of pollock slipped through (i.e., lots of pollock landed in the NE, but split among a major stock and trivial landings in the southern part of USEC-NE). I can probably just move that threshold to make it stock specific, and then set it at some number (e.g., 5t landed in some year). This may add/remove a few lines from our dataset, but it seems like a more consistent thing to do - I think keeping some very minor landings (e.g., pollock <1t) doesn't seem right.
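A minimal sketch of the stock-specific version of that threshold, assuming a long table with one row per stock and year. The column names, the stock labels and the tonnages are placeholders, and 5t is just the example value mentioned above.

```r
# Invented annual landings per stock (tonnes), after assignment to stocks.
landings <- data.frame(
  stock  = c("stock_major", "stock_major", "stock_minor", "stock_minor"),
  year   = c(2000, 2001, 2000, 2001),
  tonnes = c(1200, 900, 0.4, 0.8)
)

min_tonnes <- 5  # keep stocks with at least this much landed in some year

# Peak annual landings per stock, then keep stocks that reach the threshold.
peak <- tapply(landings$tonnes, landings$stock, max)
kept_stocks <- names(peak)[peak >= min_tonnes]
filtered <- landings[landings$stock %in% kept_stocks, ]
print(filtered)
```

Because the maximum is taken per stock rather than per species, a trivial portion of an otherwise heavily landed species no longer passes the filter.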

Phil

mcmelnychuk commented 7 years ago

Yes, that sounds totally fine. It's also fine to ignore those manual overrides... that makes for a simpler explanation. The changes for east coast stocks below (from demersal to benthopelagic) were recommended by Jason Link years ago, and the changes for Pacific hake to benthopelagic and Atlantic herring to pelagic seemed reasonable at the time, but maybe for repeatability we should just go with current FishBase categories, and then from there aggregate as desired.

Sounds good to me to exclude the trivial unassessed species:state combinations that when aggregated into unassessed species:region combinations are still trivial. This would change the focus of unassessed from 'anything landed' to "could hypothetically be assessed but isn't", which seems reasonable. If you want to try a couple different thresholds, feel free to pass over the resulting lists and I can look for differences among the lists. That could help to decide on a reasonable threshold.

Mike

mcmelnychuk commented 7 years ago

While adding to the Discussion paragraph for cephalopods, I noticed that this stock is absent from the "dataset.csv" file:

USNE longfin inshore squid Loligo pealeii (recently changed to Doryteuthis (Amerigo) pealeii)

mcmelnychuk commented 7 years ago

I did a more thorough comparison of 'dataset.csv' with the input dataset (v8) and found a few more discrepancies.

1) stocks that were in the v8 dataset (in Dropbox folder) but are not in "dataset.csv" (Github):

| Stock name | Region | Latin name_assessment | Family | Year of first stock assessment |
|---|---|---|---|---|
| GoMaine northern shrimp | USEC-NE | Pandalus borealis | Pandalidae | 1997 |
| USNE barndoor skate | USEC-NE | Dipturus laevis | Rajidae | only relative indices |
| USNE longfin inshore squid | USEC-NE | Loligo pealeii | Loliginidae | 1976 |
| USNE winter skate | USEC-NE | Leucoraja ocellata | Rajidae | only relative indices |
| USSE scalloped hammerhead shark | USEC-SE | Sphyrna lewini | Sphyrnidae | 2009 |
| GoMex king mackerel | USEC-SE-GoMex | Scomberomorus cavalla | Scombridae | 1983 |
| sAtl king mackerel | USEC-SE-sAtl | Scomberomorus cavalla | Scombridae | 1983 |
| USWC black rockfish (northern) | USWC-48 | Sebastes melanops | Sebastidae | 1994 |
| USWC greenstriped rockfish | USWC-48 | Sebastes elongatus | Sebastidae | 2009 |
| USWC kelp greenling (OR) | USWC-48 | Hexagrammos decagrammus | Hexagrammidae | 2005 |
| USWC longspine thornyhead | USWC-48 | Sebastolobus altivelis | Sebastidae | 1990 |
| USWC/BC Pacific hake | USWC-48 | Merluccius productus | Merlucciidae | 1982 |
| AI golden king crab | USWC-AK | Lithodes aequispina | Lithodidae | minimal information |
| BSAI northern rock sole | USWC-AK | Lepidopsetta polyxystra | Pleuronectidae | 1992 |
| GOA shortspine thornyhead | USWC-AK | Sebastolobus alascanus | Sebastidae | 1995 |
| SE Alaska geoduck | USWC-AK | Panopea generosa | Hiatellidae | 1985 |
| SE Alaska sea cucumber | USWC-AK | Parastichopus californicus | Stichopodidae | 1990 |
| SE Alaska spot shrimp | USWC-AK | Pandalus platyceros | Pandalidae | 1996 |
| Aurora rockfish - Pacific Coast | USWC-48 | Sebastes aurora | Sebastidae | 2013 |
| Sharpchin rockfish - Pacific Coast | USWC-48 | Sebastes zacentrus | Sebastidae | 2013 |
| Stripetail rockfish - Pacific Coast | USWC-48 | Sebastes saxicola | Sebastidae | 2013 |

2) one assessed stock from the v8 dataset that is in "dataset.csv" but shows as unassessed:

| Stock name | Region | Latin name_assessment | Family | Year of first stock assessment |
|---|---|---|---|---|
| USWC shortbelly rockfish | USWC-48 | Sebastes jordani | Sebastidae | 2007 |

In addition, there were some stocks in "dataset.csv" that should not be part of the analysis (e.g. oceanic sharks), but as long as "dataset.csv" is a pre-filtering list of stocks, this is not a problem.

I had a look and don't see any pattern hinting at why the stocks in (1) were dropped at some point, but I'm happy to help sort it out.
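A comparison of this kind could be scripted along the following lines. This is only a sketch: the two small data frames stand in for the v8 input and dataset.csv, and the column names and stock labels are assumptions, not the real file headers.

```r
# Stand-ins for the v8 input (Dropbox) and dataset.csv (GitHub).
# An NA first-assessment year means the stock shows as unassessed.
v8 <- data.frame(
  stock            = c("stock_a", "stock_b", "stock_c"),
  first_assessment = c(1997, 1983, 2007)
)
final <- data.frame(
  stock            = c("stock_a", "stock_c", "stock_d"),
  first_assessment = c(1997, NA, NA)
)

# 1) Stocks present in v8 but absent from the final dataset.
missing_from_final <- setdiff(v8$stock, final$stock)

# 2) Stocks in both, assessed in v8 but showing as unassessed in the final.
both <- merge(v8, final, by = "stock", suffixes = c("_v8", "_final"))
status_mismatch <- both$stock[!is.na(both$first_assessment_v8) &
                                is.na(both$first_assessment_final)]

print(missing_from_final)
print(status_mismatch)
```

Writing both vectors out as a small csv after each run would make it easier to see when a stock drops out between versions.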

mcmelnychuk commented 7 years ago

Note that Pacific hake is on that list... if this stock was accidentally excluded, adding it back might bring the % assessed in the total regional landings (Fig. 1c) up to near the % of other regions.

Philipp-Neubauer commented 7 years ago

Hi Mike;

d'oh.

There are a number of issues, actually:

  1. The version of dataset.csv never got updated since you last went through, so does not reflect the most recent match. I just updated the dataset.csv on github. Some issues you highlighted do not appear in that dataset - e.g., Pacific hake is in there, but USWC shortbelly rockfish is NOT in there, along with a few others on your list, the reason for that being:
  2. Most of the ones on your list are filtered out (rightly or wrongly) because it turns out there were either no landings in the landings DB prior to the first assessment year (e.g., GoMex king mackerel assessed 1983, landings from 1994) OR there were no landings immediately prior to or in the year of first assessment. The reason for the latter is, at times, just seemingly haphazard aggregating in the landings: Longfin and shortfin squid, for example, appear in 1971 and then again in 1977. In between, from 1972 to 1976 (the assessment year), there are only 'squid'. So we need to have some rule to say that either we do not use these stocks as we don't have any info around the period of assessments OR we keep these and use data post-assessment to fill in the gaps. I think the second option is possibly OK given that prices and landings probably do not change too much post-assessment (though if a fishery closure/rebuilding ensues, that might be a different story) BUT more generally, I am a bit worried about the landings DB not being all that complete - for quite a few of these stocks, there were landings for only a few recent years - for instance, Alaska sea-cucumber seems to have had an assessment in 1990 but no landings until 2012? Based on that, I would be inclined to not use these stocks...

```r
full.tab %>%
  filter(grepl('Alaska sea cucumber', stock)) %>%
  select(year, Year.of.first.stock.assessment)
```

```
# A tibble: 2 × 2
   year Year.of.first.stock.assessment
1  2012                           1990
2  2013                           1990
```

  3. Oceanic sharks: I thought they were filtered out, but now realise that with the more recent grooming changes they do appear in the dataset - we need to can these. Seems like a simple filter on 'Shark' & 'pelagic' should do the job.

Once we've agreed on how to handle No 2., I will run the model again without pelagic sharks and with or without the troublemaking stocks. Like I said, I would suggest to just leave the stocks with no data prior to the assessment out on the basis that we don't know why they were assessed.

Once I've updated dataset, I will put out a csv as part of the results that shows all the stocks that are NOT included for double checking.

Sorry for the confusion...not always straightforward to sensibly link the landings DB to the more researched management DB it seems...

Phil

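To make the first exclusion rule in No 2. concrete, here is a rough base R sketch that flags stocks whose earliest landings come after the first assessment year. The table layout and the stock labels are invented, and the second case (a gap immediately before the assessment year) is not handled here.

```r
# Invented landings years per stock, with the first assessment year
# repeated on every row of that stock.
landings <- data.frame(
  stock            = c("stock_x", "stock_x", "stock_y", "stock_y", "stock_y"),
  year             = c(2012, 2013, 1980, 1981, 1985),
  first_assessment = c(1990, 1990, 1982, 1982, 1982)
)

# Earliest year with landings and the first assessment year, per stock.
first_landing <- tapply(landings$year, landings$stock, min)
assess_year   <- tapply(landings$first_assessment, landings$stock, min)

# A stock has usable pre-assessment data only if some landings
# were recorded before the first assessment year.
has_prior_landings <- first_landing < assess_year
excluded <- names(has_prior_landings)[!has_prior_landings]
print(excluded)
```

The `excluded` vector is the sort of list that could go into the csv of stocks that are NOT included.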
mcmelnychuk commented 7 years ago

Thanks for looking into those, Phil. After a quick read, I'd certainly agree with the rule to just consider single-species stocks that are straightforward, and exclude the others even if that means excluding some with actual assessments. Even without the multiple-species issue, some of those gaps may have resulted from re-assignment of stock definitions (either merging or splitting into sub-stocks).

I'm just stepping out for the evening, but would be happy to take a look later at the new dataset.csv for a quick QAQC (if there's a reason to.)

For the shark filtering question, see the file "SpeciesCrossReference.csv" in the Dropbox folder. I had added a column there, "exclude", which lists various reasons for excluding stocks (this was incomplete and I think you added to the reasons.) One of those reasons, "HMS" identifies the oceanic sharks and tunas that can be filtered out. Later I can compare this manually-specified list against a list that would arise from filtering out "shark" & "pelagic".
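The comparison between the manual "HMS" list and the 'shark' & 'pelagic' rule might look something like this in base R. Only the `exclude` column and its "HMS" value come from the description above; the other column names and all rows are placeholders.

```r
# Invented stand-in for SpeciesCrossReference.csv.
stocks <- data.frame(
  common_name = c("shark_oceanic", "shark_coastal", "tuna_oceanic", "groundfish"),
  habitat     = c("pelagic", "demersal", "pelagic", "demersal"),
  exclude     = c("HMS", NA, "HMS", NA)
)

# Rule-based exclusion: name contains "shark" and habitat contains "pelagic".
by_rule <- stocks$common_name[
  grepl("shark", stocks$common_name, ignore.case = TRUE) &
    grepl("pelagic", stocks$habitat)
]

# Manually specified exclusion: rows flagged "HMS".
by_list <- stocks$common_name[!is.na(stocks$exclude) & stocks$exclude == "HMS"]

# Differences between the two approaches, in each direction.
only_in_list <- setdiff(by_list, by_rule)
only_in_rule <- setdiff(by_rule, by_list)
print(only_in_list)
print(only_in_rule)
```

In this toy table the rule misses the tuna entry, which is the kind of difference the manual comparison is meant to surface.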

Jim, do you have any thoughts about why that NMFS database wouldn't include some landings (e.g. Alaska sea cucumbers)? Just to double-check, that does include landings of all species, even those managed by state agencies? I suppose there's also the possibility that we use a newer landings database, which could solve some of the issues... though not if that would involve much more new work.

Mike

Philipp-Neubauer commented 7 years ago

Cheers Mike -

I'll spit out a new dataset.csv to QAQC, along with a list of stocks that are NOT included if Jim concurs that we exclude stocks with no landings prior to the assessment date.

James-Thorson commented 7 years ago

Phil

I trust your judgement about troublesome stocks, and will support either choice. All else equal though, I think it's better to keep all study units (stocks) and have a noisy predictor variable (i.e., fill in missing landings based on a linear ramp between first and last landings) than to drop units. We would then have an "unbiased" measure of assessment proportion, I think. But like I said, I have no strong opinion as long as we list the decisions in an appendix.
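For reference, a linear ramp of this kind can be sketched with `approx()` from base R's stats package. The years and tonnages below are invented; note that `approx()` interpolates between neighbouring observed years, which is the same as a ramp between first and last landings only when there is a single gap.

```r
# Invented landings series with a gap between 1971 and 1977.
obs <- data.frame(
  year   = c(1971, 1977, 1978),
  tonnes = c(100, 160, 150)
)

# Fill every year between the first and last observation by
# linear interpolation (the default method of approx()).
all_years <- seq(min(obs$year), max(obs$year))
filled <- approx(x = obs$year, y = obs$tonnes, xout = all_years)

series <- data.frame(
  year         = filled$x,
  tonnes       = filled$y,
  interpolated = !(all_years %in% obs$year)  # flag the filled-in years
)
print(series)
```

Keeping the `interpolated` flag would make it straightforward to list the filled-in values in an appendix.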

Philipp-Neubauer commented 7 years ago

Cool, thanks, Jim.

I agree it's best to keep all stocks, but for a lot of them we don't have any data prior to the assessment (e.g., 10 years from assessment to first data!), so the ramp would have to start at an arbitrary place (though we could choose to start it in 1950 for all such stocks).

For the assessed proportions, that number is calculated prior to filtering out stocks with no data pre-assessment, so wouldn't be biased - i.e., the filter is only applied to let us run models with a consistent set of parameters.

Phil
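
For illustration only (not the project's code): a minimal base R sketch of the order of operations described above, where the assessed proportion is computed on all stocks and the pre-assessment filter is applied afterwards. The table layout, column names, and toy values are hypothetical, and keeping never-assessed stocks in the model set is an assumption.

```r
# Hypothetical toy landings table: one row per stock-year with recorded landings.
landings <- data.frame(
  stock = c("A", "A", "A", "B", "B", "C"),
  year  = c(1980, 1981, 1985, 2012, 2013, 1995),
  stringsAsFactors = FALSE
)

# Hypothetical stock table; NA means the stock was never assessed.
stocks <- data.frame(
  stock            = c("A", "B", "C"),
  first_assessment = c(1983, 1990, NA),
  stringsAsFactors = FALSE
)

# The assessed proportion uses every stock, before any filtering.
prop_assessed <- mean(!is.na(stocks$first_assessment))

# First year with landings for each stock.
first_landing <- tapply(landings$year, landings$stock, min)
stocks$first_landing <- as.numeric(first_landing[stocks$stock])

# Keep never-assessed stocks (assumption) and assessed stocks that have
# landings before the first assessment year.
has_pre <- is.na(stocks$first_assessment) |
  stocks$first_landing < stocks$first_assessment

model_set <- stocks[has_pre, ]   # goes into the model
excluded  <- stocks[!has_pre, ]  # could be written out for double checking
```

Here stock B plays the role of a case like the sea cucumber: assessed in 1990 with landings only from 2012, so it lands in `excluded` while still counting towards `prop_assessed`.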

James-Thorson commented 7 years ago

OK, what about this: for stocks where there's some data both before and after the assessment, we ramp, and otherwise we drop? Or does that seem silly?
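
A rough sketch of how such a rule could look in base R, using `approx()` for the linear ramp. The function name `ramp_or_drop` and the landings values are made up for illustration; the years mimic the squid and sea cucumber cases mentioned in the thread, with 1976 taken as the squid assessment year.

```r
# Fill gaps with a linear ramp when a stock has landings on both sides of
# the assessment year; otherwise return NULL to signal that it is dropped.
ramp_or_drop <- function(year, landings, assessment_year) {
  has_before <- any(year < assessment_year)
  has_after  <- any(year > assessment_year)
  if (!(has_before && has_after)) {
    return(NULL)
  }
  all_years <- seq(min(year), max(year))
  # approx() interpolates linearly between the observed points.
  filled <- approx(x = year, y = landings, xout = all_years)$y
  data.frame(year = all_years, landings = filled)
}

# Landings recorded in 1971 and 1977 only, assessment in 1976: ramp.
squid <- ramp_or_drop(c(1971, 1977), c(10, 22), assessment_year = 1976)

# Landings only from 2012, assessment in 1990: drop.
cucumber <- ramp_or_drop(c(2012, 2013), c(5, 6), assessment_year = 1990)
```

Note that this only interpolates between observed years; it does not extrapolate back to an arbitrary start such as 1950.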

Philipp-Neubauer commented 7 years ago

Think that's fine.

mcmelnychuk commented 7 years ago

yep - sounds good.

Mike

mcmelnychuk commented 7 years ago

hi Phil,

I've gone through the 'dataset.csv' and 'dataset_missing.csv' files posted today. I can confirm that all stocks in 'dataset.csv' should indeed be there, and that the only ones missing are those in 'dataset_missing.csv'. There are a couple of minor issues we could consider:

1) Of the 17 stocks in "dataset_missing.csv", I think that 2 or 3 of them could actually be included in the analysis. Chesapeake blue crab may have been excluded because it has 4 entities in the landings database, unlike other stocks. Not sure why Oregon kelp greenling was excluded. Alaska geoduck was excluded based on the rule that its only pre-assessment landings were very low, but this is actually what occurred in the fishery. I've added some notes in the attached file and highlighted these 3 stocks.

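dataset_missed csv 2016-11-21.xlsx https://github.com/Philipp-Neubauer/FirstAssessment/files/604915/dataset_missed.csv.2016-11-21.xlsx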

2) In "dataset.csv", there are a couple of regional assignment issues:

(a) The following species have a coastwide distribution and are assigned to "USEC-NE" instead of "USEC-SE" on the basis of their landings: USEC Atlantic croaker, USEC bluefish, USEC striped bass, and USEC weakfish. These stocks each appear twice in the list, once for the NE and once for the SE. The NE entries are correct, but the four Region=USEC-SE rows should be deleted.

(b) The reverse: Sandbar shark Atlantic has a coastwide distribution, with most landings in the SE rather than NE. Its entry with Region="USEC-SE" is correct, but the entry with Region="USEC-NE" should be deleted.

(c) USNE scup also has a coastwide distribution, with most landings in the NE rather than SE. Its region should be "USEC-NE" rather than "USEC-SE".
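
As an illustration of fixes (a) and (b), duplicate region rows could be dropped with a small lookup of the preferred region per coastwide stock. This is only a sketch in base R: the toy `dataset` and the "Example stock" row are hypothetical, and case (c), which is a relabel rather than a deletion, is not covered.

```r
# Hypothetical extract with coastwide stocks listed once per region.
dataset <- data.frame(
  stock  = c("USEC bluefish", "USEC bluefish",
             "Sandbar shark Atlantic", "Sandbar shark Atlantic",
             "Example stock"),
  Region = c("USEC-NE", "USEC-SE", "USEC-NE", "USEC-SE", "USWC-AK"),
  stringsAsFactors = FALSE
)

# Preferred region for each coastwide stock, as listed in (a) and (b).
preferred <- c(
  "USEC bluefish"          = "USEC-NE",
  "Sandbar shark Atlantic" = "USEC-SE"
)

# NA for stocks without an entry in the lookup.
override <- preferred[dataset$stock]

# Keep rows with no override, or rows whose Region matches the override.
keep    <- is.na(override) | dataset$Region == override
cleaned <- dataset[keep, ]

# Each stock should now appear exactly once.
stopifnot(!anyDuplicated(cleaned$stock))
```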

Philipp-Neubauer commented 7 years ago

Thanks Mike -

I managed to get all these in and fixed up now. All these glitches were due to grooming rules gone wrong in various places: removing "Kelp" removed kelp greenling, and a rule requiring some minimum level of landings (which I thought was conservative at 0.1) led to the exclusion of geoducks. And a non-standard character in place of a space axed the blue crab... took a while to track those down, but they're all in now. I have fixed the regional landings, too (that was also due to a grooming rule gone bad; I thought those had been dealt with), but some stocks did have higher landings in other regions prior to assessment (weakfish).
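
To make two of these pitfalls concrete, here is a small base R sketch. The stock names, including the "Giant kelp" entry, are invented for the example, and the actual grooming rules in the repository may look quite different.

```r
# Hypothetical names; the crab entry contains a non-breaking space (U+00A0).
stocks <- c("Giant kelp", "Oregon kelp greenling",
            "Chesapeake\u00a0blue crab", "Alaska geoduck")

# A bare substring match flags the fish stock as well as the seaweed.
naive_drop <- grepl("kelp", stocks, ignore.case = TRUE)

# Anchoring the pattern to the whole name flags only the seaweed entry.
strict_drop <- grepl("^giant kelp$", stocks, ignore.case = TRUE)

# Replace non-breaking spaces with ordinary spaces before matching names.
clean_names <- gsub("\u00a0", " ", stocks, fixed = TRUE)

# The crab only matches by name after cleaning.
found_raw   <- "Chesapeake blue crab" %in% stocks
found_clean <- "Chesapeake blue crab" %in% clean_names
```

A check like `found_raw` versus `found_clean` is one way to spot stocks that silently fall out of a name-based join.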

mcmelnychuk commented 7 years ago

Sounds good - thanks for looking into those and sorting those out.

Do you still read the "exclude" column within the SpeciesCrossReference.csv file in the Dropbox folder when running the code from scratch? Depending on the results of Nicole's searches (hopefully done in the next day or two), there may be a few changes to make with stock inclusions or exclusions. I can modify that file directly if it's still used, or else incorporate the changes in other ways if it's not.

For weakfish specifically, can that one be kept as assigned to NE instead of SE? Its center of distribution is from New York to North Carolina, which aligns better with NE. The "automatic classification" ending up as SE resulted from having to class North Carolina in one region or the other for the landings-by-state default regional assignments; NC is right in the middle of the typical north/south split, and it constitutes most of the weakfish landings. Most of the time (for other stocks) it made sense to consider NC part of the SE region, which is why those state landings are assigned to SE by default. In the case of weakfish, assigning it to NE actually makes more sense. It's a coastwide assessment, so the whole stock could be assigned to NE rather than splitting it into 2 stocks.

Mike
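
One way to picture the default assignment and a stock-level override, as a base R sketch only. The state-to-region lookup (apart from NC counting as SE by default, as described above), the landings figures, and the `stock_override` vector are all hypothetical.

```r
# Hypothetical default mapping of states to regions; NC defaults to SE.
state_region <- c(NY = "USEC-NE", NJ = "USEC-NE",
                  NC = "USEC-SE", FL = "USEC-SE")

# Made-up weakfish landings by state, with most landings in NC.
landings <- data.frame(
  state  = c("NY", "NC", "FL"),
  tonnes = c(20, 70, 10),
  stringsAsFactors = FALSE
)
landings$region <- unname(state_region[landings$state])

# Automatic classification: the region with the largest summed landings.
by_region      <- tapply(landings$tonnes, landings$region, sum)
default_region <- names(which.max(by_region))

# A manual override takes precedence over the automatic result.
stock_override <- c(weakfish = "USEC-NE")
final_region <- if ("weakfish" %in% names(stock_override)) {
  stock_override[["weakfish"]]
} else {
  default_region
}
```

With these toy numbers the automatic rule gives SE because NC dominates, and the override moves the whole stock to NE.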

On 2016-11-22 1:44 AM, Philipp Neubauer wrote:

Thanks Mike -

I managed to get all these in and fixed up now. All these glitches were due to grooming rules gone wrong, in various places - removing "Kelp" removed kelp greenling, having a rule for some level of landings (which I thought was conservative at 0.1) led to the exclusion of Geoducks. And a non-standard character in place of a space axes the blue crab...took a while to track those down, but they're all in now. Have fixed the regional landings, too (that was due to a grooming rule gone bad, too, as I thought those were dealt with, too.) - but some did have higher landings in other regions prior to assessment (weakfish).

On Tue, Nov 22, 2016 at 11:35 AM, Michael Melnychuk < notifications@github.com> wrote:

hi Phil,

I've gone through the 'dataset.csv' and 'dataset_missing.csv' files posted today. I can confirm that all stocks in 'dataset.csv' should indeed be there, and that the only ones missing are those in 'dataset_missing.csv'. There are a couple of minor issues we could consider:

  1. Of the 17 stocks in "dataset_missing.csv", I think that 2 or 3 of them could actually be included in the analysis. Chesapeake blue crab may have been excluded because it has 4 entities in the landings database, unlike other stocks. Not sure why Oregon kelp greenling was excluded. Alaska geoduck was excluded based on the rule that its only pre-assessment landings were very low, but this is actually what occurred in the fishery. I've added some notes in the attached file and highlighted these 3 stocks.

dataset_missed csv 2016-11-21.xlsx

https://github.com/Philipp-Neubauer/FirstAssessment/files/604915/dataset_missed.csv.2016-11-21.xlsx

  2. In "dataset.csv", there are a couple of regional assignment issues:

(a) The following species have a coastwide distribution, and are assigned to "USEC-NE" instead of "USEC-SE" on the basis of their landings: USEC Atlantic croaker, USEC bluefish, USEC striped bass, and USEC weakfish. These stocks each appear twice in the list, once for the NE and once for the SE. The NE entries are correct, but the four Region=USEC-SE rows should be deleted.

(b) the reverse: Sandbar shark Atlantic has a coastwide distribution, with most landings in the SE rather than NE. Its entry with region "USEC-SE" is correct but the entry with Region="USEC-NE" should be deleted.

(c) USNE scup also has a coastwide distribution, with most landings in the NE rather than SE. Its region should be "USEC-NE" rather than "USEC-SE".
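The fixes in (a) to (c) amount to one rule: each coastwide stock keeps a single row in its preferred region. A minimal Python sketch of that rule follows, assuming rows are dictionaries with a "Region" key (named in the thread) and a hypothetical "Stock" key; `resolve_regions` is an illustrative name, not a function from the repository.

```python
# Preferred region per coastwide stock, taken from points (a) to (c).
REGION_OVERRIDES = {
    "USEC Atlantic croaker": "USEC-NE",
    "USEC bluefish": "USEC-NE",
    "USEC striped bass": "USEC-NE",
    "USEC weakfish": "USEC-NE",
    "Sandbar shark Atlantic": "USEC-SE",
    "USNE scup": "USEC-NE",
}


def resolve_regions(rows, overrides=REGION_OVERRIDES):
    """Drop duplicate rows in the wrong region; relabel a lone wrong-region row."""
    # Stocks that already have a row in their preferred region.
    has_target = {
        r["Stock"] for r in rows if overrides.get(r["Stock"]) == r["Region"]
    }
    resolved = []
    for r in rows:
        target = overrides.get(r["Stock"])
        if target is None or r["Region"] == target:
            resolved.append(dict(r))  # untouched stock or correct row
        elif r["Stock"] not in has_target:
            resolved.append({**r, "Region": target})  # relabel, as for scup
        # Otherwise a correct row exists elsewhere, so this one is dropped.
    return resolved
```

Dropping the wrong-region row, rather than relabelling it, keeps whatever landings values are attached to the correct row. A stock is relabelled only when it has no row in its preferred region at all.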

Phil

Philipp-Neubauer commented 7 years ago

Yes -

have kept weakfish in the NE assignment (by manual over-write).

I do use the exclude column now as a default, so, yes, adding to that column would be the most straightforward way to exclude other stocks...
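For reference, reading the file with the exclude column applied by default could look like the sketch below. It is written in Python for illustration and is not the analysis code. The "exclude" column name comes from the thread, while the "stock" column, the function name, and the convention that values such as "1" or "yes" mark an exclusion are assumptions.

```python
import csv
import io

# Assumed convention: any of these values in the exclude column drops the row.
TRUTHY = {"1", "true", "yes", "y", "x"}


def read_included(handle, exclude_col="exclude"):
    """Return the rows whose exclude column is empty or not flagged."""
    return [
        row
        for row in csv.DictReader(handle)
        if (row.get(exclude_col) or "").strip().lower() not in TRUTHY
    ]


# Small in-memory stand-in for SpeciesCrossReference.csv (made-up rows).
demo = io.StringIO("stock,exclude\nStock A,\nStock B,1\nStock C,yes\n")
kept = [row["stock"] for row in read_included(demo)]
print(kept)
```

Under this scheme, excluding a further stock only requires flagging its row in the CSV, so the change lives in the data file rather than in the code.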

Phil

mcmelnychuk commented 7 years ago

just an update - the search for other US assessments not previously covered is taking longer than expected. Nicole anticipates being done on Friday now. She has found several so far (~10) that were not previously on our list.

Philipp-Neubauer commented 7 years ago

Thanks Mike, be interesting to see how much of a difference it makes. Important to get this one right, too, as any assessment scientist will question the study if her/his stock is not on the list...

-- Phil