MargaretSiple-NOAA / goa-ai-data-reports

Automate data reports for GOA and AI surveys

Produce a species list with flags to facilitate species checks ✔❌ #11

Closed MargaretSiple-NOAA closed 1 year ago

MargaretSiple-NOAA commented 1 year ago

In previous data reports, one appendix was a list of species, originally produced by Nate and checked by Jay Orr to make sure the species found are realistic. In the future, we want to automate step one of this process so that our taxonomy checkers (mainly @SarahFriedman-NOAA) don't have to look through every species. Flags should be made for the following (please edit these as we go to reflect what will make the most sense):

  1. Has the species been caught in that area before?
  2. Has the species been caught at that depth before?
  3. ....

The code for this should use historical data to generate flags. Output will be a list of "flagged" species, where they were caught (location and depth), and what the flag is.
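As a rough illustration of the flagging idea above, here is a minimal Python sketch. It is not the script that was eventually committed (that work is in R). The record fields (`species`, `area`, `depth_m`), the flag labels, and the extra "never recorded" flag for species with no history at all are assumptions made for the example.

```python
from collections import defaultdict


def build_history(historical):
    """Summarise past catches: areas seen and (min, max) depth per species."""
    areas = defaultdict(set)
    depth_range = {}
    for rec in historical:
        sp, depth = rec["species"], rec["depth_m"]
        areas[sp].add(rec["area"])
        lo, hi = depth_range.get(sp, (depth, depth))
        depth_range[sp] = (min(lo, depth), max(hi, depth))
    return areas, depth_range


def flag_catches(current, historical, depth_tolerance_m=0.0):
    """Return current-survey records that look unusual against history.

    Each returned record keeps its location and depth and gains a
    'flags' list naming the checks it failed.
    """
    areas, depth_range = build_history(historical)
    flagged = []
    for rec in current:
        sp = rec["species"]
        flags = []
        if sp not in areas:
            # Assumption: a species with no history gets its own flag.
            flags.append("never recorded")
        else:
            if rec["area"] not in areas[sp]:
                flags.append("new area")
            lo, hi = depth_range[sp]
            if not (lo - depth_tolerance_m <= rec["depth_m"] <= hi + depth_tolerance_m):
                flags.append("new depth")
        if flags:
            flagged.append({**rec, "flags": flags})
    return flagged


# Hypothetical records, invented purely for illustration.
historical = [
    {"species": "Sebastes alutus", "area": "Shumagin", "depth_m": 150},
    {"species": "Sebastes alutus", "area": "Kodiak", "depth_m": 300},
]
current = [
    {"species": "Sebastes alutus", "area": "Kodiak", "depth_m": 200},
    {"species": "Sebastes alutus", "area": "Chirikof", "depth_m": 450},
]
for row in flag_catches(current, historical):
    print(row["species"], row["area"], row["depth_m"], row["flags"])
```

The output has the shape described above: one row per flagged catch, with where it was caught and which flag fired. The `depth_tolerance_m` argument is a hypothetical knob for allowing small excursions beyond the historical depth range.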

MargaretSiple-NOAA commented 1 year ago

From Nate: There are instructions about creating the original Appendix B species list on the G Drive: G:\AI-GOA\Instructions&Procedures\Data Report\Species_encountered_2003_Append_B.

"The document addresses the issues of formatting the table but also of avoiding duplication of organisms where on one vessel they might have gone to species where another might have stopped at genus or even family if there's only one species we encounter within a family. Also, it's important to eliminate things like "skate egg cases" which aren't appropriate on this list."

nwraring commented 1 year ago

Hi guys. I've been sitting on this since yesterday because I thought there was more I wanted to bring up regarding this list, and I think I'm ready. Sorry this one little table keeps coming up, but as I was saying to Sarah yesterday, once we've gotten these questions answered we can move forward in future reports without question.

As you're aware, Appendix B is a complete list of all taxa encountered on the survey, with one table for fish and one table for inverts. In the past we've always put them in order from what was considered most primitive to most advanced. Most of the taxa have a number that describes where they fall in this order, but some do not. At this point I have always passed the table on to our taxonomist (Jay or Duane have done this in the past), who then assigns numbers to the missing taxa so that they can be put in order with the rest of the organisms.

When I passed this to Sarah this time, she made a really interesting point: in the taxonomy world, the question of which organisms are more advanced than others is constantly in flux. In fact, some of the ordering we've done in the past is not necessarily still accepted now. Combined with the fact that most of the audience for this isn't that focused on relative evolutionary status, we might be better off ordering this list alphabetically by a common descriptive term (e.g., "flatfish" for all members of the flatfish families) and then alphabetically by genus within that.

The second issue I wanted to bring up is that we have cases where two records in the table describe the same organism, except that one is identified only to genus and the other to species (e.g., Metridium sp. and Metridium farcimen). In the past I think Jay made an educated guess about which "sp." records to eliminate when there was only one species within that genus, although I can't be certain of that unless I email him and find out. My question is: should we be doing that this time, or should we leave the list as is, in which case there are probably some redundancies?
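One way to surface these genus-versus-species overlaps for review is sketched below in Python. This is only an illustration, not the approach Jay or the final script used: it assumes taxa are plain scientific-name strings and that genus-only records end in "sp." or "spp.". It lists candidates and leaves the decision to drop a record to the taxonomist.

```python
from collections import defaultdict


def find_genus_level_overlaps(taxa):
    """Map each 'Genus sp.' record to the species-level records in that genus.

    Genus-only records with no species-level counterpart in the list are
    left out, since they are not redundant with anything.
    """
    species_by_genus = defaultdict(list)
    genus_only = []
    for name in taxa:
        parts = name.split()
        if len(parts) == 2 and parts[1] in ("sp.", "spp."):
            genus_only.append(name)
        elif len(parts) >= 2:
            species_by_genus[parts[0]].append(name)

    overlaps = {}
    for name in genus_only:
        genus = name.split()[0]
        if genus in species_by_genus:
            overlaps[name] = sorted(species_by_genus[genus])
    return overlaps


# Example list; only the Metridium pair comes from the discussion above.
taxa = [
    "Metridium sp.",
    "Metridium farcimen",
    "Sebastes alutus",
    "Pandalus sp.",
]
for genus_record, matches in find_genus_level_overlaps(taxa).items():
    print(genus_record, "overlaps with", matches)
```

A genus-only record that matches exactly one species-level record corresponds to the "only one species within that genus" case described above, so those might be the first candidates to review.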

I also wanted to include @vszalay on this discussion.

MargaretSiple-NOAA commented 1 year ago

This is related to #26, so just noting that here. Sarah is working on a script for this the week of 1/16/23!

The final script will include a dataframe that is Appendix B (the species list), which can be checked and finalized, and a table or objects describing the species richness of each family caught in the survey. That can be used for the Results ("151 fish species from 32 families and 436 invertebrate species or taxa from 13 phyla") and for the intro, or wherever else we talk about species diversity.
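For a sense of what the richness summary could look like, here is a small Python sketch. It is not Sarah's script. The field names (`group`, `family`, `taxon`) are hypothetical, and it counts by family only, whereas the Results sentence quoted above counts invertebrates by phylum.

```python
from collections import defaultdict


def family_richness(species_list):
    """Count distinct taxa per (group, family) from an Appendix B-style list."""
    taxa_per_family = defaultdict(set)
    for rec in species_list:
        taxa_per_family[(rec["group"], rec["family"])].add(rec["taxon"])
    return {key: len(taxa) for key, taxa in taxa_per_family.items()}


def results_phrase(richness, group):
    """Build a Results-style phrase for one group, e.g. 'fish'."""
    n_taxa = sum(n for (g, _), n in richness.items() if g == group)
    n_families = sum(1 for (g, _) in richness if g == group)
    return f"{n_taxa} {group} taxa from {n_families} families"


# A tiny illustrative list, not survey results.
appendix_b = [
    {"group": "fish", "family": "Pleuronectidae", "taxon": "Atheresthes stomias"},
    {"group": "fish", "family": "Pleuronectidae", "taxon": "Hippoglossus stenolepis"},
    {"group": "fish", "family": "Gadidae", "taxon": "Gadus chalcogrammus"},
    {"group": "invertebrate", "family": "Metridiidae", "taxon": "Metridium farcimen"},
]
richness = family_richness(appendix_b)
print(results_phrase(richness, "fish"))
```

Using sets means a taxon recorded on several hauls or vessels is counted once per family.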

SarahFriedman-NOAA commented 1 year ago

Script finalized and committed!

Currently testing the code that flags outliers. Planning to incorporate it into one of the GAP survey R packages, to be run at the end of each field season.

MargaretSiple-NOAA commented 1 year ago

Woohoo! Planning to test it this week.