MFEh2o / db

**Contains the main issue tracker for the MFE DB!** Functions for interacting with the MFE database, in script format. (See also MFEUtilities, which is an R package that includes many/most of the same functions).

OTU & FISH_INFO/FISH_DIETS taxa names need some work #17

Closed joneslabND closed 3 years ago

joneslabND commented 5 years ago

We need to make sure that the species column of the FISH_INFO table is cleaned up. There are multiple Northern Redbelly Dace entries and ideally all entries in this column would have a corresponding entry in the OTU table. Fixing this would help with the new entry tool!
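One way to find the entries that need attention is a simple set difference between the species values and the OTU names. The sketch below is in plain Python rather than the project's R scripts, and the sample values (including the variant spellings) are invented for illustration; only the table and column names come from this issue.

```python
# Hypothetical stand-ins for the FISH_INFO species column and the OTU table.
# The variant spellings below are made up to illustrate the problem.
fish_info_species = [
    "northern_redbelly_dace",
    "Northern Redbelly Dace",
    "redbelly_dace",
    "largemouth_bass",
]
otu_names = {"northern_redbelly_dace", "largemouth_bass"}

# Species values with no corresponding OTU entry, de-duplicated and sorted.
unmatched = sorted(set(fish_info_species) - otu_names)
print(unmatched)
```

Anything left in `unmatched` is either a misspelling to correct in FISH_INFO or a taxon that still needs a row in OTU.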

ctsolomon commented 5 years ago

A related thing that I noticed in pulling together a summary of fish caught in Long - there are a couple fish recorded as "mud_minnow" instead of "central_mudminnow", though all are central mudminnow and should be recorded as such.

cdassow commented 5 years ago

Currently we have general minnows (cyprinids only) that we come across in fishScapes listed as "minnow" in the db, but there isn't an abbreviation for it in the otu table, so the new entry tool won't know what to do with it. I think force_species=TRUE should handle it for now, but we may want something more long term?

ctsolomon commented 5 years ago

Can't we just add "cyprinidae" to the OTU table? Randi and I were talking about needing to do that for Long stuff anyway.

cdassow commented 5 years ago

I think that's good for the taxonomy if it's not already there, but each species has a field abbreviation (LMB -> largemouth_bass). The entry tool fills in the full name (largemouth_bass) based on a match with an abbreviation (LMB) in the otu table. Right now there is no abbreviation for it to match to "minnow"; we've just been writing "minnow" and that's how it shows up in the database. Again, I think the built-in work-around for new species can handle this for the time being, but maybe it's worth it to come up with a field abbreviation for minnow, maybe not.
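The lookup described here could be sketched roughly as follows. This is Python, not the entry tool's actual R code; `resolve_species` and the dictionary are hypothetical names, and the pass-through behavior of `force_species` is an assumption based on how it is described in this thread.

```python
# Hypothetical abbreviation -> full name pairs. Only LMB -> largemouth_bass
# is taken from the discussion; the real pairs live in the OTU table.
OTU_ABBREVIATIONS = {"LMB": "largemouth_bass"}


def resolve_species(entry, force_species=False):
    """Translate a field abbreviation into the full OTU name.

    An entry with no matching abbreviation raises an error unless
    force_species is True, in which case it is passed through unchanged
    (assumed behavior of the work-around for new species).
    """
    if entry in OTU_ABBREVIATIONS:
        return OTU_ABBREVIATIONS[entry]
    if force_species:
        return entry
    raise ValueError(f"No abbreviation match for {entry!r} in the OTU table")
```

With this shape, `resolve_species("LMB")` gives the full name, while `resolve_species("minnow", force_species=True)` stores "minnow" as written, which matches how it currently shows up in the database.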

kaijagahm commented 4 years ago

Listed out the unique names in this google sheet: https://docs.google.com/spreadsheets/d/1K2b2u47y2y6OSbXYG6OVsZGf8jJcv4AGrVLms_OMrBw/edit?usp=sharing. Waiting on edits before I make corrections. Then will look into the abbreviations issue.

kaijagahm commented 4 years ago

112 fish in FISH_INFO have species == "none", but there is a value for fishNum, and it's not always 1. All other info is either NA or 0 (a few of them have 0 for fishLength). There are no comments. There are initials for caughtBy. What does this mean? Is it supposed to be NFC, and if so, why are there multiple entries for some sampleID's? All of these come from 2019.

kaijagahm commented 4 years ago

Need habitat (littoral, pelagic, or unknown) for the following:

cdassow commented 4 years ago

I think I can clear up the 'none's from 2019. Mind sending me a .csv with the data so I can make sure what I'm thinking is right?

for the habitat types

kaijagahm commented 4 years ago

Ok, made those fixes. Some lingering questions:

  1. The "none" situation [Update 9/22: we've resolved this. Talked pretty extensively to Chris and Colin and we've determined that we can delete those "none" rows and just make sure to note in the metadata that the units are anglerHours. We're adding an nAnglers column to FISH_SAMPLES (so that each person doesn't have to use regex to get the number of anglers out of the crew column), and then you can use that to calculate per-angler CPUE. My next step here is to check the fishscapes documentation and figure out whether there's already a description of how to make those calculations, or not yet. @Randinotte can maybe help me figure out which file to check here.]

  2. Do we need/want an abbreviation for all fish in OTU/FISH_INFO? Currently only some fish in OTU have an abbreviation. From @cdassow's comments above, it sounds like the fish entry tool takes an abbreviation and translates it into the full fish name, but that the fishscapes people have been using "minnow" as the de facto abbreviation for "minnow" (general Cyprinids). Now that I've added a row for cyprinids in OTU, I can just put "minnow" as the abbreviation there, or we can agree to standardize it to something like "MNW".

But I guess what I'm trying to ask is: which leads the other, database decisions or field practices? Happy to follow the latter in creating the former if that would be helpful for data entry here.

  3. There are some entries in OTU (e.g. bluegill_yoy, crayfish_yoy) where "yoy" is part of the name/otu. That seems a little strange--shouldn't age/life stage information be recorded separately, not as part of the species name? Do we want to maintain this formatting, or change it?

kaijagahm commented 4 years ago

Updated the fishscapes documentation "Fishscapes.Angling.20180625.docx" with details on how to calculate per-angler CPUE. Added an nAnglers column (only filled it in for angler_hours samples).
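For reference, filling in nAnglers from the crew column amounts to counting the delimited entries. The sketch below is Python and assumes the crew column holds names or initials separated by commas or semicolons; the actual format of that column is not stated in this thread, and `count_anglers` is a hypothetical helper.

```python
import re


def count_anglers(crew):
    """Count crew members in a comma- or semicolon-delimited string.

    The delimiter convention is an assumption, not the documented
    format of the crew column in FISH_SAMPLES.
    """
    members = [m for m in re.split(r"[,;]\s*", crew.strip()) if m]
    return len(members)
```

For example, `count_anglers("AB, CD, EF")` counts three anglers (the initials are made up). Storing the result once in nAnglers means nobody has to repeat this parsing downstream.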

Re: (3)--this will be partially addressed in our discussion at the 9/30 meeting regarding OTU for bug names. See #31

kaijagahm commented 4 years ago
  1. All set.
  2. No, we do not need an abbreviation for all fish. Database follows field conventions.
  3. Was not addressed. Still need to talk.
kaijagahm commented 3 years ago

Circling back to this issue, finally. Going to pick up this script where I left off.

One other thing to note is that we should change the "species" column in FISH_INFO to "otu", because not all the entries are actually species. Will this cause any problems in the rest of the pipeline/in the entry tool?

Want to also address FISH_DIETS as part of dealing with this issue.

cdassow commented 3 years ago

I think all that will have to happen is substituting the new column name in for species in the entry tool script. The contents of that column won't really change, I don't think, so I don't see it affecting much.

kaijagahm commented 3 years ago

Have not fully cleaned up OTU yet, but I think I have basically cleaned up FISH_INFO. After the next update, the species column will be called otu.

These are the unique values for the soon-to-be otu column in FISH_INFO.

 [1] "beaver"                      "black_bullhead"              "black_crappie"              
 [4] "blackchin_shiner"            "blacknose_shiner"            "bluegill"                   
 [7] "bluegill_pumpkinseed_hybrid" "bluntnose_minnow"            "bowfin"                     
[10] "brassy_minnow"               "brook_stickleback"           "brook_trout"                
[13] "bullhead"                    "central_mudminnow"           "common_shiner"              
[16] "crayfish"                    "dace"                        "fathead_minnow"             
[19] "finescale_dace"              "fish_unidentifiable"         "golden_shiner"              
[22] "iowa_darter"                 "johnny_darter"               "largemouth_bass"            
[25] "minnow"                      "muskellunge"                 "northern_pike"              
[28] "northern_redbelly_dace"      "pickerel"                    "pumpkinseed"                
[31] "rainbow_darter"              "redhorse"                    "rock_bass"                  
[34] "salamander"                  "sculpin"                     "shiner"                     
[37] "slimy_sculpin"               "smallmouth_bass"             "striped_shiner"             
[40] "tadpole"                     "turtle"                      "unknown"                    
[43] "walleye"                     "white_sucker"                "yellow_perch"

@ctsolomon @joneslabND @cdassow do any of you want to raise concerns with any of these names, or do they look good? For example, I'm assuming that terms like "dace", "shiner", and "minnow" (as per our convo above) are deliberately vague, indicating times when the anglers could not make a more specific ID (hence why we're changing "species" to "otu"). But if there are any cases where those vague terms are just shorthands for what we know to be one particular species, do let me know.

Once we're happy with this list of names, my next steps will be:

Then can close this issue.

joneslabND commented 3 years ago

We should remove beaver, salamander, tadpole, and turtle rows from the database, please. I also wonder if we ever catch any sculpin that is not a slimy sculpin. I'm a bit skeptical of the brook trout...

ctsolomon commented 3 years ago

Becker (Fishes of Wisconsin) indicates that either slimy or mottled sculpin might occur in places we sample. I can’t recall ever handling a sculpin in northern WI myself, so I don’t have any intuition about whether one or both of those is more likely to be right. Kaija, can you tell us what people (from crew column) are associated with slimy sculpin and sculpin records? We might either change all of those to just sculpin, or leave slimy as slimy and sculpin as sculpin.

Brook trout conceivable I think depending on where it came from. Can you tell us location and crew of any brook trout records?

Is there an important distinction between “fish_unidentifiable” and “unknown”?

I’m a little skeptical about striped shiner – I am not familiar with them, and Becker says distribution is “Rock River and Illinois and Fox River watersheds of the Mississippi River basin and in the Lake Michigan basin.” Kaija, can you tell us crew for any records of that species?

kaijagahm commented 3 years ago

Sculpin/slimy sculpin

Brook trout:

Fish unidentifiable:

Striped shiner:

joneslabND commented 3 years ago

I think mottled sculpin are more common in streams, but given crews and lakes I think it is ok to leave sculpin as sculpin.

Brook trout in those lakes (or at least associated creeks) seem ok

I think unknowns were definitely fish, unless they were tadpoles. How many unknowns vs. fish_unidentifiable do we have? I guess this is messier if this list includes diet information...

Not sure on striped shiner...

cdassow commented 3 years ago

The only thing I have to add is that "fish_unidentifiable" may be a carryover from diet data, since that's an otu we use in diets, and "unknown" is likely coming right from field samples where a fish was caught in a net, electrofishing, or angling.

Ahhh Stuart beat me to it!

kaijagahm commented 3 years ago

@cdassow @joneslabND the list I have above is only from FISH_INFO; it doesn't include diet items. Is that what you were asking?

There are 11 fish in FISH_INFO labeled as "fish_unidentifiable". All but two are from minnow traps; the other two are from a fyke net and electrofishing. Dates range from 2013 through 2019. Various lakes.

There are 16 fish labeled as "unknown", and I don't see any comments that make me question whether they're something other than fish. No particular pattern in the dates, lakes, or metadataID's there.

ctsolomon commented 3 years ago

Sounds to me like we can collapse fish_unidentifiable and unknown – maybe call them both fish_unidentifiable.

Striped shiner – why don’t we leave it as that.

I think we’ve resolved all of the questions now but if there’s anything outstanding let us know!
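The decisions reached so far (collapse "unknown" into "fish_unidentifiable", record "mud_minnow" as "central_mudminnow", and drop the beaver, salamander, tadpole, and turtle rows) could be applied in one pass. This is a Python sketch with invented rows, not the script used for the database update; `clean_otu_column` is a hypothetical name.

```python
# Renames and removals agreed on in this thread.
RENAME = {
    "mud_minnow": "central_mudminnow",
    "unknown": "fish_unidentifiable",
}
NON_FISH = {"beaver", "salamander", "tadpole", "turtle"}


def clean_otu_column(rows):
    """Return new rows with otu values renamed and non-fish rows dropped."""
    cleaned = []
    for row in rows:
        otu = RENAME.get(row["otu"], row["otu"])
        if otu in NON_FISH:
            continue  # non-fish rows are removed from the table entirely
        cleaned.append({**row, "otu": otu})
    return cleaned
```

Keeping the renames in a single mapping makes it easy to review the full list of changes before they touch the database.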

kaijagahm commented 3 years ago

Sounds good to me. Thanks!

kaijagahm commented 3 years ago

Okay I'm finally revisiting this issue. @joneslabND @ctsolomon I have a few lingering questions I'm hoping you can help me clear up.

1) We never fully resolved the issue of including life stage information as part of the species/otu name (e.g. "bluegill_yoy" or "largemouth_bass_larvae"). Earlier in this issue, I had written that we'd talk about that while standardizing the benthic invert taxonomy, but CTS and I worked on that a while back, and our solution was to just add a pupa column (T/F) and not include a "larvae" designation. That's not a solution that will work for the fish tables, so this question is still outstanding.

Specifically, here are the otu's that have grouping == "fish" and include life stage information:

Note that all of these are taken from OTU, not from FISH_INFO. None of them show up in FISH_INFO. So I presume that these are diet items.
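The check behind this can be sketched as a filter on OTU followed by a membership test against FISH_INFO. The Python below uses invented rows; the "_yoy" and "_larvae" suffixes come from the examples in this issue, and the non-fish grouping value is a placeholder.

```python
# Suffixes taken from the examples in this issue (bluegill_yoy,
# largemouth_bass_larvae); other life-stage markers may exist.
LIFE_STAGE_SUFFIXES = ("_yoy", "_larvae")

# Invented sample rows standing in for the OTU table.
otu_table = [
    {"otu": "bluegill", "grouping": "fish"},
    {"otu": "bluegill_yoy", "grouping": "fish"},
    {"otu": "largemouth_bass_larvae", "grouping": "fish"},
    {"otu": "crayfish_yoy", "grouping": "other"},  # placeholder grouping
]
# Invented stand-in for the unique otu values in FISH_INFO.
fish_info_otus = {"bluegill", "largemouth_bass"}

# Fish OTUs whose names carry a life-stage suffix.
life_stage_fish = [
    row["otu"]
    for row in otu_table
    if row["grouping"] == "fish" and row["otu"].endswith(LIFE_STAGE_SUFFIXES)
]

# Those that never appear in FISH_INFO (presumably diet items only).
only_in_otu = [otu for otu in life_stage_fish if otu not in fish_info_otus]
```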

How, if at all, do you want me to handle these? Will the fish entry tool ever have to deal with life stage information like this? If so, should I assign these abbreviations and incorporate them into the tool? Or should we just leave it as is and trust that people won't use this notation for the fish entry tool, and that the life stages will just get incorporated via diets?

2) We do currently have an entry for just "pickerel" in OTU. But in the fish entry tool, any fish entered as "pickerel" get changed automatically to "grass pickerel". Does that imply that I should go back and change the "pickerel" entry in OTU to "grass pickerel" (or rather, remove "pickerel" since there's already an entry for "grass pickerel")? Or do we want to deliberately keep the ambiguous "pickerel" entry for backwards-compatibility? There is one row in FISH_INFO, from 2018, that has a fish recorded just as "pickerel".

Thanks in advance for your help with this! We are almost done with this taxonomy issue, I think.

ctsolomon commented 3 years ago

Re point 1: yes, these OTUs would only show up in diet data, not in fish data. The fish entry tool should never have to deal with OTUs that include life stage info like this. I think it’s ok to leave them as is in the OTU table.

Re point 2:

According to Becker (Fishes of Wisconsin), grass pickerel is the only species of pickerel found in Wisconsin. So I think it’s ok to remove the “pickerel” row from OTU and just rely on the “grass pickerel” row. The one 2018 record in FISH_INFO for a pickerel is from a WI (or nearby upper peninsula of Michigan) lake, is that correct?

I think it’s reasonable to continue to have the entry tool convert “pickerel” to “grass pickerel”, although I suppose another, maybe better, behavior would be to throw whatever error we throw when a species is entered that isn’t in the list of accepted species; this might force someone to switch to grass pickerel.
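The two behaviors being weighed here differ only in what happens to a name that is not in the accepted list. A rough Python sketch, in which `check_species`, the `strict` flag, and the underscore spelling "grass_pickerel" are all assumptions rather than the entry tool's actual code:

```python
# Hypothetical accepted list and auto-conversion table.
ACCEPTED_SPECIES = {"grass_pickerel", "largemouth_bass"}
AUTO_CONVERT = {"pickerel": "grass_pickerel"}


def check_species(name, strict=False):
    """Validate a species name against the accepted list.

    strict=False: silently convert known ambiguous names (current behavior).
    strict=True: reject them, so the person entering data must correct them.
    """
    if name in ACCEPTED_SPECIES:
        return name
    if not strict and name in AUTO_CONVERT:
        return AUTO_CONVERT[name]
    raise ValueError(f"{name!r} is not in the list of accepted species")
```

The strict variant trades convenience for cleaner field habits: "pickerel" stops working at entry time instead of being quietly rewritten.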

kaijagahm commented 3 years ago

Roger all of that. Yes, the previous pickerel was caught in WI. I'll proceed without changing the life-stage names in OTU.

kaijagahm commented 3 years ago

Resolved with version 4.6.0.