AquaAuma / FishGlob_data

Database and methods related to the manuscript "An integrated database of fish biodiversity sampled with scientific bottom trawl surveys"
Creative Commons Attribution 4.0 International

Ed Lavender on fix (3) #17

Closed AquaAuma closed 11 months ago

AquaAuma commented 1 year ago

Species names. I noticed a few possibly misspelt or outdated species names, based on queries to WoRMS and other databases. I am not an expert in this area, however, so I am not sure whether the following corrections are right:
Astrogordius cacaoticum – Astrogordius cacaoticus, see https://www.marinespecies.org/aphia.php?p=taxdetails&id=245844
Hemipteronotus martinicensis – Xyrichtys martinicensis, see https://www.marinespecies.org/aphia.php?p=taxdetails&id=311514
Hemipteronotus novacula – Xyrichtys novacula, see https://www.marinespecies.org/aphia.php?p=taxdetails&id=311518
Macrorhamphosus scolopax – Macroramphosus scolopax, see http://www.marinespecies.org/aphia.php?p=taxdetails&id=127378
Medinia beryllina – Melanoides tuberculata, see https://www.marinespecies.org/aphia.php?p=taxdetails&id=1554734
Thyraster serpentarius – Echinaster (Othilia) serpentarius, see http://www.marinespecies.org/aphia.php?p=taxdetails&id=178753

@jepa

jepa commented 11 months ago

The clean_taxa() function needs major updates because the worm package has been dropped. I am working on this first; then I'll be able to address these issues.

jepa commented 11 months ago

I am missing something: I don't see these names in FishGlob_public_clean.RData. Do you, @AquaAuma? I need to know which survey they come from to check whether the improved clean_taxa() function fixes them.

AquaAuma commented 11 months ago

@edwardlavender which version of the dataset and which columns did you use when you found this? Juliano cannot find the problem (see comment above). Thanks!

edwardlavender commented 11 months ago

Thanks for looking into this one! I have re-checked the last version of the dataset I downloaded (2nd March 2023) and the one currently online, and I did not find any records associated with the names above in either. Apologies for the confusion. I am not sure whether this issue was fixed prior to 2nd March (I can't remember if I downloaded an earlier version of the dataset) or whether there was an issue at my end. Either way, as far as I can see this issue is resolved!

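# Load the compiled public dataset (creates the object `data`)
load(url("https://github.com/AquaAuma/FishGlob_data/blob/main/outputs/Compiled_data/FishGlob_public_clean.RData?raw=true"))
# Strip leading/trailing whitespace so exact comparisons are reliable
data$accepted_name <- stringr::str_trim(data$accepted_name)
# For each queried name (suggested replacement in the trailing comment),
# test for an exact match (a) and a substring match (b)
pbapply::pblapply(c(
    "Astrogordius cacaoticum",      # "Astrogordius cacaoticus",
    "Hemipteronotus martinicensis", # "Xyrichtys martinicensis",
    "Hemipteronotus novacula",      # "Xyrichtys novacula",
    "Macrorhamphosus scolopax",     # "Macroramphosus scolopax",
    "Medinia beryllina",            # "Melanoides tuberculata",
    "Thyraster serpentarius"        # "Echinaster (Othilia) serpentarius"
  ), cl = 2L, function(sp) {
    # na.rm = TRUE keeps missing names from turning the result into NA
    a <- any(data$accepted_name == sp, na.rm = TRUE)
    # fixed() matches the name literally rather than as a regular expression
    b <- any(stringr::str_detect(data$accepted_name, stringr::fixed(sp)), na.rm = TRUE)
    data.frame(species = sp, a = a, b = b)
  }) |> data.table::rbindlist()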

                        species     a     b
1:      Astrogordius cacaoticum FALSE FALSE
2: Hemipteronotus martinicensis FALSE FALSE
3:      Hemipteronotus novacula FALSE FALSE
4:     Macrorhamphosus scolopax FALSE FALSE
5:            Medinia beryllina FALSE FALSE
6:       Thyraster serpentarius FALSE FALSE
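
For anyone who wants to repeat this kind of check without stringr, pbapply, or data.table, the same logic can be sketched in base R. This is only a minimal sketch under stated assumptions: the `accepted_name` vector below is a hypothetical toy stand-in (not FishGlob records), and `check_name()` is an illustrative helper that does not exist in the repository. It uses `na.rm = TRUE` and literal (`fixed = TRUE`) matching so that missing values and characters such as the parentheses in "Echinaster (Othilia) serpentarius" should not affect the result.

```r
# Toy stand-in for data$accepted_name (hypothetical values, not FishGlob data)
accepted_name <- c("Xyrichtys novacula ", "Macroramphosus scolopax", NA, "Gadus morhua")
accepted_name <- trimws(accepted_name)  # strip stray whitespace before comparing

# Names to look for: two of the old spellings from this issue, plus one
# name that is present in the toy vector as a positive control
queries <- c("Hemipteronotus novacula", "Macrorhamphosus scolopax", "Xyrichtys novacula")

# Illustrative helper (an assumption, not part of FishGlob_data):
# reports whether `sp` occurs exactly or as a literal substring in `names`
check_name <- function(sp, names) {
  data.frame(
    species = sp,
    exact   = any(names == sp, na.rm = TRUE),
    partial = any(grepl(sp, names, fixed = TRUE))
  )
}

result <- do.call(rbind, lapply(queries, check_name, names = accepted_name))
print(result)
```

Including a name that is known to be present is a quick way to confirm that the check itself works, so that a column of FALSE values can be trusted.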

AquaAuma commented 11 months ago

Perfect! Thanks for checking