EnquistLab / RTNRS

R package for the (plant) Taxonomic Name Resolution Service
https://bien.nceas.ucsb.edu/bien/tools/tnrs/

Sometimes fail to detect valid Genus due to wrong epithet #6

Closed paternogbc closed 5 months ago

paternogbc commented 2 years ago

Hi, thanks a lot for making this great package.

I noticed that the TNRS function sometimes fails to match an existing genus simply because the epithet is wrong. I am not sure why this happens and I could not find a pattern, but it occurs in different situations. You can find some reproducible examples below.

library(TNRS)

sp_list <- c(
  "Connarus venezuelanus",  "Connarus venezuelensis",
  "Croton antisyphiliticus", "Croton antisiphyllitius"
  )
res <- TNRS(sp_list)
cbind(res$Genus_submitted, res$Name_matched)
#>      [,1]       [,2]                     
#> [1,] "Connarus" "Connarus venezuelanus"  
#> [2,] "Connarus" "[No match found]"       
#> [3,] "Croton"   "Croton antisyphiliticus"
#> [4,] "Croton"   "[No match found]"

bmaitner commented 2 years ago

Thanks for letting us know, @paternogbc! @ojalaquellueva, any ideas?

ojalaquellueva commented 2 years ago

@bmaitner @paternogbc Thanks for putting this back on my radar. I too had noticed a problem with partial matches failing but got distracted by urgent server upgrades. I'm making this a high-priority issue and will get on it as soon as the upgrade is finished.

paternogbc commented 2 years ago

Many thanks @ojalaquellueva and @bmaitner for the fast reply. I am happy to gather other examples where this issue occurs. Just let me know if you think that would be useful for your debugging.

ojalaquellueva commented 2 years ago

Opened issue with TNRS core service repository: https://github.com/ojalaquellueva/TNRSbatch/issues/4

ojalaquellueva commented 2 years ago

Cannot replicate problem with core services. Problem may be with API. Closed: https://github.com/ojalaquellueva/TNRSbatch/issues/4.

ojalaquellueva commented 2 years ago

Problem is with API. Opened new issue in TNRS API repository: https://github.com/ojalaquellueva/TNRSapi/issues/7.

ojalaquellueva commented 2 years ago

@bmaitner, @paternogbc: API fixed & issue closed. See https://github.com/ojalaquellueva/TNRSapi/issues/7 for details. Please check from TNRS R package as well and close this issue if all problems have been addressed.

paternogbc commented 2 years ago

Hi @ojalaquellueva, I have just installed the latest RTNRS version from GitHub and re-ran my original code, but the problem seems to persist on the R side. See the reproducible example below:

library(TNRS)
sp_list <- c(
  "Connarus venezuelanus",  "Connarus venezuelensis",
  "Croton antisyphiliticus", "Croton antisiphyllitius"
)
res <- TNRS(sp_list)
cbind(res$Genus_submitted, res$Name_matched)
#>      [,1]       [,2]                     
#> [1,] "Connarus" "Connarus venezuelanus"  
#> [2,] "Connarus" "[No match found]"       
#> [3,] "Croton"   "Croton antisyphiliticus"
#> [4,] "Croton"   "[No match found]"

ojalaquellueva commented 2 years ago

Hi @paternogbc. Sorry, my bad. I forgot to update production. It's working now. Give it a try.

paternogbc commented 2 years ago

@ojalaquellueva thanks a lot! This is super helpful!

paternogbc commented 2 years ago

Hi there, sorry to re-open this, but I am encountering something that might be related to the same issue.

When I submit only a genus, TNRS fails to find a match, but when both genus and epithet are submitted, the species and genus are matched. Any ideas why?

Many thanks!

A reproducible example can be found below:

library(tidyverse)
library(TNRS)
TNRS(taxonomic_names = "Austrodanthonia", sources = "wfo") %>% 
  select(Name_submitted, Genus_submitted, Genus_matched, Accepted_name)
#>    Name_submitted Genus_submitted Genus_matched Accepted_name
#> 1 Austrodanthonia Austrodanthonia
TNRS(taxonomic_names = "Austrodanthonia caespitosa", sources = "wfo") %>% 
  select(Name_submitted, Genus_submitted, Genus_matched, Accepted_name)
#>               Name_submitted Genus_submitted   Genus_matched
#> 1 Austrodanthonia caespitosa Austrodanthonia Austrodanthonia
#>              Accepted_name
#> 1 Rytidosperma caespitosum

Created on 2022-08-15 by the reprex package (v2.0.1)

ojalaquellueva commented 2 years ago

Checking API:

# TNRS API URLs
# Production
proURL="https://tnrsapi.xyz/tnrs_api.php"
# Private development
prdURL="http://vegbiendev.nceas.ucsb.edu:8975/tnrs_api.php"
# Public development
pudURL="http://vegbiendev.nceas.ucsb.edu:9975/tnrs_api.php" 

# Set working directory & API instance to test
# These are the only parameters you should need to adjust
WD="/home/bien/tnrs/admin/bugs/partial_match_2"
URL=$proURL

# TNRS options
MODE="resolve"
SOURCES="tropicos,wfo,wcvp,usda"
CLASS="tropicos"
MATCHES="best"

cd "$WD"
cat << EOT > partial_match_bug_2_test_data.csv
id,species
1,"Austrodanthonia caespitosa"
2,"Austrodanthonia"
EOT

echo "Using TNRS endpoint: ${URL}"

opts=$(jq -n \
  --arg mode "$MODE" \
  --arg sources "$SOURCES" \
  --arg class "$CLASS" \
  --arg matches "$MATCHES" \
  '{"mode": $mode, "sources": $sources, "class": $class, "matches": $matches}')
data=$(csvjson partial_match_bug_2_test_data.csv)
req_json='{"opts":'$opts',"data":'$data'}'
resp_json=$(curl -X POST \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "charset: UTF-8" \
  -d "$req_json" \
  "$URL" \
  )

# Echo the main results fields of interest to terminal
echo "$resp_json" | jq '"Name_submitted,Name_matched,Accepted_name", (.[] | .Name_submitted + "," + .Name_matched + "," + .Accepted_name)' | tr -d '\"' | column -t -s","

Results:

Name_submitted              Name_matched                Accepted_name
Austrodanthonia caespitosa  Austrodanthonia caespitosa  Rytidosperma caespitosum
Austrodanthonia             Austrodanthonia             Rytidosperma

ojalaquellueva commented 2 years ago

@paternogbc @bmaitner I cannot replicate the issue. API is working as expected. Problem must be with the TNRS R package.

bmaitner commented 1 year ago

@ojalaquellueva I think maybe you failed to replicate the issue with the API because you did not restrict the sources as @paternogbc did. If I re-run the equivalent of your query in the R package, I get the same results you did, Brad. However, if I restrict the sources to only "wfo", I run into the bug. Note that the issue doesn't occur with all genera, as running a query for the genus "Acer" while restricting to only "wfo" works as expected.

library(tidyverse)
library(TNRS)

TNRS(taxonomic_names = "Austrodanthonia", sources = "wfo") %>% 
  select(Name_submitted, Genus_submitted, Genus_matched, Accepted_name)
#>    Name_submitted Genus_submitted Genus_matched Accepted_name
#> 1 Austrodanthonia Austrodanthonia

TNRS(taxonomic_names = "Austrodanthonia caespitosa", sources = "wfo") %>% 
  select(Name_submitted, Genus_submitted, Genus_matched, Accepted_name)
#>               Name_submitted Genus_submitted   Genus_matched
#> 1 Austrodanthonia caespitosa Austrodanthonia Austrodanthonia
#>              Accepted_name
#> 1 Rytidosperma caespitosum

# works fine

TNRS(taxonomic_names = "Austrodanthonia") %>% 
  select(Name_submitted, Genus_submitted, Genus_matched, Accepted_name)
#>    Name_submitted Genus_submitted   Genus_matched Accepted_name
#> 1 Austrodanthonia Austrodanthonia Austrodanthonia  Rytidosperma

# doesn't work

TNRS(taxonomic_names = "Austrodanthonia", sources = "wfo") %>% 
  select(Name_submitted, Genus_submitted, Genus_matched, Accepted_name)
#>    Name_submitted Genus_submitted Genus_matched Accepted_name
#> 1 Austrodanthonia Austrodanthonia

# works fine

TNRS(taxonomic_names = "Acer", sources = "wfo") %>% 
  select(Name_submitted, Genus_submitted, Genus_matched, Accepted_name)
#>  Name_submitted Genus_submitted Genus_matched Accepted_name
#>  1           Acer            Acer          Acer          Acer
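
For anyone retesting at the API level, the earlier curl request could be repeated with "sources" limited to "wfo" so that it matches the failing R call. Below is a minimal sketch that only builds the request body. Note the assumptions: build_tnrs_request is a hypothetical helper (not part of any TNRS tool), the option names and values are copied from the test script earlier in this thread, and the layout of "data" is modelled on what csvjson would produce from the id,species file.

```shell
#!/usr/bin/env bash
# Sketch only: build the JSON body for a TNRS API request restricted to a
# chosen set of sources. build_tnrs_request is a hypothetical helper; the
# option names (mode, sources, class, matches) come from the earlier script.
# Assumes the submitted names contain no double quotes or backslashes.

build_tnrs_request() {
  local sources="$1"   # e.g. "wfo" or "tropicos,wfo,wcvp,usda"
  shift
  local data=""
  local id=0
  local name
  for name in "$@"; do
    id=$((id + 1))
    # Separate successive records with a comma
    if [ -n "$data" ]; then
      data="${data},"
    fi
    data="${data}{\"id\":${id},\"species\":\"${name}\"}"
  done
  printf '{"opts":{"mode":"resolve","sources":"%s","class":"tropicos","matches":"best"},"data":[%s]}\n' \
    "$sources" "$data"
}

# Request body for the case reported from R: same names, wfo only
build_tnrs_request "wfo" "Austrodanthonia" "Austrodanthonia caespitosa" "Acer"
```

The printed JSON could then be passed to curl with -d, as in the earlier script, so that the only difference between the two runs is the value of "sources".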

ojalaquellueva commented 5 months ago

Just noticed this old issue still kicking around. I still cannot replicate the issue, regardless of which or how many sources I use.

$ wd="/Users/bboyle/Documents/bien/tnrs/test"
$ f_in="testnames"
$ cat << EOT > $f_in
id,species
1,"Austrodanthonia"
2,"Austrodanthonia caespitosa"
3,"Acer"
EOT

$ tnrsapi.sh -f "${wd}/${f_in}" -s wfo

Names submitted:
| id | species                    |
| -- | -------------------------- |
|  1 | Austrodanthonia            |
|  2 | Austrodanthonia caespitosa |
|  3 | Acer                       |

Processing with TNRS API @ 'https://tnrsapi.xyz'
Full URL: 'https://tnrsapi.xyz/tnrs_api.php'

Name resolution results:
| Name_submitted             | Name_matched               | Overall_score | Taxonomic_status | Accepted_name            | Accepted_name_author      | Source |
| -------------------------- | -------------------------- | ------------- | ---------------- | ------------------------ | ------------------------- | ------ |
| Austrodanthonia            | Austrodanthonia            |          True | Synonym          | Rytidosperma             | Steud.                    | wfo    |
| Austrodanthonia caespitosa | Austrodanthonia caespitosa |          True | Synonym          | Rytidosperma caespitosum | (Gaudich.) Connor & Edgar | wfo    |
| Acer                       | Acer                       |          True | Accepted         | Acer                     | L.                        | wfo    |

Perhaps it was accidentally fixed by a later update?

ojalaquellueva commented 5 months ago

Closing as resolved.