ojalaquellueva / TNRSapi

API wrapper for TNRS batch application

Partial matches not found #7

Closed ojalaquellueva closed 2 years ago

ojalaquellueva commented 2 years ago

When the genus is spelled correctly but the species name does not match, the partial match to the genus is not returned.

Originally reported in this repo as issues #4, #5 and #6. Also reported in the RTNRS repository (https://github.com/EnquistLab/RTNRS/issues/6) as issue #6, "Sometimes fail to detect valid Genus due to wrong epithet".
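
To make the expected behaviour concrete, here is a minimal bash sketch of the fallback order this issue assumes: full name first, then genus, then family. It is illustrative only. `KNOWN_NAMES`, `is_known` and `partial_match` are hypothetical names, the reference list is a made-up sample, and matching is exact, whereas the real TNRS matches fuzzily against full taxonomic sources.

```bash
#!/usr/bin/env bash
# Illustrative only: exact-match fallback against a tiny hypothetical
# reference list. This is NOT the TNRS matching algorithm.

# Hypothetical sample of accepted names (not an actual TNRS source)
KNOWN_NAMES="Connaraceae
Rosaceae
Connarus
Croton
Connarus venezuelanus
Croton antisyphiliticus"

# Succeeds if the argument is exactly one of the known names
is_known() {
  printf '%s\n' "$KNOWN_NAMES" | grep -Fxq -- "$1"
}

# Print the most specific known taxon for a submitted name
partial_match() {
  local name="$1" family="" genus epithet
  local -a tokens
  read -r -a tokens <<< "$name"

  # Treat an optional leading token ending in "aceae" as the family
  if [[ "${tokens[0]:-}" == *aceae ]]; then
    family="${tokens[0]}"
    tokens=("${tokens[@]:1}")
  fi
  genus="${tokens[0]:-}"
  epithet="${tokens[1]:-}"

  if [[ -n "$epithet" ]] && is_known "$genus $epithet"; then
    echo "$genus $epithet"          # full species match
  elif [[ -n "$genus" ]] && is_known "$genus"; then
    echo "$genus"                   # fall back to genus
  elif [[ -n "$family" ]] && is_known "$family"; then
    echo "$family"                  # fall back to family
  else
    echo "[No match found]"
  fi
}
```

Under these assumptions, `partial_match "Connarus venezuelensis"` falls back to the genus and `partial_match "Rosaceae Badgenus badspecies"` falls back to the family, which is the behaviour the bug prevents.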

ojalaquellueva commented 2 years ago

Reproducible example in bash

# Set to your working directory. Only parameter you should need to change
WD="/home/bien/tnrs/admin/bugs/partial_match"

URL="https://tnrsapi.xyz/tnrs_api.php"
MODE="resolve"
SOURCES="tropicos,wfo,usda"
CLASS="tropicos"
MATCHES="best"

cd "$WD"
cat << EOT > partial_match_bug_test_with_id.csv
id,species
1,"Connarus venezuelanus"
2,"Connarus venezuelensis"
3,"Croton antisyphiliticus"
4,"Croton antisiphyllitius"
5,"Connarus sp.1"
6,"Connarus"
7,"Connaraceae Connarus absurdus"
8,"Connarus absurdus"
9,"Connaraceae Badgenus badspecies"
10,"Rosaceae Badgenus badspecies"
EOT

opts=$(jq -n \
  --arg mode "$MODE" \
  --arg sources "$SOURCES" \
  --arg class "$CLASS" \
  --arg matches "$MATCHES" \
  '{"mode": $mode, "sources": $sources, "class": $class, "matches": $matches}')
data=$(csvjson partial_match_bug_test_with_id.csv)
req_json='{"opts":'$opts',"data":'$data'}'
resp_json=$(curl -X POST \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "charset: UTF-8" \
  -d "$req_json" \
  "$URL" \
  )

echo "$resp_json" | jq '.[] | .Name_submitted + ", " + .Name_matched' | tr -d '\"' | column -t -s","

Output:

Connarus venezuelensis            [No match found]
Croton antisyphiliticus           Croton antisyphiliticus
Croton antisiphyllitius           [No match found]
Connarus sp.1                     Connarus
Connarus                          Connarus
Connaraceae Connarus absurdus     Connarus
Connarus absurdus                 [No match found]
Connaraceae Badgenus badspecies   [No match found]
Rosaceae Badgenus badspecies      [No match found]

ojalaquellueva commented 2 years ago

The issue was in the code that filters results by the custom match accuracy parameter. A fix has been committed to development for testing.

Verifying in bash:

WD="/home/bien/tnrs/admin/bugs/partial_match"
# Using development instance
URL="http://vegbiendev.nceas.ucsb.edu:8975/tnrs_api.php"
MODE="resolve"
SOURCES="tropicos,wfo,usda"
CLASS="tropicos"
MATCHES="best"

cd "$WD"

# Add a couple more test names to verify true non-matches
cat << EOT > partial_match_bug_test_with_id.csv
id,species
1,"Connarus venezuelanus"
2,"Connarus venezuelensis"
3,"Croton antisyphiliticus"
4,"Croton antisiphyllitius"
5,"Connarus sp.1"
6,"Connarus"
7,"Connaraceae Connarus absurdus"
8,"Connarus absurdus"
9,"Connaraceae Badgenus badspecies"
10,"Rosaceae Badgenus badspecies"
11,"Badgenus badspecies"
12,"Totalnonsenseaceae Badgenus badspecies"
EOT

opts=$(jq -n \
  --arg mode "$MODE" \
  --arg sources "$SOURCES" \
  --arg class "$CLASS" \
  --arg matches "$MATCHES" \
  '{"mode": $mode, "sources": $sources, "class": $class, "matches": $matches}')
data=$(csvjson partial_match_bug_test_with_id.csv)
req_json='{"opts":'$opts',"data":'$data'}'
resp_json2=$(curl -X POST \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "charset: UTF-8" \
  -d "$req_json" \
  "$URL" \
  )

echo "$resp_json2" | jq '.[] | .Name_submitted + ", " + .Name_matched' | tr -d '\"' | column -t -s","

Output:

Connarus venezuelanus                    Connarus venezuelanus
Connarus venezuelensis                   Connarus
Croton antisyphiliticus                  Croton antisyphiliticus
Croton antisiphyllitius                  Croton
Connarus sp.1                            Connarus
Connarus                                 Connarus
Connaraceae Connarus absurdus            Connarus
Connarus absurdus                        Connarus
Connaraceae Badgenus badspecies          Connaraceae
Rosaceae Badgenus badspecies             Rosaceae
Badgenus badspecies                      [No match found]
Totalnonsenseaceae Badgenus badspecies   [No match found]

All results are now as expected.
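
The check above was done by eye. A small regression helper could automate it, and what follows is a sketch rather than part of the TNRS code base. `check_matches` and the file names `expected.tsv` and `actual.tsv` are hypothetical, and both files are assumed to hold tab-separated `Name_submitted` / `Name_matched` pairs, one per line.

```bash
#!/usr/bin/env bash
# check_matches EXPECTED_TSV ACTUAL_TSV
# Hypothetical helper: compares "Name_submitted<TAB>Name_matched" pairs.
# Prints one line per discrepancy and returns non-zero if any is found.
check_matches() {
  local expected="$1" actual="$2"
  awk -F'\t' '
    # First file: remember the expected match for each submitted name
    FILENAME == ARGV[1] { want[$1] = $2; next }
    # Second file: compare each actual result with the expectation
    {
      seen[$1] = 1
      if (!($1 in want)) {
        printf "UNEXPECTED\t%s\n", $1
        bad = 1
      } else if (want[$1] != $2) {
        printf "MISMATCH\t%s\texpected=%s\tgot=%s\n", $1, want[$1], $2
        bad = 1
      }
    }
    END {
      # Report expected names that never appeared in the actual results
      for (name in want) {
        if (!(name in seen)) {
          printf "MISSING\t%s\n", name
          bad = 1
        }
      }
      exit (bad ? 1 : 0)
    }
  ' "$expected" "$actual"
}
```

Assuming `jq` is available, as in the script above, the actual pairs can be produced with `echo "$resp_json2" | jq -r '.[] | [.Name_submitted, .Name_matched] | @tsv' > actual.tsv`, after which `check_matches expected.tsv actual.tsv && echo "All results as expected"` reports the outcome.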