RBGKew / pykew

Python library for accessing Kew's data services

Filters.accepted possibly missing things #2

Open barnabywalker opened 5 years ago

barnabywalker commented 5 years ago

I've been using pykew.powo to check whether names are accepted and came across a species where searching with the accepted filter returns nothing, but the same search without the filter returns a single entry that is flagged as accepted:

import pykew.powo as powo
from pykew.powo_terms import Name, Filters

query = {Name.full_name: "Justicia fragilis"}
result = powo.search(query, filters=[Filters.accepted])
print(result.size())

Gives the output:

0

But:

result = powo.search(query)
print([line for line in result])

Gives the output:

[{'accepted': True,
  'author': 'Dennst.',
  'kingdom': 'Plantae',
  'family': 'Acanthaceae',
  'name': 'Justicia fragilis',
  'rank': 'Species',
  'url': '/taxon/urn:lsid:ipni.org:names:50746-1',
  'fqId': 'urn:lsid:ipni.org:names:50746-1',
  'images': [{'thumbnail': 'http://d2seqvvyy3b8p2.cloudfront.net/4516ed32168292fca272c4e18731ac33.jpg',
    'fullsize': 'http://d2seqvvyy3b8p2.cloudfront.net/23a120025068128520851f1e55ec3bf8.jpg',
    'caption': "A specimen from Kew's Herbarium"}]}]

Although, maybe this isn't a problem with the API?

jiacona commented 5 years ago

Ah, this is because Justicia fragilis is unplaced. If you look at its full record:

{
"modified": "2019-01-22T00:00:00.000Z",
"bibliographicCitation": "IPNI 2019. Published on the Internet http://www.ipni.org; WCSP 2019. WCSP. Facilitated by the Royal Botanic Gardens, Kew. Published on the Internet; http://apps.kew.org/wcsp/ Retrieved 2011 onwards",
"genus": "Justicia",
"taxonomicStatus": "Unplaced",
"kingdom": "Plantae",
"phylum": "Magnoliophyta",
"family": "Acanthaceae",
"nomenclaturalCode": "Botanical",
"source": "kew.org:az:reference:330981",
"namePublishedInYear": 1818,
"nomenclaturalStatus": "Doubtful",
"synonym": false,
"plantae": true,
"fungi": false,
"fqId": "urn:lsid:ipni.org:names:50746-1",
"name": "Justicia fragilis",
"authors": "Dennst.",
"species": "fragilis",
"rank": "SPECIES",
"reference": "Schlüssel Hortus Malab.: 32 (1818)",
"classification": [{
        "fqId": "urn:lsid:ipni.org:names:50746-1",
        "name": "Justicia fragilis",
        "author": "Dennst.",
        "rank": "SPECIES",
        "taxonomicStatus": "Unplaced"
    }]
}

You will see its taxonomic status is Unplaced. The accepted flag in the search results is, somewhat confusingly, more of a "not a synonym" flag. Many unplaced names could actually be accepted; they just haven't been reviewed yet. Unplaced names will become less and less frequent over time.

However, if you use the accepted filter, you will only get taxa that have been reviewed and are definitively accepted.

None of this is very obvious, though, so I'll try to make it clearer from the returned data what is actually going on.
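In the meantime, a minimal sketch of how the distinction could be checked on the client side, using only the fields shown in the two outputs above. The helper name `classify_name` is hypothetical, and the value "Accepted" for taxonomicStatus is an assumption; only "Unplaced" is confirmed by the record in this thread.

```python
def classify_name(search_hit, full_record):
    """Label a name using a search hit together with its full record.

    Assumption: reviewed, accepted names carry taxonomicStatus == "Accepted".
    Only the value "Unplaced" is confirmed by the record shown in this thread.
    """
    status = full_record.get("taxonomicStatus")
    if status == "Accepted":
        return "accepted"
    if status == "Unplaced":
        # The search hit may still say accepted=True here, because that
        # flag behaves more like "not a synonym".
        return "unplaced"
    if full_record.get("synonym") or not search_hit.get("accepted", False):
        return "synonym"
    return "unknown"


# Values copied from the Justicia fragilis outputs above.
hit = {"accepted": True, "fqId": "urn:lsid:ipni.org:names:50746-1"}
record = {"taxonomicStatus": "Unplaced", "synonym": False}

print(classify_name(hit, record))  # prints: unplaced
```

With network access, the full record could presumably be fetched with something like `powo.lookup(hit["fqId"])`; that call is not shown in this thread, so treat it as an assumption to check against the pykew README.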

barnabywalker commented 5 years ago

Ahhh okay that makes sense, thanks for pointing that out.

Maybe it just needs a note in the documentation clarifying the difference between the accepted flag in the search results and the accepted filter?