Closed hohonuuli closed 1 week ago
I put together the following Python script to check for a circular relationship in the WoRMS tree:
#!/usr/bin/env python
import requests
import sys

parents = []  # names seen so far, in walk order
base_url = "http://database.fathomnet.org:8888/parent/"


def get_parent(name):
    """Fetch the parent of `name` and recurse until a name repeats or a call fails."""
    url = base_url + name
    try:
        r = requests.get(url)
        n = r.json()
        print(n)
        if n in parents:
            print("Circular relation ... " + n + " already in ancestor tree")
        else:
            parents.append(n)
            get_parent(n)
    except Exception:
        # Any failure (request error, non-JSON response, ...) ends the walk.
        print("done")


def main(n):
    parents.append(n)
    get_parent(n)


if __name__ == '__main__':
    print("Walking up the tree, one parent at a time")
    name = sys.argv[1]
    print(name)
    main(name)
Running it produced:
❯ ./walk_parents 'Smithsonius dorothea'
Walking up the tree, one parent at a time
Smithsonius dorothea
Smithsonius
Tessaradomidae
Lepralielloidea
Flustrina
Flustridae
Flustroidea
Flustrina
Circular relation ... Flustrina already in ancestor tree
I grepped the taxon.txt file from WoRMS for Flustrina and attached the output. I think the issue is the following two lines (I edited them for brevity). There are two rows with the same scientific name (but different accepted names and AphiaIDs):
| taxonID | scientificNameID | acceptedNameUsageID | parentNameUsageID | namePublishedInID | scientificName | acceptedNameUsage | parentNameUsage |
| --- | --- | --- | --- | --- | --- | --- | --- |
| urn:lsid:marinespecies.org:taxname:153575 | urn:lsid:marinespecies.org:taxname:153575 | urn:lsid:marinespecies.org:taxname:153575 | sid:marinespecies.org:taxname:110722 | | Flustrina | Flustrina | Cheilostomatida |
| urn:lsid:marinespecies.org:taxname:759713 | urn:lsid:marinespecies.org:taxname:759713 | urn:lsid:marinespecies.org:taxname:110909 | urn:lsid:marinespecies.org:taxname:110749 | | Flustrina | Carbasea | Flustridae |
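The grep above only finds duplicates for a name you already suspect. A minimal sketch for listing every repeated scientificName, assuming taxon.txt is tab-separated with a header row (the helper name `find_duplicate_names` and the trimmed sample are made up for illustration):

```python
import csv
import io
from collections import defaultdict


def find_duplicate_names(lines):
    """Map each scientificName that occurs more than once to its taxonIDs."""
    ids_by_name = defaultdict(list)
    # QUOTE_NONE: treat quote characters in names as plain text (assumption
    # about the file; adjust if the export quotes its fields).
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        ids_by_name[row["scientificName"]].append(row["taxonID"])
    return {name: ids for name, ids in ids_by_name.items() if len(ids) > 1}


# Trimmed sample: only two columns kept. The Flustridae id is taken from the
# parentNameUsageID / parentNameUsage pair of the second Flustrina row.
SAMPLE = (
    "taxonID\tscientificName\n"
    "urn:lsid:marinespecies.org:taxname:153575\tFlustrina\n"
    "urn:lsid:marinespecies.org:taxname:759713\tFlustrina\n"
    "urn:lsid:marinespecies.org:taxname:110749\tFlustridae\n"
)

duplicates = find_duplicate_names(io.StringIO(SAMPLE))
for name, ids in duplicates.items():
    print(name, ids)
```

For the real file, passing an open file handle (`open("taxon.txt", newline="", encoding="utf-8")`) instead of the `StringIO` sample should work the same way, provided the column names match.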
We use the scientific name to help us resolve former names of taxa. As an example, I ran
cat taxon.txt | grep 'Loligo opalescens'
which returned a single row:
| taxonID | scientificNameID | acceptedNameUsageID | parentNameUsageID | namePublishedInID | scientificName | acceptedNameUsage | parentNameUsage |
| --- | --- | --- | --- | --- | --- | --- | --- |
| urn:lsid:marinespecies.org:taxname:341883 | urn:lsid:marinespecies.org:taxname:341883 | urn:lsid:marinespecies.org:taxname:574540 | urn:lsid:marinespecies.org:taxname:138139 | | Loligo opalescens | Doryteuthis opalescens | Loligo |
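To illustrate why the two Flustrina rows can produce the loop seen in the script output, here is a minimal sketch that walks the same small tree twice: once keyed by name and once keyed by id. The integer ids, the `NODES` table, and both function names are hypothetical, and which duplicate row "wins" in a name-keyed map depends on load order (here the later row overwrites the earlier one). This is not the worms-server code, just a toy model of the lookup.

```python
# Hypothetical miniature of the tree: id -> (scientificName, parent id or None).
NODES = {
    1: ("Cheilostomatida", None),
    2: ("Flustrina", 1),        # accepted name, parent Cheilostomatida
    3: ("Flustroidea", 2),
    4: ("Flustridae", 3),
    5: ("Flustrina", 4),        # same name, not accepted, parent Flustridae
    6: ("Lepralielloidea", 2),
}


def walk_by_name(start, nodes):
    """Walk upward using a name -> parent-name map; duplicates overwrite."""
    parent_of = {}
    for name, parent_id in nodes.values():
        parent_of[name] = nodes[parent_id][0] if parent_id is not None else None
    path, seen = [start], {start}
    current = parent_of.get(start)
    while current is not None:
        path.append(current)
        if current in seen:
            return path, True   # the same name came up twice: cycle
        seen.add(current)
        current = parent_of.get(current)
    return path, False


def walk_by_id(start_id, nodes):
    """Walk upward using ids, which stay unique even when names repeat."""
    path, seen = [], set()
    current = start_id
    while current is not None:
        name, parent_id = nodes[current]
        path.append(name)
        if current in seen:
            return path, True
        seen.add(current)
        current = parent_id
    return path, False


name_path, name_cycle = walk_by_name("Lepralielloidea", NODES)
id_path, id_cycle = walk_by_id(6, NODES)
print(name_path, name_cycle)
print(id_path, id_cycle)
```

In this toy data the name-keyed walk revisits Flustrina and reports a cycle, while the id-keyed walk ends at Cheilostomatida.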
It looks like the issue is in Data.scala. Some issues to resolve:
id, then use that for node resolution. Currently we look up by name.
@BGWoodward @kevinsbarnard This is resolved in worms-server 0.7.0. I've deployed the changes to production. Be aware that duplicate names that aren't accepted names (i.e. the taxon was renamed) will have a number appended, so one of the Flustrina entries is now Flustrina 1.
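For anyone consuming the names, the renaming rule described above could look roughly like the sketch below. This is a guess at the behavior, not the actual worms-server 0.7.0 code: the `Taxon` tuple and `disambiguate` function are hypothetical, and it assumes a row counts as "accepted" when its id equals its accepted-name id.

```python
from collections import defaultdict
from typing import NamedTuple


class Taxon(NamedTuple):
    taxon_id: int      # numeric tail of the taxonID
    name: str          # scientificName
    accepted_id: int   # numeric tail of the acceptedNameUsageID


def disambiguate(taxa):
    """Return taxon_id -> display name, numbering non-accepted duplicates."""
    by_name = defaultdict(list)
    for taxon in taxa:
        by_name[taxon.name].append(taxon)

    labels = {}
    for name, group in by_name.items():
        counter = 0
        for taxon in group:
            is_accepted = taxon.taxon_id == taxon.accepted_id
            if len(group) == 1 or is_accepted:
                # Unique names and accepted names keep their spelling.
                # (Two accepted rows sharing a name would still collide;
                # that case is not handled in this sketch.)
                labels[taxon.taxon_id] = name
            else:
                counter += 1
                labels[taxon.taxon_id] = f"{name} {counter}"
    return labels


# Ids taken from the rows quoted earlier in this issue.
taxa = [
    Taxon(153575, "Flustrina", 153575),
    Taxon(759713, "Flustrina", 110909),
    Taxon(341883, "Loligo opalescens", 574540),
]
labels = disambiguate(taxa)
print(labels)
```

Under these assumptions the accepted Flustrina keeps its name, the other becomes Flustrina 1, and Loligo opalescens is untouched because its name is not duplicated.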
From @kevinsbarnard: