OpenTreeOfLife / feedback

No code -- just an issue tracker for general feedback (sent here via GitHub's issues API)

Family a mess, full of synonymies #373

Open opentreeapi opened 6 years ago

opentreeapi commented 6 years ago

It appears the phylogeny and taxonomy of this family have been merged sloppily from multiple sources using three different taxonomic schemes: the old "lumped" style placing many species in the genus Piophila with many subgenera; the scheme proposed by McAlpine (1977), mostly in use in North America, with many genera, including the large genera Neopiophila and Parapiophila, and with Piophila itself containing only one or two species; and the revision to McAlpine's scheme by Ozerov (2004), mostly in use in Europe, synonymizing several genera, notably Parapiophila with Allopiophila.

Because of this, some species have been included two or even three times under different genera (e.g. Allopiophila atrifrons = Parapiophila atrifrons = Piophila atrifrons). This makes the tree indecipherable. I don't really have a horse in this race, as far as which scheme you choose, but you cannot have all three.
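
A rough way to flag such duplicates is to group binomials by specific epithet and list every epithet that appears under more than one genus. The sketch below is only an illustration: the function name `group_by_epithet` is made up here, "Piophila casei" is added just to have a non-duplicated name in the list, and a shared epithet is only a hint, since unrelated species in different genera can legitimately share one.

```python
from collections import defaultdict


def group_by_epithet(names):
    """Map each specific epithet to the genera it appears under.

    Only epithets found under two or more genera are returned, since
    those are the candidates for unrecognized synonymy.
    """
    genera_by_epithet = defaultdict(set)
    for name in names:
        parts = name.split()
        if len(parts) != 2:
            # Skip anything that is not a plain "Genus epithet" binomial.
            continue
        genus, epithet = parts
        genera_by_epithet[epithet].add(genus)
    return {
        epithet: sorted(genera)
        for epithet, genera in genera_by_epithet.items()
        if len(genera) > 1
    }


names = [
    "Allopiophila atrifrons",
    "Parapiophila atrifrons",
    "Piophila atrifrons",
    "Piophila casei",  # added for illustration only
]
candidates = group_by_epithet(names)
print(candidates)
# {'atrifrons': ['Allopiophila', 'Parapiophila', 'Piophila']}
```

Each flagged epithet would still need to be checked by hand against McAlpine (1977) and Ozerov (2004) before treating the names as synonyms.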

The suprageneric classification is similarly confused. Presumably because Ozerov and McAlpine used similar names in slightly different ways, the clade corresponding to the former family Thyreophoridae (= Thyreophorina sensu McAlpine = Thyreophorini sensu Ozerov), containing Thyreophora, Centrophlebomyia, Dasyphlebomyia, Bocainmyia, Piophilosoma and Protothyreophora, has been split up, with some genera placed within Piophilinae and others outside it. Furthermore, the small subfamily Neottiophilidae is missing its type genus Neottiophilum.

However, both of these classification schemes are rather old, and not phylogenetic (although McAlpine 1977 does contain an "intuitive" cladistic dendrogram, which you might consider here). A more recent phylogenetic analysis of the family was conducted a few years ago by Rochefort (2015) as an MSc thesis project. It confirms some aspects of both earlier taxonomic schemes and challenges others. I do not know what the OToL policy is on non-peer-reviewed work, but I have linked it below.

McAlpine, J.F. 1977. A revised classification of the Piophilidae, including ‘Neottiophilidae’ and ‘Thyreophoridae’ (Diptera: Schizophora). The Memoirs of the Entomological Society of Canada 103: 1-66.

Ozerov, A.L. 2004. On the classification of the family Piophilidae (Diptera). Entomological Review 84: 600-608.

Rochefort, S. 2015. Taxonomy and phylogeny of Piophilidae (Diptera). MSc Thesis. McGill University. http://digitool.library.mcgill.ca/webclient/StreamGate?folder_id=0&dvs=1515530723005~382&usePid1=true&usePid2=true

================================================ Metadata Do not edit below this line
Author Christopher Angell
Upvotes 0
URL tree.opentreeoflife.org/opentree/opentree9.1@ott1007135/Piophilidae
Target node label Piophilidae
Synthetic tree id opentree9.1
Synthetic tree node id ott1007135
Source tree id(s)
Open Tree Taxonomy id
Supporting reference None

kcranston commented 6 years ago

Thanks for the detailed description. Couple of comments: all dotted branches in the tree are from taxonomy, and all solid branches are from phylogeny. So, except for the one solid branch Piophilidae -> Piophilinae -> Mycetaulus that is from Wiegmann 2011, this is all coming from synthesis of NCBI, GBIF, IRMNG, with priority given to NCBI. We don't provide services for editing the tree / taxonomy directly, so there are two ways to fix this:

  1. Work with one of our taxonomy sources (NCBI, GBIF, IRMNG) to improve their taxonomy, so that when we import a new version, we get the changes.
  2. Import a phylogeny that better represents the relationships (phylogeny always takes priority over taxonomy in our system). If you have access to a file containing the tree(s) from Rochefort, you can upload them at https://tree.opentreeoflife.org/curator. You can import any tree that has a reference available, it doesn't need to be a peer-reviewed paper (thesis ok, although the link you provided to the Rochefort thesis gives a "Cannot process request: null" error).
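
To see which source taxonomies contribute a given taxon, something like the sketch below may help. It assumes the v3 `taxonomy/taxon_info` endpoint of the Open Tree API and a `tax_sources` field in its response; check both against the current API documentation before relying on them. The helper names are made up for this example, the request is built but not sent, and the sample response is hand-written (the GBIF id is the one from the gbif.org link in this thread, the NCBI id is a placeholder).

```python
import json
import urllib.request

# Assumed endpoint; verify against the current Open Tree API docs.
TAXON_INFO_URL = "https://api.opentreeoflife.org/v3/taxonomy/taxon_info"


def build_taxon_info_request(ott_id):
    """Build (but do not send) a POST request asking about one OTT taxon."""
    payload = json.dumps({"ott_id": ott_id}).encode("utf-8")
    return urllib.request.Request(
        TAXON_INFO_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def source_names(taxon_info):
    """Return the source prefixes (e.g. 'ncbi') from 'prefix:id' entries."""
    sources = taxon_info.get("tax_sources", [])
    return sorted({entry.split(":", 1)[0] for entry in sources})


# 1007135 is the node id reported above for Piophilidae.
request = build_taxon_info_request(1007135)

# To actually send it:
#   with urllib.request.urlopen(request) as response:
#       info = json.load(response)

# Hand-written stand-in for a response; "ncbi:0" is a placeholder id.
example_info = {"tax_sources": ["ncbi:0", "gbif:9503"]}
print(source_names(example_info))
# ['gbif', 'ncbi']
```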

Let us know if you have any questions about uploading trees. There is documentation here

jar398 commented 6 years ago

Thanks for this report. It's an excellent example of how automated taxonomy synthesis can go wrong.

Relevant: https://github.com/OpenTreeOfLife/reference-taxonomy/issues/336 - if this were fixed, some of the unrecognized synonymies should be detected.

I tried the URL you gave for Rochefort 2015, and got the error "DigiTool Stream Gateway Error: Cannot process request: null"

I tried to figure out where NCBI gets its taxonomic information for this group. The best I could do was 'Marshall 2012', which I couldn't locate. (Found this reference at Arctos.) I did find Stephen Marshall's email address, and he might have ideas on how to track down the provenance of the disagreement between NCBI and GBIF. Rochefort's dissertation would probably also help.

GBIF is also a taxonomy aggregator, so if problems exist there ( https://www.gbif.org/species/9503 ), it would help them if an issue were filed at https://github.com/gbif/checklistbank/issues .

Chris-Angell commented 6 years ago

Thanks for your responses! Here are what I believe should be persistent links to Sabrina Rochefort's thesis:

https://oatd.org/oatd/record?record=oai%5C%3Adigitool.library.mcgill.ca%5C%3A139086
http://digitool.library.mcgill.ca/thesisfile139086.pdf

There are five trees published in her thesis: Two of 30 equally parsimonious morphological trees (which together form the basis of her proposed taxonomic revisions), one morphological consensus tree (with little resolution), and two COI trees using a subset of exemplar species. I do not have tree files of these.

I am not associated with Ms. Rochefort, though I have spoken to her over email. After graduating, I believe she left academia, though she still expressed interest to me in eventually publishing her research. I might be able to get tree files from her, although my previous experience was that she doesn't check her academic email frequently.

Is there a way to incorporate her phylogenies without her files? I certainly have the ability to manually write out the Newick code for the five trees (without branch lengths), and I would be willing to do so. Piophilid systematics is becoming something of an obsessive hobby of mine, peripheral to my own research... However, I'd rather not have to do that if it isn't necessary.
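
For reference, a hand-written Newick string without branch lengths is just nested parentheses with comma-separated labels and a closing semicolon, and unquoted labels use underscores in place of spaces. The sketch below shows one and runs a basic sanity check on it. The topology and the `Taxon_A` style labels are placeholders, not taken from Rochefort's thesis, and `newick_tip_labels` is a made-up helper that handles only simple unquoted labels.

```python
def newick_tip_labels(newick):
    """Return the tip labels of a simple Newick string, in order.

    Checks that the string ends with ';' and that parentheses balance.
    Quoted labels, comments and branch lengths are not handled.
    """
    text = newick.strip()
    if not text.endswith(";"):
        raise ValueError("Newick string must end with ';'")
    depth = 0
    tips = []
    label = ""
    after_close = False  # a label right after ')' names an internal node
    for ch in text[:-1]:
        if ch in "(,)":
            name = label.strip()
            if name and not after_close:
                tips.append(name)
            label = ""
            after_close = ch == ")"
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth < 0:
                    raise ValueError("unbalanced parentheses")
        else:
            label += ch
    if depth != 0:
        raise ValueError("unbalanced parentheses")
    return tips


# Placeholder topology: two sister pairs plus a fifth taxon in a polytomy.
tree = "((Taxon_A,Taxon_B),(Taxon_C,Taxon_D),Taxon_E);"
print(newick_tip_labels(tree))
# ['Taxon_A', 'Taxon_B', 'Taxon_C', 'Taxon_D', 'Taxon_E']
```

Comparing the returned tip list against the taxon list in the thesis figure is a quick way to catch a dropped species or a missing parenthesis before uploading.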

I will also put a note on GBIF, as they have the same duplication and triplication issue as here.

Chris-Angell commented 6 years ago

Also, "Marshall 2012" presumably refers to his book Flies: The Natural History and Diversity of Diptera. I don't have a copy of my own, but I believe its coverage of piophilids is sparse. Interesting that it would be cited as a source for their taxonomy.

Looking around at various taxonomic databases on the web, it seems that most of them are pretty wonky in their treatment of Piophilidae. NCBI is probably the most internally consistent one in terms of genera, but it's missing a bunch of species, and is the source of all the unidentified nodes in the tree (e.g. "Parapiophila sp. BOLD:AAG1786"). GBIF is full of duplicate taxa and misspellings (I've reported the issue and they're looking into it). IRMNG and the Catalogue of Life each have a strange hybrid taxonomy going on where many species are lumped into Piophila, though some are (seemingly haphazardly) left out.

I don't know if any of that is useful to you in sorting this out, but I thought I would summarize it for anybody interested.