Big-Bee-Network / Bee-Specialization-Modeling

Leveraging Large Biological Interaction Data to Quantify Plant Specialization by Bees

bee - plant phylogeny #3

Open seltmann opened 2 months ago

seltmann commented 2 months ago

Hi @cmsmith91, I am working on the response to the reviewer regarding the random assignment of genera within families. For this, I wanted to retrieve the names of the genera that were randomly assigned and view the final tree to see their placement. The starting tree, 12862_2013_2375_MOESM1_ESM.txt, is resolved at the species level, and it is being compared to a list of genera from GloBI.

https://github.com/Big-Bee-Network/Bee-Specialization-Modeling/blob/master/scripts/make%20bee%20phylogeny.R

The portion of the script that seems to remove bee genera not in GloBI from the tree is below. There does not seem to be any transformation that would match generic names from GloBI with species names in the tree. Any advice? Is this the most current script, or did you start with an edited tree?

#remove bee genera that are not in globi
drop_me = bee_tree$tip.label[!bee_tree$tip.label %in% globi_genera]
pruned_tree = drop.tip(workingtree, drop_me)
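For reference, a minimal sketch of the kind of genus matching that seems to be missing here. It assumes the tip labels have the form `Genus_species` and that keeping one representative tip per genus is acceptable; the toy tree, the genus list, and the names `tip_genera` and `genus_tree` are hypothetical and are not taken from the script.

```r
library(ape)

# Toy species-level tree (hypothetical); tip labels assumed to be "Genus_species"
bee_tree <- read.tree(text = "((Apis_mellifera:1,Apis_cerana:1):1,(Bombus_terrestris:1,Xylocopa_virginica:1):1);")

# Hypothetical stand-in for the list of genera from GloBI
globi_genera <- c("Apis", "Bombus")

# Genus name = everything before the first underscore in each tip label
tip_genera <- sub("_.*$", "", bee_tree$tip.label)

# Keep the first tip of each genus, drop the rest, then relabel tips by genus
keep_one <- !duplicated(tip_genera)
genus_tree <- drop.tip(bee_tree, bee_tree$tip.label[!keep_one])
genus_tree$tip.label <- sub("_.*$", "", genus_tree$tip.label)

# Now the comparison against GloBI is genus to genus
drop_me <- genus_tree$tip.label[!genus_tree$tip.label %in% globi_genera]
pruned_tree <- drop.tip(genus_tree, drop_me)
```

Without a step like the relabeling above, `%in%` would compare species-level labels to genus names and every tip would be flagged for removal.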

Thanks!

seltmann commented 2 months ago

@cmsmith91 I think I figured it out! It's the wrong starting tree. It should be 12862_2013_2375_MOESM3_ESM.txt

But when I ran it using this starting tree, it returned 10 missing genera (not 5), so something is still incorrect here. The genera are:

"1" "Ancylandrena"
"2" "Mesoxaea"
"3" "Gaesischia"
"4" "Simanthedon"
"5" "Syntrichalonia"
"6" "Brachymelecta"
"7" "Cemolobus"
"8" "Micralictoides"
"9" "Pseudaugochlora"
"10" "Lithurgopsis"
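
A list like the one above can be produced with a set difference between the GloBI genera and the genera present in the tree. This is only a sketch with made-up inputs; `tree_genera` and `missing_genera` are hypothetical names and the vectors are not the real data.

```r
# Hypothetical inputs: genera from GloBI and genera found as tips in the tree
globi_genera <- c("Apis", "Bombus", "Mesoxaea", "Cemolobus")
tree_genera <- c("Apis", "Bombus", "Xylocopa")

# Genera in GloBI that have no matching tip in the tree
missing_genera <- setdiff(globi_genera, tree_genera)

# Number of genera that would need to be added to the tree
n_missing <- length(missing_genera)
print(missing_genera)
```

Comparing `n_missing` between the two starting trees would be one quick way to see where the count changes from 5 to 10.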

seltmann commented 2 months ago

I am not finding a lot of support for randomly placing taxa in phylogenetic trees. The scenario where the branch is inserted in the middle of the genus (scenario 3) is the default. Any specific reasons for choosing scenario 2 to start with?

From Qian & Jin 2016: when comparing the three scenarios, they found that Scenarios 1 and 3 resulted in scores of phylogenetic metrics that are strongly correlated with one another and with those derived from the phylogeny resolved at the species level. Under Scenario 2 (i.e. adding genera or species randomly within their families or genera), the resulting phylogeny performed less well than phylogenies generated under the other two scenarios.

From the supplementary file describing the scenarios in U.PhyloMaker:

“The phylo.maker function makes phylogenetic hypotheses under three scenarios (i.e. scenarios 1–3), which are the same three scenarios (approaches) as in S.PhyloMaker (Qian and Jin 2016). Specifically, in scenario 1, a new tip is binded to genus- or family-level basal node; in scenario 2, the new tip is binded to a randomly selected node at and below the genus- or family-level basal node. In scenario 3, the tip for a new genus is binded to the 1/2 point of the family branch (the branch between the family root node and basal node), unless the family branch length is longer than 2/3 of the whole family branch (from the family root node to the tip) length, the tip of a new genus will be binded to the upper 1/3 point of the whole family branch length. Otherwise, the new tip of an existing genus is binded to the basal node of that genus. In scenarios 2 and 3, if a family has only one tip in the mega-tree, the branch of the tip is evenly divided into the family-, genus- and species-level sections at the 1/3 and 2/3 points of the branch. The tip of a new genus is binded to the upper 1/3 point of the branch. If a genus has only one tip in the mega-tree, the branch of the tip is evenly divided into the genus- and species-level sections at the 1/2 point of the branch. In this case, the tip of a new species is binded to the 1/2 point of the branch. All the three scenarios have been used in the literature; Qian and Jin (2016) conducted an analysis comparing the three scenarios (see Qian and Jin 2016 for details).”