jinyizju / V.PhyloMaker2

This package (an updated version of 'V.PhyloMaker') can generate a phylogenetic tree for vascular plants based on three different botanical nomenclature systems.

"Taxonomic classification not consistent between sp.list and tree." #1

Open jcmassante opened 2 years ago

jcmassante commented 2 years ago

Dear Yi Jin, Thank you for the very helpful V.PhyloMaker2 R package. I have used V.PhyloMaker and S.PhyloMaker as well and I really appreciate your effort. However, there seems to be an issue with the new package regarding taxonomic classification. I tried to build a phylogenetic tree with 203 tips using GBOTB.extended.LCVP, which follows the Leipzig Catalogue of Vascular Plants (LCVP). After standardising the species names in my list and then using the list to build the phylogeny, I got the following message: "Taxonomic classification not consistent between sp.list and tree."

I found this message a bit weird because I followed all steps in order to have all species/genera/families names in my list standardised according to LCVP by using the following code:

```r
## 1. Taxonomic standardization using LCVP
########################################################################

library(lcvplants)      # provides lcvp_search()
library(V.PhyloMaker2)  # provides phylo.maker() and the LCVP megatree

# Standardise plant species names
names_standardisation <- lcvp_search(splist = splist$species)

# Organise new names
names_standardisation$newNames <- stringr::word(names_standardisation$Output.Taxon, 1, 2, sep = " ")
names_standardisation$newGenus <- stringr::word(names_standardisation$newNames, 1, sep = " ")

## 2. Build phylogeny using V.PhyloMaker2
########################################################################

sp.list <- names_standardisation[, c(9, 10, 7)]
# Three columns are selected above, so three names are assigned
names(sp.list) <- c("species", "genus", "family")

# Build the phylogeny
phylogeny <- phylo.maker(sp.list = sp.list, tree = GBOTB.extended.LCVP,
                         nodes = nodes.info.1.LCVP, output.sp.list = FALSE,
                         output.tree = FALSE, scenarios = "S3")$scenario.3
```

This code builds the phylogeny and prints the message above, along with the taxa that are inconsistent between "sp.list" and the phylogenetic tree:

```
[1] "Taxonomic classification not consistent between sp.list and tree."
            genus family_in_sp.list  family_in_tree
12         Apeiba   Sparrmanniaceae       Malvaceae
20       Casearia        Samydaceae      Salicaceae
22          Ceiba       Bombacaceae       Malvaceae
26  Cochlospermum  Cochlospermaceae        Bixaceae
28         Cordia        Cordiaceae    Boraginaceae
51        Guazuma     Byttneriaceae       Malvaceae
53      Heisteria   Erythropalaceae       Olacaceae
74        Mouriri      Memecylaceae Melastomataceae
103     Sterculia     Sterculiaceae       Malvaceae
109     Theobroma     Byttneriaceae       Malvaceae
```

I checked LCVP through GBIF, and the family names there do match the second column in this example (family_in_sp.list). The column "family_in_tree" matches other databases such as TPL and WP. Do you think there may be a bug in the "phylo.maker" function, or is something else going on?

Best regards jcmassante

jinyizju commented 2 years ago

Dear jcmassante,

Thanks for using V.PhyloMaker2 and raising this good question. In V.PhyloMaker2, the botanical nomenclature (i.e. the binomial names) is standardized according to three different sources (TPL, LCVP and WP), but the taxonomy (i.e. the species-family relationships) is not. The taxonomy of all three megatrees follows the same standard (e.g. APG4 for angiosperms), which is why family names taken from LCVP can differ from those in the tree.

I will shortly put a data file on my GitHub homepage alongside the V.PhyloMaker2 repository. The file will help users who wish to do so to standardize the species-family relationships in their species lists (e.g. to APG4 for angiosperms), so that they are consistent with the three megatrees in V.PhyloMaker2.

Best,

Yi
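For readers who want to see which genera are affected before building a tree, here is a minimal sketch in base R. It assumes you already have a genus-to-family table that follows the megatree's classification. The helper `find_family_mismatches`, the `tree_families` table and the three example species are hypothetical and only illustrate the comparison; they are not part of V.PhyloMaker2.

```r
# Hypothetical helper (not part of V.PhyloMaker2): list genera whose family
# in sp.list differs from the family used by the megatree.
find_family_mismatches <- function(sp.list, tree_families) {
  # One row per genus/family pair in the user's list
  own <- unique(sp.list[, c("genus", "family")])
  # Join on genus; the two family columns get distinguishing suffixes
  merged <- merge(own, tree_families, by = "genus",
                  suffixes = c("_in_sp.list", "_in_tree"))
  # Keep only the rows where the two classifications disagree
  merged[merged$family_in_sp.list != merged$family_in_tree, , drop = FALSE]
}

# Illustrative species list with LCVP-style families (assumed example data)
sp.list <- data.frame(
  species = c("Apeiba tibourbou", "Cordia alliodora", "Theobroma cacao"),
  genus   = c("Apeiba", "Cordia", "Theobroma"),
  family  = c("Sparrmanniaceae", "Cordiaceae", "Malvaceae"),
  stringsAsFactors = FALSE
)

# Illustrative genus-to-family table following the megatree (assumed example data)
tree_families <- data.frame(
  genus  = c("Apeiba", "Cordia", "Theobroma"),
  family = c("Malvaceae", "Boraginaceae", "Malvaceae"),
  stringsAsFactors = FALSE
)

mismatches <- find_family_mismatches(sp.list, tree_families)
print(mismatches)
```

Genera that appear in only one of the two tables are dropped by `merge()` here, so they would need to be checked separately.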

adityabandla commented 1 year ago

@Yi I'm facing the same issue as jcmassante. Is there a link to the file you uploaded?

Regards, Adi

jinyizju commented 1 year ago

Hi, Adi

You can find the information at the link below:

https://github.com/jinyizju/genus.family.relationship

Best,

Yi
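Once a genus-to-family table is available, applying it to a species list could look like the sketch below. The helper `harmonise_families`, the column names `genus` and `family`, and the example rows are assumptions for illustration only; the actual file in the linked repository may be laid out differently, so check its columns before adapting this.

```r
# Hypothetical helper (not part of V.PhyloMaker2): overwrite the family column
# of sp.list with the family given for each genus in a lookup table.
harmonise_families <- function(sp.list, lookup) {
  idx <- match(sp.list$genus, lookup$genus)
  found <- !is.na(idx)
  # Genera missing from the lookup keep their original family
  sp.list$family[found] <- lookup$family[idx[found]]
  sp.list
}

# Illustrative lookup following the megatree's classification (assumed example data)
lookup <- data.frame(
  genus  = c("Ceiba", "Casearia"),
  family = c("Malvaceae", "Salicaceae"),
  stringsAsFactors = FALSE
)

# Illustrative species list with LCVP-style families (assumed example data)
sp.list <- data.frame(
  species = c("Ceiba pentandra", "Casearia sylvestris", "Quercus robur"),
  genus   = c("Ceiba", "Casearia", "Quercus"),
  family  = c("Bombacaceae", "Samydaceae", "Fagaceae"),
  stringsAsFactors = FALSE
)

fixed <- harmonise_families(sp.list, lookup)
print(fixed)
```

The resulting data frame keeps the `species`, `genus` and `family` columns, so it could then be passed as `sp.list` to `phylo.maker()`.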