joey711 / phyloseq

phyloseq is a set of classes, wrappers, and tools (in R) to make it easier to import, store, and analyze phylogenetic sequencing data; and to reproducibly share that data and analysis with others. See the phyloseq front page:
http://joey711.github.io/phyloseq/

phyloseq update causing error in reading original phyloseq objects? #432

Closed barbara1 closed 9 years ago

barbara1 commented 9 years ago

Hi Joey, I started using phyloseq last November and am so glad to have the package for managing my microbial sequence data. Recently, when I downloaded another package with various dependencies, I was asked if I also wanted to update my version of phyloseq, and I did. A few weeks later I went back to continue some analyses in phyloseq, only to find that none of my phyloseq objects would read; they always returned this error:

Error in phyloseq(DT_litter) : Problem with OTU/taxa indices among those you provided. Check using intersect() and taxa_names()

I thought perhaps it had something to do with the update, so I went back to my .csv files to recreate the object, but got the same error. I compared the taxa names between the tables and everything looked fine. These were the same files I used for my original build of a phyloseq object in Nov-Dec. I decided to return to a small test set and run everything again (a test set I created for learning phyloseq import before Christmas). Still the same result. I must be missing something, or incorrectly assigning rownames/colnames, or have some other setting, dependency, or package that is causing a conflict. Below I have pasted the contents of the console for the test files, ending with the sessionInfo(). Hopefully it is something simple that I am just overlooking. Unfortunately I do not know which phyloseq version I originally installed, but I would have installed it sometime in November. Please let me know if there is any other information I can provide to help solve the issue. I am running R in RStudio Ver. 0.98.1091.

Thanks in advance!

Barbara

Console output from creating the phyloseq object:

OTU_test2<-read.csv2("Test_Domtree_Dataset//otu_tableraw_test10_12.csv", header=T)
TAXA_test2<-read.csv2("Test_Domtree_Dataset//otu_ID10_test.csv", header=T)
SAMPLE_test2<-read.csv2("Test_Domtree_Dataset//sample_data_test.csv", header=T)
OTU_test2 <-as.matrix(OTU_test2[,2:13])
TAXA_test2 = as.matrix(TAXA_test2)
OTUtable_test2= otu_table(OTU_test2, taxa_are_rows = TRUE)
TAXtable_test2 = tax_table(TAXA_test2)
SAMPLEdata_test2 = sample_data(SAMPLE_test2)
rownames(TAXtable_test2)->rownames(OTUtable_test2)

# when you convert to sample data, default rowname assignments is sa1, sa2 etc. If you want to keep your original sampleID labels (as they are in the columns of the OTUtable)

rownames(SAMPLEdata_test2)<-colnames(OTUtable_test2)
OTUtable_test2
OTU Table: [10 taxa and 12 samples]
taxa are rows
     L101 L103 L105 L111 L113 L115 S102 S104 S106 S112 S114 S116
sp1    14    0    1  175   18    5  123   36    6  111   12   24
sp2     0   95    2    1   20  112    0    2    0    0    0    0
sp3     0  230    0   10    2    0    0    2    4    1   58    4
sp4     0    1    0    0    0    0    2    0    0    0    1    0
sp5    68   40   25  388   20   12 1300  956  401  325   26  609
sp6     0   40    1    6    0    3    0    1   12    4   75    2
sp7     0    8    0    1    0    2    0    8    0    2  129    0
sp8     0  161    0    6   63    4    0    4    0   12  578    0
sp9  1683   10   22   14 1233    5   31    0    3    6   15    3
sp10    2    0    0   11    0    0   13    3    6    2    1   15

TAXtable_test2
Taxonomy Table: [10 taxa by 8 taxonomic ranks]:
     OTUID    Domain  Phylum          Class             Order               Family                Genus
sp1  "otu_1"  "Fungi" "Basidiomycota" "Agaricomycetes"  "Russulales"        "Mycenaceae"          "Mycena"
sp2  "otu_2"  "Fungi" "Ascomycota"    "Dothideomycetes" "Agaricales"        "Microthyriaceae"     "Tothia"
sp3  "otu_3"  "Fungi" "Basidiomycota" "Agaricomycetes"  "Microthyriales"    "Russulaceae"         "Russula"
sp4  "otu_4"  "Fungi" "Ascomycota"    "Eurotiomycetes"  "Russulales"        "Trichocomaceae"      "Penicillium"
sp5  "otu_5"  "Fungi" "Ascomycota"    "Leotiomycetes"   "Eurotiales"        "Dermateaceae"        "Pezicula"
sp6  "otu_6"  "Fungi" "Ascomycota"    "Sordariomycetes" "Helotiales"        "Amphisphaeriaceae"   "Adisciso"
sp7  "otu_7"  "Fungi" "Basidiomycota" "Agaricomycetes"  "Xylariales"        "Marasmiaceae"        "Rhodocollybia"
sp8  "otu_8"  "Fungi" "Ascomycota"    "Eurotiomycetes"  "Agaricales"        "Herpotrichiellaceae" "Cladophialophora"
sp9  "otu_9"  "Fungi" "Ascomycota"    "Saccharomycetes" "Chaetothyriales"   "Pichiaceae"          "Pichia"
sp10 "otu_10" "Fungi" "Ascomycota"    "Dothideomycetes" "Saccharomycetales" "Microthyriaceae"     "Tothia"
     Species
sp1  "otu_1"
sp2  "otu_2"
sp3  "otu_3"
sp4  "otu_4"
sp5  "otu_5"
sp6  "otu_6"
sp7  "otu_7"
sp8  "otu_8"
sp9  "otu_9"
sp10 "otu_10"

rownames(SAMPLE_test2)<-colnames(OTUtable_test2)
SAMPLEdata_test2
Sample Data: [12 samples by 6 sample variables]:
     SampleID Replicated SiteID Season Stand  Horizon
L101 L101     AA         A1     Fall1  Spruce L
L103 L103     CC         B2     Fall1  Beech  L
L105 L105     EE         C3     Fall1  Oak    L
L111 L111     AA         A1     Winter Spruce L
L113 L113     CC         B2     Winter Beech  L
L115 L115     EE         C3     Winter Oak    L
S102 S102     BB         A1     Fall1  Spruce S
S104 S104     DD         B2     Fall1  Beech  S
S106 S106     FF         C3     Fall1  Oak    S
S112 S112     BB         A1     Winter Spruce S
S114 S114     DD         B2     Winter Beech  S
S116 S116     FF         C3     Winter Oak    S

DT_test2<-phyloseq(OTUtable_test2, TAXtable_test2, SAMPLEdata_test2)
phyloseq(DT_test2)
Error in phyloseq(DT_test2) : Problem with OTU/taxa indices among those you provided. Check using intersect() and taxa_names()

sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_Canada.1252 LC_CTYPE=English_Canada.1252 LC_MONETARY=English_Canada.1252
[4] LC_NUMERIC=C LC_TIME=English_Canada.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] ggplot2_1.0.0 plyr_1.8.1 vegan_2.2-1 permute_0.8-3 lattice_0.20-29 phyloseq_1.10.0

loaded via a namespace (and not attached):
[1] acepack_1.3-3.3 ade4_1.6-2 annotate_1.44.0 AnnotationDbi_1.28.1
[5] ape_3.2 base64enc_0.1-2 BatchJobs_1.5 BBmisc_1.8
[9] Biobase_2.26.0 BiocGenerics_0.12.1 BiocParallel_1.0.0 biom_0.3.12
[13] Biostrings_2.34.1 brew_1.0-6 checkmate_1.5.1 chron_2.3-45
[17] cluster_1.15.3 codetools_0.2-9 colorspace_1.2-4 data.table_1.9.4
[21] DBI_0.3.1 DESeq2_1.6.3 digest_0.6.8 fail_1.2
[25] foreach_1.4.2 foreign_0.8-61 Formula_1.1-2 genefilter_1.48.1
[29] geneplotter_1.44.0 GenomeInfoDb_1.2.4 GenomicRanges_1.18.4 grid_3.1.2
[33] gtable_0.1.2 Hmisc_3.14-6 igraph_0.7.1 IRanges_2.0.1
[37] iterators_1.0.7 latticeExtra_0.6-26 locfit_1.5-9.1 MASS_7.3-35
[41] Matrix_1.1-4 mgcv_1.8-3 multtest_2.22.0 munsell_0.4.2
[45] nlme_3.1-118 nnet_7.3-8 parallel_3.1.2 proto_0.3-10
[49] RColorBrewer_1.1-2 Rcpp_0.11.3 RcppArmadillo_0.4.600.0 reshape2_1.4.1
[53] RJSONIO_1.3-0 rpart_4.1-8 RSQLite_1.0.0 S4Vectors_0.4.0
[57] scales_0.2.4 sendmailR_1.2-1 splines_3.1.2 stats4_3.1.2
[61] stringr_0.6.2 survival_2.37-7 tools_3.1.2 XML_3.98-1.1
[65] xtable_1.7-4 XVector_0.6.0 zlibbioc_1.12.0
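
For readers who hit the same message: the check it suggests with intersect() and taxa_names() can be sketched roughly as follows. This is only an illustration in base R on toy matrices, so that it runs without phyloseq installed; the helper name `shared_names` and the toy data are assumptions, not part of the package or of the data in this thread. With real phyloseq components, taxa_names() and sample_names() would take the place of rownames() and colnames() here.

```r
# Toy stand-ins for the three tables (assumed data, not from this thread).
otu <- matrix(1:6, nrow = 3,
              dimnames = list(c("sp1", "sp2", "sp3"), c("L101", "L103")))
tax <- matrix(c("Fungi", "Fungi", "Fungi"), ncol = 1,
              dimnames = list(c("sp1", "sp2", "sp3"), "Domain"))
sam <- data.frame(Stand = c("Spruce", "Beech"),
                  row.names = c("L101", "L103"))

# Hypothetical helper: report which names two components share,
# and which names appear in only one of them.
shared_names <- function(a, b) {
  common <- intersect(a, b)
  list(common = common,
       only_a = setdiff(a, common),
       only_b = setdiff(b, common))
}

# Taxa names must overlap between the OTU table and the taxonomy table;
# sample names must overlap between the OTU table and the sample data.
taxa_check   <- shared_names(rownames(otu), rownames(tax))
sample_check <- shared_names(colnames(otu), rownames(sam))

# An empty `common` vector would mean the two components cannot be joined.
length(taxa_check$common)
length(sample_check$common)
```

If either `common` vector comes back empty, the `only_a` and `only_b` entries show which names failed to match, which is usually enough to spot a stray prefix or a row/column mix-up.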

barbara1 commented 9 years ago

I have no idea why my comment line changed to an enormous font size :), sorry about that. It's rather distracting.

joey711 commented 9 years ago

The phyloseq data classes did not change during this time. There is an updated release version of phyloseq from Bioconductor that should work fine on your data. Try again, and/or let me know if you already resolved this issue.
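
Since the originally installed version is unknown, it can help to record which version is installed now before and after updating. A small sketch, assuming only base R and the utils package; the variable names are illustrative, and the guard is there so the snippet also runs on a machine where phyloseq is not installed.

```r
pkg <- "phyloseq"

# Look up the installed version, or NA if the package is not installed.
if (requireNamespace(pkg, quietly = TRUE)) {
  installed <- as.character(utils::packageVersion(pkg))
} else {
  installed <- NA_character_
}

installed
```

Posting this value together with sessionInfo() makes it easier to tell whether a report was made against the current release.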

Unfortunately, data import issues are difficult to debug because I don't have access to your data. In the future, if you're having a data import issue, you will probably get a faster response if you also share the files somewhere so that someone can try to reproduce your error.
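
One possible way to package a small test set for sharing is to bundle the tables into a single .rds file. This is a sketch under assumptions: the object names, the toy data, and the file name `issue_data.rds` are hypothetical, and it uses only base R (saveRDS/readRDS), not any phyloseq export function.

```r
# Toy stand-ins for the tables involved in the failing import (assumed data).
otu <- matrix(c(14, 0, 0, 95), nrow = 2,
              dimnames = list(c("sp1", "sp2"), c("L101", "L103")))
tax <- matrix(c("Fungi", "Fungi"), ncol = 1,
              dimnames = list(c("sp1", "sp2"), "Domain"))
sam <- data.frame(Stand = c("Spruce", "Beech"),
                  row.names = c("L101", "L103"))

# Bundle the tables with the session details so a reader sees the same setup.
bundle <- list(otu = otu, tax = tax, sam = sam,
               session = utils::sessionInfo())

# Write to a temporary location here; in practice, choose a file to upload.
path <- file.path(tempdir(), "issue_data.rds")
saveRDS(bundle, path)

# Whoever receives the file can restore the exact same objects.
restored <- readRDS(path)
identical(restored$otu, otu)
```

Sharing the objects this way, rather than only the printed console output, preserves row names, column names, and types exactly as R holds them.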

I'm not aware of any problems with phyloseq import functions at the moment, other than biom-format version 2 not yet being supported, but that should change very soon.

I will close for now, but please feel free to post again to state if/how you resolved this problem.

Cheers

joey