joey711 / phyloseq

phyloseq is a set of classes, wrappers, and tools (in R) to make it easier to import, store, and analyze phylogenetic sequencing data; and to reproducibly share that data and analysis with others. See the phyloseq front page:
http://joey711.github.io/phyloseq/

plot_ordination color and shape variables were not found #541

Open sentausa opened 9 years ago

sentausa commented 9 years ago

Dear Joey,

I am trying to make some ordination plots on my data, but I always encounter errors when I use type = "samples" using plot_ordination.

> mydata
phyloseq-class experiment-level object
otu_table()   OTU Table:         [ 9930 taxa and 260 samples ]
sample_data() Sample Data:       [ 260 samples by 1 sample variables ]
tax_table()   Taxonomy Table:    [ 9930 taxa by 6 taxonomic ranks ]
> colnames(sample_data(mydata))
[1] "Set"
> class(sample_data(mydata)$Set)
[1] "factor"

> my.PCoA <- ordinate(mydata, "PCoA", "bray")
> plot_ordination(mydata, my.PCoA, type = "taxa", color = "phylum")

beautiful colorful plot appears

> plot_ordination(mydata, my.PCoA, type = "samples", color = "Set", shape = "Set")
No available covariate data to map on the points for this plot 'type'
Warning messages:
1: In plot_ordination(mydata, my.PCoA, type = "samples", color = "Set", :
  Color variable was not found in the available data you provided.No color mapped.
2: In plot_ordination(mydata, my.PCoA, type = "samples", color = "Set", :
  Shape variable was not found in the available data you provided.No shape mapped.

a plot appears but only in black

Do I have perhaps too many taxa and samples? I saw a similar issue #416, but it does not seem to have clear solutions for me.

So, could you please help me on this? Thanks very much in advance.

michberr commented 9 years ago

Hmm, this is strange. I don't think your error has to do with whether the type is "samples" or "taxa", but rather with the variable you're passing to color and shape and with the formatting of your sample_data. Btw, the default behavior of plot_ordination is to plot samples, so you could try removing that parameter from your command.

Did you import your sample data with a biom file? From the previous post, it seemed that importing it manually as a data frame and then using the merge_phyloseq command to add it to the phyloseq object would work. I also wonder if this problem is due to your sample_data having only 1 column. I don't have a lot of time right now to look through this, but maybe these tips will be useful.
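A minimal sketch of the manual import described above, using made-up toy data. The object names (otu, tax, meta, ps0, ps) and all values are illustrative assumptions, not the original poster's pipeline. The point it shows is that the row names of the metadata data frame have to match the sample names of the object it is merged into.

```r
library(phyloseq)

# Toy OTU table: 3 taxa x 4 samples (values are invented for illustration).
otu <- matrix(c(10, 0, 5, 3,
                 2, 8, 0, 1,
                 0, 4, 7, 9),
              nrow = 3, byrow = TRUE,
              dimnames = list(paste0("OTU", 1:3), paste0("S", 1:4)))

# Toy taxonomy table with a single rank.
tax <- matrix(c("Firmicutes", "Bacteroidetes", "Proteobacteria"),
              ncol = 1,
              dimnames = list(paste0("OTU", 1:3), "Phylum"))

# phyloseq object without any sample data yet.
ps0 <- phyloseq(otu_table(otu, taxa_are_rows = TRUE), tax_table(tax))

# Sample metadata as a plain data frame; row names must match sample_names(ps0).
meta <- data.frame(Set   = factor(c("A", "A", "B", "B")),
                   Batch = c("x", "y", "x", "y"),
                   row.names = paste0("S", 1:4))

# Add the metadata to the existing object.
ps <- merge_phyloseq(ps0, sample_data(meta))

# List the sample variables that can be mapped to color or shape.
sample_variables(ps)
```

The names returned by sample_variables(ps) are the ones that can be passed to the color and shape arguments of plot_ordination.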

sentausa commented 9 years ago

Thank you Michelle. I imported it manually from a tab-delimited OTU table resulting from our pipeline and added it using merge_phyloseq. I also suspected the problem might come from my sample_data being only 1 column. I'm trying to put more column(s) and we'll see if it'll work. Thanks again!

joey711 commented 9 years ago

@sentausa ,

@michberr is correct. You do not have any sample variables in your data object. Did you expect otherwise? Since you haven't provided an MRE (minimal reproducible example), there's no further debugging we can help with here. It is very unlikely that this is a phyloseq problem. Please see the tutorials that explain "manually importing" sample data into phyloseq. There are lots of examples of this floating around online, and in the vignettes.

If you have a follow-up issue that is not the same as the one posted here, please post a new issue.

Thanks for your feedback and interest in phyloseq. Good luck!

joey

sentausa commented 9 years ago

Yes, it turned out that after I added more columns in my sample_data, everything works just as expected. I thought that at least one sample variable was enough to produce an ordination plot :-p Thank you!

michberr commented 9 years ago

I would like to reopen this issue as it represents a true bug. plot_ordination fails to map variables from the sample_data when there is only one column.

Example with 1 column

data(GlobalPatterns)
gp1 <- GlobalPatterns
sample_data(gp1) <- sample_data(gp1)[ ,c("SampleType") ]
my.PCoA1 <- ordinate(gp1, "PCoA", "bray")
plot_ordination(gp1, my.PCoA1, type = "samples", color = "SampleType")

Output:

No available covariate data to map on the points for this plot type
Warning message:
In plot_ordination(gp1, my.PCoA1, type = "samples", color = "SampleType") :
  Color variable was not found in the available data you provided.No color mapped.

[Image: rplot1]

Example with two columns

gp2 <- GlobalPatterns
sample_data(gp2) <- sample_data(GlobalPatterns)[ ,c("SampleType","Description") ]
my.PCoA2 <- ordinate(gp2, "PCoA", "bray")
plot_ordination(gp2, my.PCoA2, type = "samples", color = "SampleType")

[Image: rplot2]

joey711 commented 9 years ago

you make a good point :)

husenzhang commented 8 years ago

My temporary fix is to add a "dummy variable" in sample_data:

sample_data(phyloseq_object)[ , 2] <- sample_data(phyloseq_object)[ ,1]

sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.5 (Yosemite)

other attached packages: [1] randomForest_4.6-12 phyloseq_1.16.2 vegan_2.4-0 lattice_0.20-33
[5] permute_0.9-0
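For readers who want to try the workaround end to end, here is a hedged sketch built on the GlobalPatterns example from earlier in this thread. It is a variant of the one-liner above, not the same code: it adds the second column with `$<-` under the arbitrary name "dummy" (an assumption for illustration) instead of copying column 1 by index. Whether the placeholder column is still needed may depend on the installed phyloseq version.

```r
library(phyloseq)

data(GlobalPatterns)
gp1 <- GlobalPatterns

# Reduce sample_data to a single column to reproduce the situation in this thread.
sample_data(gp1) <- sample_data(gp1)[, "SampleType"]

# Workaround: add a second, placeholder column so sample_data has more than one column.
# The column name "dummy" and its value are arbitrary.
sample_data(gp1)$dummy <- "placeholder"

# Ordinate and plot as before; color is mapped to the real variable.
ord <- ordinate(gp1, "PCoA", "bray")
p <- plot_ordination(gp1, ord, type = "samples", color = "SampleType")
print(p)
```

The placeholder column is never mapped to an aesthetic; it only exists so that sample_data has more than one column.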

adityabandla commented 7 years ago

Is there a fix for this already?

wschierd commented 6 years ago

So glad I found this bug report, it was driving me nuts!

The dummy variable thing fixed it.

wangzhichao1990 commented 5 years ago

ncol(sample_data) must be > 1. This helped me.

silvtal commented 3 years ago

So this bug was never fixed? 6 years later? I'm not that familiar with ggplot yet and I lost a lot of time thinking it was an error on my part...

biomaker52 commented 2 years ago

I have had the same problem these days and it was driving me nuts. Then I finally found the solution here, thank you. Adding a "dummy_variable" column to sample_data works.

abdimicro commented 1 year ago

I never knew this bug existed until I used sample_data with one column. Thank you guys for suggesting adding an extra dummy column six years ago.

akshaymalwade commented 1 year ago

This is very helpful. Thank you all.

silvtal commented 7 months ago

So this bug was never fixed? 6 years later? I'm not that familiar with ggplot yet and I lost a lot of time thinking it was an error on my part...

I just lost half an hour because of this bug again.

mortuseon commented 4 weeks ago

I am experiencing this same issue, but my phyloseq object has 5 columns, so the dummy variable approach obviously does not work. Is there any workaround for this? I realise this is an old issue but it seems to be a common problem so I hope it can be readily resolved...
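When sample_data already has several columns, one thing that can be ruled out quickly is a mismatch between the name passed to color or shape and the actual column names (spelling and capitalization included). The sketch below is only a diagnostic under that assumption, not a confirmed cause or fix for the case above. GlobalPatterns stands in for your own object, and `color_var` is a hypothetical name for the string you pass to color.

```r
library(phyloseq)

data(GlobalPatterns)
ps <- GlobalPatterns        # stand-in for your own phyloseq object
color_var <- "SampleType"   # the exact string you pass to color =

# Column names available for mapping; compare spelling and case carefully.
sample_variables(ps)

# TRUE if the requested variable is present among the sample variables.
color_var %in% sample_variables(ps)

# Number of sample variables; the single-column case discussed above gives 1.
ncol(sample_data(ps))
```

If the second check returns FALSE, the warning about the color variable not being found would be expected regardless of how many columns sample_data has.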