Closed lisawemo closed 8 years ago
Just glancing at this... there are a number of potential issues.
plot_heatmap
command.sample_data(xxx)
replacing xxx
with the name of your phyloseq object? Also try sample_variables(xxx)
to see what mapping data you actually have.Hi Jeff, Thank you for your response! Like I had mentioned before I have never used R and this phyloseq package. I did not build a phyloseq object. I tried:
phyloseq(otu2) otu2 is my merged otu_table.biom and my "mapping file". Example of how my mapping file looks: SampleID Location Paired 1_L Lung 1 1_M Mouth 1 2_L Lung 2 2_M Mouth 2
Error: Error in phyloseq(otu2) : Problem with OTU/taxa indices among those you provided. Check using intersect() and taxa_names()
I do not have a SampleType column. Do I just need to replace Location with SampleType?
Here is my whole code:
library("phyloseq") packageVersion("phyloseq")
otu=import_biom('C:\Users\closed-ref-33031010\otu_table.biom')
map=read.csv('C:\Users\Qiime Maps\Mouth vs lung R map_adj.csv',header=T,row.name=1,stringsAsFactors=F)
sampledata1=sample_data(map)
otu2=merge_phyloseq(otu,sampledata1) plot_heatmap(otu2)
I ran:
sample_data(otu2) and the full "mapping" file was there with, part of it: Sample Data: [67 samples by 3 sample variables]: Lung Description Paired 2117-S Lung extra 33 2195-M4 Mouth intra 19 1053-S2 Lung extra 32
sample_variables(otu2) [1] "Lung" "Description" "Paired"
To analyze your own data using Phyloseq (the R package), you need to combine all of your data into a phyloseq
object. This phyloseq
object is really a single object that Phyloseq (the R package) needs to run the analysis. To learn how to create a phyloseq
object, type:
library("phyloseq")
?phyloseq
You'll also find to get the most out of Phyloseq (the R package), you'll need additional data such as taxonomic assignment and a tree. Those aren't required, but you're limited without them.
Lastly, before you get too far with your own data, check out the terrific Phyloseq tutorials page using the GlobalPatterns
dataset that is included. You'll learn more about how to use Phyloseq by using a properly created phyloseq
object. For general R related questions, Stack Overflow is usually the best place to check first.
Good luck!
Hi Jeff,
I went to ?phyloseq page and it looks like I've done a merge_phyloseq function already with just my OTU table and mapping file. And I've already been to the heatmap Phyloseq tutorial page, but I'm just wondering how to get the SampleType.
Thanks again for your help!
SampleType
isn't something that exists in your data, because it is simply a name for some data that exists in the example data. For example, when running...
data("GlobalPatterns")
sample_variables(GlobalPatterns)
# and your data
sample_variables(otu2)
...you can see that SampleType
exists as metadata in the GlobalPatterns
example, but you don't have that variable in your data. It looks like you have a variable called Description
, so try that and see what happens:
plot_heatmap(otu2, sample.label="Description")
Also, when you just try plot_heatmap(otu2)
, do you get a heatmap or an error?
I get a heatmap when I run plot_heatmap(otu2) and I got a heatmap by location (mouth and lung) running: plot_heatmap(otu2, sample.label="Lung") So thank you, I'm getting closer! If I want to run it so mouth is on one side and lung is on the either or if I want to run it with the paired data (mouth and lung from the same patient), is there code for that? THANK YOU!
You can order by using sample.order="xyz"
within the plot_heatmap()
function. Check out ?plot_heatmap
for more ways to customize the plots.
If you want to sort the order in a way that you can't easily do with sample.order
, you can create a vector the of the sample names, which in your case might be the "2117-S, 2195-M4, etc." data by the following:
orderedVector = c("2117-S","2195-M4",etc...)
plot_heatmap(otu2, sample.order = orderedVector)
Note that the names have to be exact, and if you have samples in your phyloseq object that aren't in your list, they will be grouped together under N/A
.
orderedVector = c("2117-S","2195-M4",etc...) plot_heatmap(otu2, sample.order = orderedVector)
This worked, thank you so much!!
Hi Jeff and anyone else, I've been advised to create a heatmap looking at the differences directly for each pair on the log scale so I won't have so many columns. They'd like it to be black for no change and red for a lot of change. Can anyone help me out with how to do that? Thanks!
To be honest I think you're better off showing this with a scatter plot. It is a more quantitative graphical representation than color scale, and you'll have more options for grouping taxa, adjusting scale, etc.
See my DESeq2 example here:
http://joey711.github.io/phyloseq-extensions/DESeq2.html
The graphic at the bottom is one such custom ggplot2 example. You could modify as you like, but the basic skeleton is there.
I will close for now. I think this addresses the heart of your question, provided your paired test was executed correctly in the first place. You did not show any code for the test itself, unless I missed something.
Hi Joey, Thank you so much for getting back to me. Here is the code I ran using:
data=phyloseq_to_deseq2(otu2,~Paired+Lung)
results=DESeq(data,test="Wald",fitType="parametric") res=results(results,cooksCutoff=F)
alpha=0.05 results2=res[which(res$padj<alpha),] results2
And here is the outcome I got: log2 fold change (MAP): Lung Mouth vs Lung Wald test p-value: Lung Mouth vs Lung DataFrame with 20 rows and 6 columns baseMean log2FoldChange lfcSE stat pvalue padj
Hi, I am analyzing 16 microbiome data from the lung and mouth and I'm basically teaching myself qiime and R. So I've finally made it through qiime and have uploaded two files into R. My OTU table using:
and my sort of "map" file which just contains SampleID, Location, and Paired. For example: 1_L Lung 1 1_M Mouth 1 2_L Lung 2 2_M Mouth 2
My data is paired and I want to compare the lung vs the mouth using a heatmap. I've been having trouble with R. First off it won't let me do it by location. I've tried:
I went to ?sample_data and I have no idea what I am missing here.
Secondly, I'd like to do it by location and paired if that makes sense.
Can anyone help me with this code and maybe explain what it is I'm missing here, because I also have data where I look at the lung microbiome at baseline and then again 2 months later so that also is paired data.
Thank you for your help!