joey711 / phyloseq

phyloseq is a set of classes, wrappers, and tools (in R) to make it easier to import, store, and analyze phylogenetic sequencing data; and to reproducibly share that data and analysis with others. See the phyloseq front page:
http://joey711.github.io/phyloseq/

Importing picrust output biom files in phyloseq #720

Open wasimbt opened 7 years ago

wasimbt commented 7 years ago

Dear Joey!

So far I have done all my microbiome analysis on phyloseq. For the last analysis, is there any way to import picrust created biom files into phyloseq?

Thanks in advance!

Wasim

CarlyMuletzWolz commented 6 years ago

Hi Wasim and Joey,

Any update on this? I'm trying to do the same.

erictleung commented 6 years ago

@CarlyRae @wasimbt I haven't done this myself, but here's a comment that roughly outlines how to do it. It's only an overview, but it seems to do what you're asking for.

wasimbt commented 6 years ago

Hi CarlyRae,

Below is how I did it.

I downloaded the biomformat package, then ran:

x = read_biom("Categorize_by_function_level3.biom")
Aj_pathway_major3_function <- import_biom(x)

The following error message is expected and can be ignored (no taxonomy is added at this step):

Error in `colnames<-`(`*tmp*`, value = c("ta1", "ta0")) :
  length of 'dimnames' [2] not equal to array extent
In addition: There were 50 or more warnings (use warnings() to see the first 50)

Add taxonomy

otumat = as(biom_data(x), "matrix")
OTU = otu_table(otumat, taxa_are_rows=TRUE)
taxmat = as.matrix(observation_metadata(x), rownames.force=TRUE)
TAX = tax_table(taxmat)
TAX

Taxonomy Table: [328 taxa by 3 taxonomic ranks]:

KEGG_Pathways1

1,1,1-Trichloro-2,2-bis(4-chlorophenyl)ethane (DDT) degradation "Metabolism"

ABC transporters "Environmental Information Processing".....

Now import mapping file

Aj_map <- import_qiime_sample_data('AJ_Mapping_file.txt')

Merge all three files

Aj_f_path3 = phyloseq(OTU, TAX, Aj_map)

Aj_f_path3
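The merge step above can be tried without any PICRUSt files. The following is a minimal sketch, assuming phyloseq is installed; the pathway names, sample names, counts, and the Group column are made up for illustration and stand in for what biom_data(x), observation_metadata(x), and the mapping file would supply.

```r
library(phyloseq)

# Made-up pathway-by-sample counts, standing in for as(biom_data(x), "matrix")
otumat <- matrix(
  c(12, 0, 7,   # sample S1
     3, 9, 4),  # sample S2
  nrow = 3, ncol = 2,
  dimnames = list(c("pathway_A", "pathway_B", "pathway_C"), c("S1", "S2"))
)

# Made-up one-column annotation, standing in for observation_metadata(x)
taxmat <- matrix(
  c("Metabolism", "Environmental Information Processing", "Metabolism"),
  nrow = 3, ncol = 1,
  dimnames = list(rownames(otumat), "KEGG_Pathways1")
)

# Made-up sample metadata, standing in for import_qiime_sample_data(...)
map <- sample_data(
  data.frame(Group = c("control", "treated"), row.names = c("S1", "S2"))
)

# Row names of the two tables and the sample names must match for the merge
ps <- phyloseq(otu_table(otumat, taxa_are_rows = TRUE), tax_table(taxmat), map)
ps
```

The pathways take the place of taxa and the KEGG annotation takes the place of the taxonomy table, which is why the usual phyloseq accessors such as ntaxa() and rank_names() still apply.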

It should work for you too,

Cheers!

wasimbt commented 6 years ago

Hi,

When I checked my biom file summary, it read:

x

biom object.
type: Gene table
matrix_type: sparse
328 rows and 251 columns

It could be related to the biom table itself (its version?): as you can see, my biom table is sparse and yours is dense. Also, after a PICRUSt prediction, shouldn't the type be a gene table rather than an OTU table?

If that does not help, you could send me your file and I will have a quick look.

Best!

On Fri, Nov 3, 2017 at 7:43 PM, Carly Muletz Wolz notifications@github.com wrote:

Finally getting around to doing this!

When I try to follow your steps, I get stuck at the beginning. Now, I don't have either the ko or pathway predictions ranked at a specific level so maybe that is part of the problem....

This is what I get

y <- read_biom("pathway_predictions.biom")
Warning message:
In strsplit(msg, "\n") : input string 1 is invalid in this locale

pathway <- import_biom(y)
Error in dimnames(x) <- dn :
  length of 'dimnames' [2] not equal to array extent
In addition: There were 50 or more warnings (use warnings() to see the first 50)

y
biom object.
type: OTU table
matrix_type: dense
328 rows and 434 columns

x = read_biom("ko_predictions.biom")
Warning message:
In strsplit(msg, "\n") : input string 1 is invalid in this locale

x
biom object.
type: OTU table
matrix_type: dense
6909 rows and 434 columns

y <- read_biom("pathway_predictions.biom")
Warning message:
In strsplit(msg, "\n") : input string 1 is invalid in this locale

y
biom object.
type: OTU table
matrix_type: dense
328 rows and 434 columns

ko <- import_biom(x)
Error in dimnames(x) <- dn :
  length of 'dimnames' [2] not equal to array extent
In addition: There were 50 or more warnings (use warnings() to see the first 50)

pathway <- import_biom(y)
Error in dimnames(x) <- dn :
  length of 'dimnames' [2] not equal to array extent
In addition: There were 50 or more warnings (use warnings() to see the first 50)

Neither the ko nor the pathway file gets imported into R by import_biom. I have the biomformat and phyloseq libraries loaded. However, read_biom does produce objects for both x and y...

Any thoughts?

Thanks!

Carly


-- Wasimuddin, PhD Institute of Evolutionary Ecology and Conservation Genomics University of Ulm Albert-Einstein Allee 11 D-89069 Ulm, Germany https://www.uni-ulm.de/en/nawi/bio3/prof-dr-simone-sommer/academic-staff-and-postdocs/dr-wasimuddin.html

CarlyMuletzWolz commented 6 years ago

Thanks for the reply and help with this. I was actually able to get it to work now. I was having issues with the import_biom call, but it appears that I did not need it.

Read in file created in picrust

y <- read_biom("pathway_predictions.biom")
y

Read in my mapping file

map <- import_qiime_sample_data("Mapping_Ch3_final_selDays.txt")

Add taxonomy to y

otumaty = as(biom_data(y), "matrix")
OTUy = otu_table(otumaty, taxa_are_rows=TRUE)
taxmaty = as.matrix(observation_metadata(y), rownames.force=TRUE)
TAXy = tax_table(taxmaty)
TAXy

Merge all three files

path_1 = phyloseq(OTUy, TAXy, map)
path_1

Following Bletz et al. 2016: Pathways with < 10 counts were removed from the table.

path_2 = filter_taxa(path_1, function(x) sum(x) > 10, TRUE)
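One detail worth checking in the filter above: the predicate sum(x) > 10 also drops a pathway whose total is exactly 10, whereas "< 10 counts were removed" would keep it. The base-R sketch below applies the same predicate to a made-up count matrix (the pathway names and counts are illustrative, not from the thread) to show the difference.

```r
# Made-up pathway-by-sample counts (rows = pathways, columns = samples)
counts <- matrix(
  c(4, 5, 20,   # sample S1
    6, 4,  1),  # sample S2
  nrow = 3,
  dimnames = list(c("pw_total_10", "pw_total_9", "pw_total_21"), c("S1", "S2"))
)

totals <- rowSums(counts)

# Same predicate as function(x) sum(x) > 10: keeps totals strictly above 10
keep_gt <- totals > 10

# Alternative predicate: removes only pathways with fewer than 10 counts
keep_geq <- totals >= 10

rownames(counts)[keep_gt]
rownames(counts)[keep_geq]
```

If the intent is to match the "< 10" wording exactly, the predicate in filter_taxa would be sum(x) >= 10; whether that matters depends on how many pathways sit right at the threshold.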

Let's rarefy and see if that helps with the beta diversity issues

Yes, rarefying is needed here; otherwise beta diversity differences are driven by differences in sampling depth

min_lib <- min(sample_sums(path_2))
set.seed(4)
path_2 <- rarefy_even_depth(path_2, sample.size = min_lib, verbose = TRUE, replace = TRUE)
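To make the idea behind rarefying to an even depth concrete, here is a simplified base-R sketch. It is not phyloseq's implementation: rarefy_counts is a hypothetical helper, the counts are made up, and the sketch only illustrates subsampling each sample, with replacement, down to the smallest library size.

```r
# Hypothetical helper: draw `depth` reads from one sample, with replacement,
# with each pathway's probability proportional to its observed count
rarefy_counts <- function(x, depth) {
  draws <- sample(seq_along(x), size = depth, replace = TRUE, prob = x)
  out <- tabulate(draws, nbins = length(x))
  names(out) <- names(x)
  out
}

# Made-up pathway-by-sample counts with unequal sampling depth
counts <- matrix(
  c(30, 10,  0,   # sample S1, depth 40
     5, 10, 10),  # sample S2, depth 25
  nrow = 3,
  dimnames = list(c("pw_A", "pw_B", "pw_C"), c("S1", "S2"))
)

min_lib <- min(colSums(counts))  # smallest library size, here 25
set.seed(4)                      # fixed seed so the subsample is reproducible
rarefied <- apply(counts, 2, rarefy_counts, depth = min_lib)

colSums(rarefied)  # every sample now has the same total
```

After this step every column sums to min_lib, so downstream distance calculations compare samples at equal depth. A pathway with zero counts in a sample stays at zero, because it has zero probability of being drawn.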

Thanks!!

wasimbt commented 6 years ago

Glad that it worked for you. Yes, I also prefer rarefying the data for diversity estimation. All the best for your analysis!

Cheers! Wasim
