joey711 / phyloseq

phyloseq is a set of classes, wrappers, and tools (in R) to make it easier to import, store, and analyze phylogenetic sequencing data; and to reproducibly share that data and analysis with others. See the phyloseq front page:
http://joey711.github.io/phyloseq/

Import pre-computed distance matrix #345

Closed nsarode closed 10 years ago

nsarode commented 10 years ago

I am pretty new to R, but thanks to phyloseq, at least I don't dread it anymore :)

I am learning phyloseq using output files generated by QIIME. While working through the ordination plots section, I saw text mentioning that UniFrac distance calculations (especially weighted) take quite a while. But from what I know, QIIME already provides precomputed matrices (the weighted UniFrac distance matrix as well as the principal coordinate matrix). Is there any way to import these precomputed files directly instead of computing them? I tried reading these files as a matrix and as a table, but I get errors. I'll be happy to send an example file if you wish, but I am pasting snippets of the files below.

[Screenshot: QIIME-generated distance matrix (weighted_unifrac)]
[Screenshot: QIIME-generated principal coordinate matrix (weighted_unifrac)]

Thanks !

joey711 commented 10 years ago

Yes, there are lots of ways to import tab-delimited tables of values into R. Some very quick googling will give you lots of options. In your case I suggest read.table, and you may need to fuss with a few of the default parameters to let it know that your file does have both row and column headers, etc.
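As a minimal sketch of the read.table suggestion, assuming a tab-delimited square matrix with sample IDs in both the first row and the first column: the file contents and the sample names S1 to S3 below are made up for illustration, and a real QIIME file may need different settings.

```r
# Write a tiny, made-up distance table to a temporary file so the
# example is self-contained. The header line starts with an empty field.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "\tS1\tS2\tS3",
  "S1\t0\t0.2\t0.5",
  "S2\t0.2\t0\t0.4",
  "S3\t0.5\t0.4\t0"
), tmp)

# header = TRUE    : the first line holds the column (sample) names
# row.names = 1    : the first column holds the row (sample) names
# check.names = FALSE keeps sample IDs exactly as written in the file
x <- read.table(tmp, header = TRUE, sep = "\t",
                row.names = 1, check.names = FALSE)

# Sanity checks before any conversion: square, same names on both sides
dim(x)
identical(rownames(x), colnames(x))
```

If the row and column names do not match at this point, the import settings are the first thing to revisit.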

Once you have imported the file, you will need to coerce it to a matrix if it isn't already, and then to a "dist" (distance matrix) object. From your photo of a portion of the matrix, it appears that you have it stored as a full symmetric pairwise square. Here is the line of code for converting the table/data.frame to a standard matrix, and then to a "dist" object.

# assume `x` is your imported table.
x1 <- as.dist(as(x, "matrix"))

Now, if you did the above and it worked, class(x1) will return "dist" in your R session. Furthermore, you can now provide x1 as the distance argument to relevant phyloseq functions, like ordinate. Note that ordinate still expects the phyloseq object as its first argument.
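A hedged sketch of that last step, assuming phyloseq is installed. It uses the small esophagus dataset that ships with phyloseq, and the dist object is computed in R here only as a stand-in for an imported matrix. With a real imported matrix, the sample names in the dist object would presumably need to match those in the phyloseq object.

```r
# Sketch only: the phyloseq part runs only if the package is available.
if (requireNamespace("phyloseq", quietly = TRUE)) {
  library(phyloseq)
  data(esophagus)  # small example dataset bundled with phyloseq

  # Stand-in for an imported distance matrix (assumption for illustration)
  x1 <- phyloseq::distance(esophagus, method = "bray")

  # The phyloseq object comes first; the precomputed dist object is
  # passed through the `distance` argument.
  ord <- ordinate(esophagus, method = "PCoA", distance = x1)

  # The ordination result can then be plotted against the same object.
  p <- plot_ordination(esophagus, ord)
}
```

Passing the dist object as the first argument instead of the phyloseq object is what triggers an "Expected a phyloseq object or otu_table object" error.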

Performing MDS (referred to as PCoA by microbiome folks) is very fast, so I don't think there is much point in storing or importing the QIIME-calculated object. Use phyloseq::ordinate instead, and avail yourself of some of the other options and supported ordination methods (MDS is not the only game in town).

Finally, the premise of your question seems to imply that calculating weighted-UniFrac is implausible in phyloseq. Actually, I've implemented an optionally-parallelized version of fastUniFrac for both weighted- and unweighted-UniFrac. Furthermore, the QIIME-calculated UniFrac distances will, at least by default, be based on rarefied counts, which you generally should not use. Ever. In phyloseq you can perform admissible transformations of your counts prior to your choice of distance calculation, and this is what I recommend.
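A rough sketch of transforming counts before computing weighted UniFrac inside phyloseq, assuming the package is installed. The relative-abundance transformation and the bundled esophagus dataset are assumptions chosen purely for illustration; the thread does not say which transformation to use.

```r
# Sketch only: runs only if phyloseq is available.
if (requireNamespace("phyloseq", quietly = TRUE)) {
  library(phyloseq)
  data(esophagus)  # bundled example with an OTU table and a tree

  # Illustrative transformation: counts to per-sample proportions
  eso_rel <- transform_sample_counts(esophagus, function(x) x / sum(x))

  # Weighted UniFrac on the transformed counts (needs the tree)
  wuf <- UniFrac(eso_rel, weighted = TRUE)
  class(wuf)
}
```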

Hope that helps. Thanks for the feedback and interest in phyloseq,

joey

p.s. be sure to check out the extensive documentation, tutorials, and other closed issues related to the distance, ordinate, and transform_sample_counts functions. What you need here is already pretty well documented. Try ?phyloseq::distance for instance, from an R session, if you have phyloseq installed.

nsarode commented 10 years ago

Thank you, Joey, for the help and recommendations! I did use the read.table function, but didn't know about as.dist :( I will give your suggested method a shot (instead of trying to read the QIIME-generated matrix). Hopefully I won't run into any other issues.

mweand commented 3 years ago

Hi, I tried the code above for importing a distance matrix and ordinating, and it didn't work. My data frame has row and column names and can be converted to the "dist" class. However, when I try this: ordinate(x1, "PCoA", "unifrac", weighted=FALSE)

I get this error: Error in ordinate(x1, "PCoA", "unifrac", weighted = FALSE) : Expected a phyloseq object or otu_table object.

Any help is appreciated!

cresil commented 3 years ago

Hi,

You need to use the ape::pcoa function rather than the phyloseq::ordinate function. Phyloseq::ordinate does both distance calculation (unifrac) and ordination (pcoa).

You already have a distance matrix.
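To illustrate ordinating a distance matrix directly, here is a minimal base R sketch. It swaps in stats::cmdscale (classical MDS, which is the same technique as PCoA) as a stand-in for ape::pcoa so that it runs without extra packages. The coordinates and sample names are made up.

```r
# Made-up sample coordinates, used only to build a valid dist object
coords <- matrix(c(0, 0,
                   2, 0,
                   0, 1,
                   2, 1),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(paste0("S", 1:4), NULL))
x1 <- dist(coords)  # stand-in for an imported "dist" object

# Classical MDS / PCoA on the distance matrix alone
pc <- cmdscale(x1, k = 2, eig = TRUE)

# Sample scores on the first two axes, one row per sample
pc$points

# Share of the positive eigenvalue total captured by each axis
rel_eig <- pc$eig[1:2] / sum(pc$eig[pc$eig > 0])
rel_eig

# With ape installed, ape::pcoa(x1) is the call named in this thread.
```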


mweand commented 3 years ago

Hi, thanks for the rapid response! I will give that a shot!



ronenliberman commented 3 years ago

Hi,

I followed this thread and tried to continue from where it left off. I have a UniFrac distance file. I imported it into R and converted it to a "dist" object using:

df = read.table("..")
x1 <- as.dist(as(df, "matrix"))

Then I continued with:

t_pcoa = ape::pcoa(x1)

So far so good. However, I can't plot it using plot_ordination, because that requires a phyloseq-class object.

Is there a way to convert the pcoa class into a phyloseq class?

Thanks