spholmes / F1000_workflow

43 stars 33 forks source link

NMDS plot for 16s amplicon #32

Open bioinfonext opened 5 years ago

bioinfonext commented 5 years ago

I am trying to plot NMDS plot from microbiome data in phyloseq and I used below command, It is working perfectly, but here I used raw otu count for NMDS plot and the rank-threshold transformation is recommended as input FOR NMDS ordinate command.

Mymensingh.NMDS.ORDINATE <- ordinate(Mymensingh_all, "NMDS", "bray")

plot_ordinattion(Mymensingh_all, ordination = Mymensingh.NMDS.ORDINATE, color="Region") + theme_minimal() + geom_point(size=5)

I have tried to do rank-threshold transformation as mention in the below publication but I am getting error, I will be thankful for your time and help.

https://f1000research.com/articles/5-1492

Error

ntaxa(mymensingh_all) [1] 44481

abund <- otu_table(mymensingh_all)

abund_ranks <- t(apply(abund, 1, rank)) abund_ranks <- abund_ranks - 20000 abund_ranks[abund_ranks < 1] <- 1

prev.ordNMDS_rank <- ordinate(abund_ranks, "NMDS", "bray")

Error in ordinate(abund_ranks, "NMDS", "bray") : Expected a phyloseq object or otu_table object.

spholmes commented 5 years ago

abund_ranks in now an ordinary matrix on which you can you apply a standard NMDS or cMDS protocol without going through the phyloseq wrapper for ordination. Bray Curtis distances do not make sense on ranks, a plain l1 distance or l2 distance could work (see Chapter 9 of the MSMB book for details).

On Thu, May 30, 2019, 11:07 yogeshgupt notifications@github.com wrote:

I am trying to plot NMDS plot from microbiome data in phyloseq and I used below command, It is working perfectly, but here I used raw otu count for NMDS plot and the rank-threshold transformation is recommended as input FOR NMDS ordinate command.

Mymensingh.NMDS.ORDINATE <- ordinate(Mymensingh_all, "NMDS", "bray")

plot_ordinattion(Mymensingh_all, ordination = Mymensingh.NMDS.ORDINATE, color="Region") + theme_minimal() + geom_point(size=5)

I have tried to do rank-threshold transformation as mention in the below publication but I am getting error, I will be thankful for your time and help.

https://f1000research.com/articles/5-1492

Error

ntaxa(mymensingh_all) [1] 44481

abund <- otu_table(mymensingh_all)

abund_ranks <- t(apply(abund, 1, rank)) abund_ranks <- abund_ranks - 20000 abund_ranks[abund_ranks < 1] <- 1

prev.ordNMDS_rank <- ordinate(abund_ranks, "NMDS", "bray")

Error in ordinate(abund_ranks, "NMDS", "bray") : Expected a phyloseq object or otu_table object.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/spholmes/F1000_workflow/issues/32?email_source=notifications&email_token=AAJFZPKFKFFYS7W2TNMJPTLPYAJULA5CNFSM4HRICMZ2YY3PNVWWK3TUL52HS4DFUVEXG43VMWVGG33NNVSW45C7NFSM4GWZDC5Q, or mute the thread https://github.com/notifications/unsubscribe-auth/AAJFZPPOWAKCROGCRT2A46LPYAJULANCNFSM4HRICMZQ .

bioinfonext commented 5 years ago

abund_ranks in now an ordinary matrix on which you can you apply a standard NMDS or cMDS protocol without going through the phyloseq wrapper for ordination. Bray Curtis distances do not make sense on ranks, a plain l1 distance or l2 distance could work (see Chapter 9 of the MSMB book for details). On Thu, May 30, 2019, 11:07 yogeshgupt @.***> wrote: I am trying to plot NMDS plot from microbiome data in phyloseq and I used below command, It is working perfectly, but here I used raw otu count for NMDS plot and the rank-threshold transformation is recommended as input FOR NMDS ordinate command. Mymensingh.NMDS.ORDINATE <- ordinate(Mymensingh_all, "NMDS", "bray") plot_ordinattion(Mymensingh_all, ordination = Mymensingh.NMDS.ORDINATE, color="Region") + theme_minimal() + geom_point(size=5) I have tried to do rank-threshold transformation as mention in the below publication but I am getting error, I will be thankful for your time and help. https://f1000research.com/articles/5-1492 Error ntaxa(mymensingh_all) [1] 44481 abund <- otu_table(mymensingh_all) abund_ranks <- t(apply(abund, 1, rank)) abund_ranks <- abund_ranks - 20000 abund_ranks[abund_ranks < 1] <- 1 prev.ordNMDS_rank <- ordinate(abund_ranks, "NMDS", "bray") Error in ordinate(abund_ranks, "NMDS", "bray") : Expected a phyloseq object or otu_table object. — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#32?email_source=notifications&email_token=AAJFZPKFKFFYS7W2TNMJPTLPYAJULA5CNFSM4HRICMZ2YY3PNVWWK3TUL52HS4DFUVEXG43VMWVGG33NNVSW45C7NFSM4GWZDC5Q>, or mute the thread https://github.com/notifications/unsubscribe-auth/AAJFZPPOWAKCROGCRT2A46LPYAJULANCNFSM4HRICMZQ .

Dear Prof. Susan,

Thanks a lot for your time and response: I tried to understand it but I am confused, it will be a great help if you could share the further code.

alternatively, I am thinking to use this normalize method from vegan; abund_table_norm <- decostand(abund_table, "normalize")

.abund_table_norm.NMDS.ORDINATE <- ordinate(abund_table_norm, "NMDS", "bray")

plot_ordinattion(abund_table_norm, ordination=,abund_table_norm.NMDS.ORDINATE color="Region") + theme_minimal() + geom_point(size=5)
Thanks Yogesh

krisrs1128 commented 5 years ago

Hi Yogesh,

I think the point is that abund_ranks should be a matrix type (you can check this using str(abund_ranks). In this case, you can get an ordination directly from base R, without having to go through phyloseq's ordinate function. Specifically, you can probably do something like

D <- dist(abund_ranks, method = 'manhattan')
coords <- cmdscale(D)
plot(coords)

Here, D computes a distance object, giving the l1 distance between samples, and the second line extracts coordinates from the multidimensional scaling. You can of course turn coords into a data.frame with additional attributes (like "Region"), and then plot it with ggplot2, if you'd like.

Best Wishes, Kris

bioinfonext commented 5 years ago

Thanks a lot Kris, It is a great help. Now it clear to me. I will try this.

Could you please also confirm me how to decide the correct taxa number in abund_ranks command, I do have total 44481 taxa and I just use randomly 20000, is there any way to find the exact value here or just any number we can put?

abund_ranks <- abund_ranks - 20000

ntaxa(mymensingh_all) [1] 44481

krisrs1128 commented 5 years ago

Hi Yogesh,

There is no "automatic" answer for this (it's probably better to build intuition about the influence of different choices, anyways), but you can refer to the last paragraph here for some discussion: https://github.com/spholmes/F1000_workflow/issues/21

Best Wishes, Kris