microbiome / miaViz

Microbiome Analysis Plotting and Visualization
https://microbiome.github.io/miaViz
Artistic License 2.0
10 stars 12 forks source link

plotRDA #87

Closed TuomasBorman closed 1 year ago

TuomasBorman commented 1 year ago

First version of plotRDA method. Any thoughts?

data(enterotype)
tse <- enterotype

tse <- tse[, complete.cases(colData(tse))]
tse <- runRDA(tse, data ~ Nationality + Gender, na.action = na.omit)

plotRDA(tse, dimred = "RDA", point_size = 4, colour_by = "Gender")
TuomasBorman commented 1 year ago

@RiboRings Sorry, I just realized that I accidentally overwrote your branch... --> If you have time, you could develop this further (update document and checks and make the code better...)

antagomir commented 1 year ago

Seems promising to me at least.

RiboRings commented 1 year ago

Ok I'm playing with it

RiboRings commented 1 year ago

I'm wondering why plotReducedDim only shows a limited number of samples, even when there is no missing data. And when I look into the reducedDim object all samples have coordinates and therefore they should also appear in the plot. For example:

library(mia)
library(miaViz)

data("Tengeler2020", package = "mia")
tse <- Tengeler2020

dim(tse)
[1] 151  27

tse <- transformAssay(tse, method = "relabundance")

any(is.na(tse$patient_status))
[1] FALSE
any(is.na(tse$cohort))
[1] FALSE

tse <- runRDA(tse,
              assay.type = "relabundance",
              formula = assay ~ patient_status + cohort,
              distance = "bray",
              na.action = na.exclude)

reducedDim(tse, "RDA")

          dbRDA1       dbRDA2      dbRDA3
A110 -0.17355472 -0.259042261 -0.14200871
A12  -0.17355472 -0.259042261 -0.14200871
A15  -0.17355472 -0.259042261 -0.14200871
A19  -0.17355472 -0.259042261 -0.14200871
A21  -0.28415415  0.134917760  0.04258584
A23  -0.28415415  0.134917760  0.04258584
A25  -0.28415415  0.134917760  0.04258584
A28  -0.28415415  0.134917760  0.04258584
A29  -0.28415415  0.134917760  0.04258584
A34  -0.01714403 -0.172125406  0.32197553
A36  -0.01714403 -0.172125406  0.32197553
A37  -0.01714403 -0.172125406  0.32197553
A39  -0.01714403 -0.172125406  0.32197553
A111  0.16195220 -0.084319200 -0.23198014
A13   0.16195220 -0.084319200 -0.23198014
A14   0.16195220 -0.084319200 -0.23198014
A16   0.16195220 -0.084319200 -0.23198014
A17   0.16195220 -0.084319200 -0.23198014
A18   0.16195220 -0.084319200 -0.23198014
A210  0.05135277  0.309640821 -0.04738559
A22   0.05135277  0.309640821 -0.04738559
A24   0.05135277  0.309640821 -0.04738559
A26   0.05135277  0.309640821 -0.04738559
A27   0.05135277  0.309640821 -0.04738559
A33   0.31836290  0.002597655  0.23200410
A35   0.31836290  0.002597655  0.23200410
A38   0.31836290  0.002597655  0.23200410

plotReducedDim(tse, "RDA")

Screenshot 2023-08-28 at 10 40 51

RiboRings commented 1 year ago

Apparently the reason is that most points are assigned identical coordinates and therefore they overlap in the plot. Is that normal?

TuomasBorman commented 1 year ago

Nice!!

I'm wondering why plotReducedDim only shows

RDA returns (rda_object$CCA) returns different site scores. Currently, runRDA returns (Weighted) orthonormal site scores (u), but I guess weighted sums (wa) is more more comparable with other ordination methods. --> we still have to discuss this, see https://github.com/microbiome/mia/pull/422

To overcome this issue while developing this, you can do

rda <- reducedDim(tse)

att <- attributes(rda)

rda <- att$rda$CCA$wa

attributes(rda) <- att

reducedDim(tse) <- rda
RiboRings commented 1 year ago

Thanks! Indeed the wa values seem more meaningful when it comes to plotting.

RiboRings commented 1 year ago

Should we make the colours of dots and ellipses consistent? RIght now it's a bit tricky to see which dots belong to which ellipse.

Screenshot 2023-08-28 at 11 44 42

TuomasBorman commented 1 year ago

Should we make the colours of dots and ellipses consistent? RIght now it's a bit tricky to see which dots belong to which ellipse.

Screenshot 2023-08-28 at 11 44 42

Yes, ggplot_build function or something can probably help to get those colors (take into account that sometimes there is no colours at all, colour_by is not specified)

antagomir commented 1 year ago

You could also look at microshades for accessible color palettes that were suggested for microbiome studies. Seems to support phyloseq but perhaps it is straigfwd to use this also with TreeSE?

RiboRings commented 1 year ago

I haven't managed yet to match dot and ellipse colours, because dot are using the scater scale which is manually defined. Maybe I'll try to use guides.

Anyway, I think that plot without filling looks better and already matches colours, so I would suggest setting ellipse_fill = FALSE by default.

Screenshot 2023-08-28 at 14 57 45

antagomir commented 1 year ago

You could also check inspiration from microViz see issue #89

TuomasBorman commented 1 year ago

Also https://ggplot2.tidyverse.org/reference/aes_eval.html, after_scale(color) might help

RiboRings commented 1 year ago

The colour thing is good to go now. Do you want me to add some optional arguments for these items?

# TODO: Ellipse: fill or just an edge? --> Edge line type?
    # TODO: vector: vector line type, vector arrow size and line bulkiness
    # TODO: vector labels: Adjust labels position, turn off ggrepel, text color, text size, text label?, parse off?
    # --> catch all geomrepel parameters (there will be a warning if we try to feed there parameter that doesbeong to plotReducedDim for example)
TuomasBorman commented 1 year ago

The colour thing is good to go now. Do you want me to add some optional arguments for these items?

# TODO: Ellipse: fill or just an edge? --> Edge line type?
    # TODO: vector: vector line type, vector arrow size and line bulkiness
    # TODO: vector labels: Adjust labels position, turn off ggrepel, text color, text size, text label?, parse off?
    # --> catch all geomrepel parameters (there will be a warning if we try to feed there parameter that doesbeong to plotReducedDim for example)

Cool, first line is already done. But for those others you could think which option is useful to have. I think at least those are something that the user might want to adjust but there is probably other things also.

RiboRings commented 1 year ago

--> turn off ggrepel Is there a way to turn off ggrepel itself, or do we have to make a conditional and use geom_text instead of geom_text_repel when repel_text = FALSE?

--> catch all ggrepel parameters Should we limit ourselves to the most useful parameters, or is it better to define args for all ggrepel params?

TuomasBorman commented 1 year ago

Not sure, but I think this is only option

geom_text instead of geom_text_repel when repel_text = FALSE

The problem is that ggrepel leads to an error if other than ggrepel parameters are fed to it. I think there are not too many parameters to not catch them all (but other thoughts are welcome)

RiboRings commented 1 year ago

Ok the ggrepel thing is done. Next I saw this comment:

# TODO: Catch all plotReducedDim parameters and feed them to plotReducedDim
    # (plotReducedDim cannot accept unknown parameteers, ... is not possibel to use)

What do you mean it is not possible to use? Because if for example I mispell clor_by, the function still works but doesn't colour the dots, so you can use it if you know which args plotReducedDim takes.

TuomasBorman commented 1 year ago

Ok the ggrepel thing is done. Next I saw this comment:

# TODO: Catch all plotReducedDim parameters and feed them to plotReducedDim
    # (plotReducedDim cannot accept unknown parameteers, ... is not possibel to use)

What do you mean it is not possible to use? Because if for example I mispell clor_by, the function still works but doesn't colour the dots, so you can use it if you know which args plotReducedDim takes.

Ok, that is probably just my mistake --> so just feed parameters with ...

TuomasBorman commented 1 year ago

ahh sorry, pushed to wrong branch...

RiboRings commented 1 year ago

The plotRDA function is ready now. I tried to update its vignettes but pkgdown::build_site() seems to fail:

Reading 'man/mia-datasets.Rd'
Reading 'man/mia-plot-args.Rd'
Reading 'man/miaViz-package.Rd'
Reading 'man/plotAbundance.Rd'
Reading 'man/plotAbundanceDensity.Rd'
Reading 'man/plotColTile.Rd'
Reading 'man/plotDMN.Rd'
Reading 'man/plotGraph.Rd'
Reading 'man/plotPrevalence.Rd'
Reading 'man/plotRDA.Rd'
Reading 'man/plotSeries.Rd'
Reading 'man/plotTree.Rd'
Reading 'man/treeData.Rd'
Writing 'reference/treeData.html'
-- Building articles -----------------------------------------------------------
Reading 'vignettes/miaViz.Rmd'
-- RMarkdown error -------------------------------------------------------------

Quitting from lines 36-39 [unnamed-chunk-2] (miaViz.Rmd)
Warning messages:
1: package ‘BiocStyle’ was built under R version 4.3.1 
2: In grSoftVersion() :
  unable to load shared object '/Library/Frameworks/R.framework/Resources/modules//R_X11.so':
  dlopen(/Library/Frameworks/R.framework/Resources/modules//R_X11.so, 6): Library not loaded: /opt/X11/lib/libSM.6.dylib
  Referenced from: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/modules/R_X11.so
  Reason: image not found
3: In cairoVersion() :
  unable to load shared object '/Library/Frameworks/R.framework/Resources/library/grDevices/libs//cairo.so':
  dlopen(/Library/Frameworks/R.framework/Resources/library/grDevices/libs//cairo.so, 6): Library not loaded: /opt/X11/lib/libXrender.1.dylib
  Referenced from: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/library/grDevices/libs/cairo.so
  Reason: image not found
4: In (function (filename = "Rplot%03d.png", width = 480, height = 480,  :
  failed to load cairo DLL
--------------------------------------------------------------------------------
Warning messages:
1: package ‘MatrixGenerics’ was built under R version 4.3.1 
2: package ‘IRanges’ was built under R version 4.3.1 
3: package ‘GenomeInfoDb’ was built under R version 4.3.1 
4: package ‘MultiAssayExperiment’ was built under R version 4.3.1 
Error: 
! in callr subprocess.
Caused by error in `map(.x, .f, ..., .progress = .progress)`:
! In index: 1.
ℹ See `$stdout` and `$stderr` for standard output and error.
Type .Last.error to see the more details.
RiboRings commented 1 year ago

It might be just because the Cairo package does not work on Mac, does it build for you?

TuomasBorman commented 1 year ago

That might be the problem, vignettes build just fine in my computer

The method looks robust --> now we need plotCCA alias and documents need to be updated. In case of plotRecucedDim and geom_repel paramaters, you can just add tem to additional paramaters and say that ... is fed to these functions

RiboRings commented 1 year ago

I updated the docs, but I am not sure what they look like because I can't compile them in my laptop.

I noticed that the method works only for se/tse objects and not for rda/cca objects, but maybe we can add the rda/cca method at some later point?

TuomasBorman commented 1 year ago

I can check when time allows

I noticed that the method works only for se/tse objects and not for rda/cca objects, but maybe we can add the rda/cca method at some later point?

Actually, my plan was to also implement this so that it works with calculateRDA and rda object. Because you also ended up thinking this, we could implement it

My suggestion:

Because we want that these plots looks similar despite what is the input class

  1. Create TreeSE from rda object. (Get sample names --> create an empty assay --> add reducedDim)
  2. Plot this TreeSE --> with calculateRDA input you could keep significance parameters as they are but with rda object input, disable significance parameter since they are not included

--> Make this work with as little code as possible so that the "real workhorse" is the code that is already implemented

RiboRings commented 1 year ago

Everything ready for your final check.

RiboRings commented 1 year ago

I think it can be merged now

TuomasBorman commented 1 year ago

I think this is very nice

Do you have something to add @antagomir

antagomir commented 1 year ago

Very good. Once this is merged then make sure OMA examples are also updated.

TuomasBorman commented 1 year ago

Forgot to say, you could still add tests to https://github.com/microbiome/miaViz/tree/master/tests/testthat

RiboRings commented 1 year ago

I thought the examples in the vignettes were working as tests. Or should I add tests to https://github.com/microbiome/miaViz/tree/master/tests/testthat anyway?

TuomasBorman commented 1 year ago

Yes, every time we create or modify a function, we should create also tests that test that method is working as expected. The right place is the tests/testthat folder (similarly to other methods).

That is probably the most boring part but ensures that everything is working (and also if we do some modifications later [or if there are changes in dependencies], we get an alert if the functionality of method is changed)

Tests should include

RiboRings commented 1 year ago

Hi! I finished writing some tests.

Not really sure what this error in GH Actions is all about:

Run Rscript -e "remotes::install_github('r-hub/sysreqs')"
   File /usr/local/lib/R/etc//Renviron.site contains invalid line(s)
      <html>
      <head><title>301 Moved Permanently</title></head>
      <body>
      <center><h1>301 Moved Permanently</h1></center>
      <hr><center>CloudFront</center>
      </body>
      </html>
   They were ignored
Using github PAT from envvar GITHUB_PAT
Skipping install of 'sysreqs' from a github remote, the SHA1 (f068afa9) has not changed since last install.
  Use `force = TRUE` to force installation
/bin/bash: eval: line 12: syntax error near unexpected token `('
/bin/bash: eval: line 12: `   File /usr/local/lib/R/etc//Renviron.site contains invalid line(s)      <html>      <head><title>301 Moved Permanently</title></head>      <body>      <center><h1>301 Moved Permanently</h1></center>      <hr><center>CloudFront</center>      </body>      </html>   They were ignoredexport DEBIAN_FRONTEND=noninteractive; apt-get -y update && apt-get install -y pandoc make libxml2-dev libicu-dev libgmp-dev libglpk-dev'
Error: Process completed with exit code 2.

It looks external or?

TuomasBorman commented 1 year ago

Same is in mia. I don't know what is the problem but it is not related to our packages --> it is related to the environment where these tests are run

Let's see what to do

TuomasBorman commented 1 year ago

I might have to merge this to master since I need to use this function today in one demo and it probably would be too complicated to say to load the package from branch... I open an issue about this

TuomasBorman commented 1 year ago

btw superb work!!!!

TuomasBorman commented 1 year ago

I ran the checks locally, test-plotCCA.R:48:3 seems to give an error

TuomasBorman commented 1 year ago

I ran the checks locally, test-plotCCA.R:48:3 seems to give an error

fixed

RiboRings commented 1 year ago

Thanks no problem