gavinsimpson / ggvegan

ggplot-based plots for vegan
https://gavinsimpson.github.io/ggvegan/
GNU General Public License v2.0

options to improve CCA autoplot in ggvegan #18

Closed balathl closed 6 years ago

balathl commented 6 years ago

Hi,

I am trying to plot a CCA with the sample "sites" coloured by their site information. I could not find an option for this in the autoplot function in ggvegan. I want to make sure it is not available here before I try other tools. Thanks in advance!

Below is my code:

    install.packages("devtools")
    devtools::install_github("biom", "joey711")
    library(biom)
    library(vegan)
    otus.biom <- read_biom('otu.table.json.fmt.biom')
    otus <- as.matrix(biom_data(otus.biom))
    otus <- t(otus)
    map <- read.csv('DNA.dis.microbiol.txt', sep='\t', header=T, row.names=1)
    common.ids <- intersect(rownames(map), rownames(otus))
    otus <- otus[common.ids,]
    map <- map[common.ids,]
    my.ca <- cca(otus)
    plot(my.ca)
    my.cca <- cca(otus ~ AOC + MAP + HPC + VLP + DAPI, data=map, na.action=na.exclude)
    devtools::install_github("gavinsimpson/ggvegan")
    library("ggvegan")
    autoplot(my.cca)

Basically, what I am looking for in the ggvegan package is an option to use the "units" column in the mapping file below to colour the sites. I would also like to reduce the size of the "species".

    head(map,100)
    coding units AOC MAP HPC VLP DAPI
    1B06_DNA_Co_C_1_wk1 1B06.DNA.Co.C.1.wk1 C 127 0.04 5 7140000 120000
    1C04_DNA_Co_C_1_wk2 1C04.DNA.Co.C.1.wk2 C 126 0.04 5 6700000 77000
    2B06_DNA_Co_C_1_wk1 2B06.DNA.Co.C.1.wk1 C 210 0.1 30 8000000 92000
    2C03_DNA_Co_C_1_wk2 2C03.DNA.Co.C.1.wk2 C 151 0.04 20 8200000 120000
    3B12_DNA_Co_C_1_wk1 3B12.DNA.Co.C.1.wk1 C 168 0.73 20 390000 78000
    3C04_DNA_Co_C_1_wk2 3C04.DNA.Co.C.1.wk2 C 152 0.69 40 2600000 71000
    4G02_DNA_Co_C_1_wk2 4G02.DNA.Co.C.1.wk2 C 90 0.09 350 5900000 110000
    1B07_DNA_Co_C_2_wk1 1B07.DNA.Co.C.2.wk1 C 198 0.1 5 NA 67000
    1C05_DNA_Co_C_2_wk2 1C05.DNA.Co.C.2.wk2 C 171 0.04 10 7400000 41000
    2B07_DNA_Co_C_2_wk1 2B07.DNA.Co.C.2.wk1 C 193 0.1 30 9000000 98000
    2C04_DNA_Co_C_2_wk2 2C04.DNA.Co.C.2.wk2 C 137 0.04 110 9300000 90000
    3C01_DNA_Co_C_2_wk1 3C01.DNA.Co.C.2.wk1 C 140 0.58 20 3700000 11000
    3C05_DNA_Co_C_2_wk2 3C05.DNA.Co.C.2.wk2 C 125 0.77 70 400000 62000
    4F06_DNA_Co_C_2_wk1 4F06.DNA.Co.C.2.wk1 C 92 0.18 5 4500000 81000
    4G03_DNA_Co_C_2_wk2 4G03.DNA.Co.C.2.wk2 C 89 0.07 110 4900000 48000
    1C01_DNA_Co_C_3_wk1 1C01.DNA.Co.C.3.wk1 C 126 0.04 5 8000000 63000
    1C07_DNA_Co_C_3_wk2 1C07.DNA.Co.C.3.wk2 C 254 0.04 10 6000000 55000
    2B09_DNA_Co_C_3_wk1 2B09.DNA.Co.C.3.wk1 C 154 0.04 70 8300000 120000
    2C06_DNA_Co_C_3_wk2 2C06.DNA.Co.C.3.wk2 C 146 0.19 150 6700000 165000
    3C03_DNA_Co_C_3_wk1 3C03.DNA.Co.C.3.wk1 C 173 0.52 240 180000 62000
    3C07_DNA_Co_C_3_wk2 3C07.DNA.Co.C.3.wk2 C 93 0.88 1100 460000 74000
    4F08_DNA_Co_C_3_wk1 4F08.DNA.Co.C.3.wk1 C 84 0.04 70 1200000 76000
    1B09_DNA_Co_D_1_wk1 1B09.DNA.Co.D.1.wk1 D 117 0.1 1200 960000 69000
    1C08_DNA_Co_D_1_wk2 1C08.DNA.Co.D.1.wk2 D 187 0.1 1200 1500000 36000
    2B10_DNA_Co_D_1_wk1 2B10.DNA.Co.D.1.wk1 D 196 0.1 1100 950000 20000
    2C07_DNA_Co_D_1_wk2 2C07.DNA.Co.D.1.wk2 D 108 0.16 1000 1000000 23000
    3B08_DNA_Co_D_1_wk1 3B08.DNA.Co.D.1.wk1 D 118 0.36 1900 2800000 81000
    3C08_DNA_Co_D_1_wk2 3C08.DNA.Co.D.1.wk2 D 96 0.18 6700 3100000 72000
    4F10_DNA_Co_D_1_wk1 4F10.DNA.Co.D.1.wk1 D 58 0.1 1500 4800000 26000
    4H02_DNA_Co_D_1_wk2 4H02.DNA.Co.D.1.wk2 D NA 2100 1600000 35000
    1B10_DNA_Co_D_2_wk1 1B10.DNA.Co.D.2.wk1 D 128 0.1 160 730000 51000
    1C09_DNA_Co_D_2_wk2 1C09.DNA.Co.D.2.wk2 D 186 0.1 150 850000 27000
    2B11_DNA_Co_D_2_wk1 2B11.DNA.Co.D.2.wk1 D 240 0.1 1100 1100000 15000
    2C08_DNA_Co_D_2_wk2 2C08.DNA.Co.D.2.wk2 D 123 0.09 510 1000000 20000
    3B09_DNA_Co_D_2_wk1 3B09.DNA.Co.D.2.wk1 D 120 0.43 1000 2300000 60000
    3C09_DNA_Co_D_2_wk2 3C09.DNA.Co.D.2.wk2 D 139 0.16 1400 2500000 56000
    4F11_DNA_Co_D_2_wk1 4F11.DNA.Co.D.2.wk1 D 95 0.07 1100 1800000 29000
    4H03_DNA_Co_D_2_wk2 4H03.DNA.Co.D.2.wk2 D NA 240 990000 30000
    1B12_DNA_Co_D_3_wk1 1B12.DNA.Co.D.3.wk1 D 103 0.1 550 740000 40000
    1C11_DNA_Co_D_3_wk2 1C11.DNA.Co.D.3.wk2 D 175 0.1 470 950000 46000
    2C01_DNA_Co_D_3_wk1 2C01.DNA.Co.D.3.wk1 D 257 0.1 1600 990000 33000
    2C10_DNA_Co_D_3_wk2 2C10.DNA.Co.D.3.wk2 D 78 0.18 1600 1200000 42000
    3B11_DNA_Co_D_3_wk1 3B11.DNA.Co.D.3.wk1 D 137 0.49 2200 2600000 63000
    3C11_DNA_Co_D_3_wk2 3C11.DNA.Co.D.3.wk2 D 174 0.47 5100 2000000 54000
    4G01_DNA_Co_D_3_wk1 4G01.DNA.Co.D.3.wk1 D 66 0.14 1000 1600000 20000
    4H05_DNA_Co_D_3_wk2 4H05.DNA.Co.D.3.wk2 D NA 500 1400000 20000
    3C12_DNA_Co_E_1_wk1 3C12.DNA.Co.E.1.wk1 E 74 5 20 NA 17000
    1D01_DNA_Co_E_2_wk1 1D01.DNA.Co.E.2.wk1 E 64 3 50 380000 23000
    1D05_DNA_Co_E_2_wk2 1D05.DNA.Co.E.2.wk2 E 63 4 90 730000 34000
    2C12_DNA_Co_E_2_wk1 2C12.DNA.Co.E.2.wk1 E 9 4 20 710000 41000
    3D06_DNA_Co_E_2_wk2 3D06.DNA.Co.E.2.wk2 E 96 6 100 510000 41000
    4G06_DNA_Co_E_2_wk1 4G06.DNA.Co.E.2.wk1 E 52 5 70 540000 50000

Let me know the possibilities! Thanks in advance for the help.

Best Regards, Bala

gavinsimpson commented 6 years ago

I don't think autoplot() methods for ordinations will ever have the level of control you are looking for. Instead you should do this using fortify(). At the moment the fortify() method for objects of class "cca" will get you a tidy data frame of scores. What you need to do is add the units column from your mapping data to that data frame of scores. Typically this would be done with the data argument, but I haven't implemented this yet in ggvegan.

What I suggest therefore is that you extract the scores using fortify(my.cca, display = "sites") and then, after checking that your OTU data are in the same row order as the units, just cbind() or dplyr::bind_cols() the units vector onto the data frame of scores.
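A minimal sketch of that workflow, under some loudly stated assumptions: vegan's built-in dune and dune.env data stand in for the OTU table and mapping file, and the Management factor plays the role of the units column. Column names returned by fortify() may differ between ggvegan versions, so the sketch relies only on the axis columns CCA1 and CCA2. The point sizes are arbitrary illustrations of shrinking the species relative to the sites.

```r
library(vegan)
library(ggplot2)
library(ggvegan)

# Stand-in data shipped with vegan (assumption: replaces otus / map)
data(dune, dune.env)
ord <- cca(dune ~ A1 + Management, data = dune.env)

# Tidy data frames of scores, one per score type
site_scrs <- fortify(ord, display = "sites")
spp_scrs  <- fortify(ord, display = "species")

# Scores should be in the same row order as the input data; at least
# confirm the row counts agree before binding the grouping column
stopifnot(nrow(site_scrs) == nrow(dune.env))
site_scrs <- cbind(site_scrs, Management = dune.env$Management)

# Small grey points for species, larger coloured points for sites
p <- ggplot() +
  geom_point(data = spp_scrs, aes(x = CCA1, y = CCA2),
             size = 0.8, colour = "grey50") +
  geom_point(data = site_scrs, aes(x = CCA1, y = CCA2, colour = Management),
             size = 2.5) +
  coord_fixed()
p
```

For the data in this issue, the same pattern would use my.cca in place of ord and map$units in place of dune.env$Management. Because the model was fitted with na.action = na.exclude, it is worth checking that the number of rows returned by fortify() still equals nrow(map) before binding.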