HCBravoLab / metagenomeSeq

Statistical analysis for sparse high-throughput sequencing

cumNorm wants a matrix or MRexperiment as input, but will not accept either of them #76

Closed sklasek closed 4 years ago

sklasek commented 4 years ago

Hello, I'm getting the runaround when trying to run cumNorm. It wants either a matrix or an MRexperiment as input, but will not accept either:

mr.r3.rhizo <- newMRexperiment(otu_table(r3.rhizo))
mr.r3.rhizo # successfully creates MRexperiment
is.matrix(otu_table(r3.rhizo)) # TRUE

r3.rhizo.css <- cumNorm(mr.r3.rhizo)
# Error in returnAppropriateObj(obj, norm = FALSE, log = FALSE) :
#   Object needs to be either a MRexperiment object or matrix

r3.rhizo.css <- cumNorm(otu_table(r3.rhizo))
# Error in cumNorm(otu_table(r3.rhizo)) : Object needs to be a MRexperiment object

I've tried transposing my matrix, and this did not help. I'm running R 3.6.2 in Rstudio, using metagenomeSeq 1.28.0. Any advice would be appreciated! Thanks.

dombraccia commented 4 years ago

A couple of things:

  1. Could you send the output of running sessionInfo() in the console?

  2. Could you check that cumNorm works with the test data provided by metagenomeSeq:

data(mouseData)
mouseData <- cumNorm(mouseData)
head(normFactors(mouseData))

There could be a bug affecting cumNorm in v1.28.0. Try updating to the newest version of metagenomeSeq (v1.29.1) and see whether that resolves the error. If none of this works, we may need to look more closely at the data you are giving newMRexperiment().

sklasek commented 4 years ago

Hi dombraccia,

Here's what I see for the head of mouseData:

PM1:20080107 PM1:20080108 PM1:20080114 PM1:20071211 PM1:20080121 PM1:20071217
         152          229          257          221          246          258

It appears I can't install the development version of metagenomeSeq (v1.29.1) without updating R to 4.0. I'm hesitant to do that and rerun some really long chunks of code. Below is my session info:

`R version 3.6.2 (2019-12-12)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default BLAS: /automounts/bioware/bioware/linuxOpteron/R-3.6.2/lib64/R/lib/libRblas.so LAPACK: /automounts/bioware/bioware/linuxOpteron/R-3.6.2/lib64/R/lib/libRlapack.so

locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages: [1] parallel stats graphics grDevices utils datasets methods base

other attached packages: [1] phyloseq_1.30.0 metagenomeSeq_1.28.0 RColorBrewer_1.1-2 glmnet_3.0-2 Matrix_1.2-18 limma_3.42.0 Biobase_2.46.0
[8] BiocGenerics_0.32.0

loaded via a namespace (and not attached): [1] nlme_3.1-143 bitops_1.0-6 matrixStats_0.55.0 bit64_0.9-7 GenomeInfoDb_1.22.0
[6] tools_3.6.2 backports_1.1.5 R6_2.4.1 vegan_2.5-6 KernSmooth_2.23-16
[11] rpart_4.1-15 Hmisc_4.3-0 DBI_1.1.0 lazyeval_0.2.2 mgcv_1.8-31
[16] colorspace_1.4-1 permute_0.9-5 ade4_1.7-13 nnet_7.3-12 tidyselect_0.2.5
[21] gridExtra_2.3 DESeq2_1.26.0 bit_1.1-14 compiler_3.6.2 fdrtool_1.2.15
[26] htmlTable_1.13.3 DelayedArray_0.12.2 slam_0.1-47 caTools_1.17.1.3 scales_1.1.0
[31] checkmate_1.9.4 genefilter_1.68.0 stringr_1.4.0 digest_0.6.23 foreign_0.8-74
[36] XVector_0.26.0 base64enc_0.1-3 jpeg_0.1-8.1 pkgconfig_2.0.3 htmltools_0.4.0
[41] lpsymphony_1.14.0 htmlwidgets_1.5.1 rlang_0.4.2 rstudioapi_0.10 RSQLite_2.2.0
[46] shape_1.4.4 IHW_1.14.0 jsonlite_1.6 gtools_3.8.1 BiocParallel_1.20.1
[51] acepack_1.4.1 dplyr_0.8.3 RCurl_1.95-4.12 magrittr_1.5 GenomeInfoDbData_1.2.2
[56] Formula_1.2-3 biomformat_1.14.0 Rcpp_1.0.3 munsell_0.5.0 S4Vectors_0.24.1
[61] Rhdf5lib_1.8.0 ape_5.3 lifecycle_0.1.0 stringi_1.4.5 MASS_7.3-51.5
[66] SummarizedExperiment_1.16.1 zlibbioc_1.32.0 gplots_3.0.1.1 rhdf5_2.30.1 plyr_1.8.5
[71] grid_3.6.2 blob_1.2.0 gdata_2.18.0 crayon_1.3.4 lattice_0.20-38
[76] Biostrings_2.54.0 splines_3.6.2 multtest_2.42.0 annotate_1.64.0 locfit_1.5-9.1
[81] zeallot_0.1.0 knitr_1.26 pillar_1.4.3 igraph_1.2.4.2 GenomicRanges_1.38.0
[86] Wrench_1.4.0 geneplotter_1.64.0 reshape2_1.4.3 codetools_0.2-16 stats4_3.6.2
[91] XML_3.98-1.20 glue_1.3.1 latticeExtra_0.6-29 BiocManager_1.30.10 data.table_1.12.8
[96] png_0.1-7 vctrs_0.2.1 foreach_1.4.7 gtable_0.3.0 purrr_0.3.3
[101] assertthat_0.2.1 ggplot2_3.2.1 xfun_0.11 xtable_1.8-4 survival_3.1-8
[106] tibble_2.1.3 iterators_1.0.12 memoise_1.1.0 AnnotationDbi_1.48.0 IRanges_2.20.1
[111] cluster_2.1.0`

dombraccia commented 4 years ago

As far as I can tell, the latest release of R is 3.6.2, so it appears you are up to date. Could you share the error report you get when you try to update metagenomeSeq?

Running:

BiocManager::install("HCBravoLab/metagenomeSeq")

should update the package for you automatically.

sklasek commented 4 years ago

Ah ok, thanks for the command. I wasn't sure how to install from GitHub. However, with v1.29.1 I get exactly the same error messages and the same output when running head(normFactors(mouseData)). If it helps, I've attached a .csv of the OTU table I'm working with.

mysterious.otu.table.txt

dombraccia commented 4 years ago

@sklasek I downloaded the attached "mysterious.otu.table.txt", then successfully created an MRexperiment object and ran cumNorm with the following code:

library(metagenomeSeq)
library(phyloseq)

tmp <- read.csv("Downloads/mysterious.otu.table.txt", header = TRUE)
rownames(tmp) <- tmp[,1]
tmp <- tmp[,-1] # removing X column (now in rownames)
tmp <- t(tmp) # I am assuming the 20,796 variables are OTUs and the 
              # 283 observations are the samples, so I transposed here
class(tmp)
## [1] "matrix"

dim(tmp)
## [1] 20795   283

tmpMR <- newMRexperiment(tmp)
class(tmpMR)
## [1] "MRexperiment"
## attr(,"package")
## [1] "metagenomeSeq"

tmpMRnormed <- cumNorm(tmpMR) # runs successfully
## Default value being used. 

# checking that cumNorm worked
normFactors(tmpMR)
##         [,1]
## A022243   NA
## A022353   NA
## A022443   NA
## .
## .
## .
## S144316   NA
## S145116   NA
## S145216   NA
## attr(,"names")
##   [1] "A022243" "A022353" "A022443" "A031133" "A031243" "A031353" "A032143"
##   [8] "A032233" "A032343" "A033253" "A033343" "A033443" "A034143" "A034243"
## .
## .
## .
## [274] "S142116" "S142316" "S143116" "S143216" "S143316" "S144116" "S144216"
## [281] "S144316" "S145116" "S145216"

normFactors(tmpMRnormed)
## A022243 A022353 A022443 A031133 A031243 A031353 A032143 A032233 A032343 A033253 
##    1564    2475    2698    1523    2060    1921    1795    2039    1468    1992 
## A033343 A033443 A034143 A034243 A034333 A035133 A035243 A035343 S022153 S022253 
##    1988    2103    2517    2171    2550    2346    2482    1748    4854    3445 
## .
## .
## .
## S141116 S141216 S141316 S142116 S142316 S143116 S143216 S143316 S144116 S144216 
##    4255    4128    3594    3741    4026    4112    3896    1631    3911    4068 
## S144316 S145116 S145216 
##    3285    3355   13894 

My guess is that the problem comes from running phyloseq::otu_table on your phyloseq object. The code I gave above should be a reasonable workaround, though it may be better in the long run to figure out what otu_table is giving you.

Let me know if this workaround works on your machine.
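One way to see what otu_table is handing over is to compare is.matrix() with class(). The sketch below uses a made-up S4 class, toy_table, as a stand-in for phyloseq's otu_table (it is not part of any package). It also assumes, based only on the error messages above, that the failing check compares the object's class against "matrix" rather than calling is.matrix().

```r
library(methods)

# Hypothetical stand-in for an S4 class that extends matrix.
setClass("toy_table", contains = "matrix")

counts <- matrix(1:6, nrow = 2,
                 dimnames = list(c("OTU1", "OTU2"), c("S1", "S2", "S3")))
wrapped <- new("toy_table", counts)

# is.matrix() only looks for a two-dimensional dim attribute, so it passes.
passes_is_matrix <- is.matrix(wrapped)

# A strict class comparison (assumed here) sees "toy_table", not "matrix".
passes_class_check <- class(wrapped)[1] == "matrix"

# Coercing to the base class drops the S4 wrapper and keeps the counts.
plain <- as(wrapped, "matrix")
plain_passes_class_check <- class(plain)[1] == "matrix"

print(c(is_matrix = passes_is_matrix,
        class_check = passes_class_check,
        class_check_after_as = plain_passes_class_check))
```

If the real object behaves like this toy one, is.matrix() returning TRUE does not guarantee that a class-based check will accept it.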

sklasek commented 4 years ago

Solved. Thanks @dombraccia !

I was trying to transform the OTU table from my phyloseq object, otu_table(r3.rhizo), which I thought was a plain matrix. However, class(otu_table(r3.rhizo)) returns "otu_table" (with attr(,"package") "phyloseq") rather than "matrix". Whatever the reason, the phyloseq OTU table isn't in a format that cumNorm accepts.

I ended up reimporting the .csv file I uploaded above and converting it to a matrix. Worked fine after that.


rownames(r3.rhizo.otu) <- r3.rhizo.otu[,1] # set sample names as row names
r3.rhizo.otu <- r3.rhizo.otu[,-1] # remove the first column containing the sample names
r3.rhizo.otu <- t(r3.rhizo.otu) # transpose
class(r3.rhizo.otu) # now it's a matrix
r3.rhizo.otu.MR <- newMRexperiment(r3.rhizo.otu)
r3.rhizo.otu.css <- cumNorm(r3.rhizo.otu.MR)
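Reimporting the .csv is probably not the only route. A minimal sketch, assuming the phyloseq object is still in memory: coerce the OTU table to a plain matrix with as(), put taxa in rows, and then build the MRexperiment. The helper name otu_as_plain_matrix is made up for this example, and GlobalPatterns (a dataset shipped with phyloseq) stands in for r3.rhizo.

```r
library(phyloseq)
library(metagenomeSeq)

# Hypothetical helper, not part of phyloseq or metagenomeSeq.
otu_as_plain_matrix <- function(ps) {
  otu <- otu_table(ps)
  mat <- as(otu, "matrix")            # strip the phyloseq S4 wrapper
  if (!taxa_are_rows(otu)) {
    mat <- t(mat)                     # metagenomeSeq expects features in rows
  }
  mat
}

data(GlobalPatterns)                  # stand-in for r3.rhizo
counts <- otu_as_plain_matrix(GlobalPatterns)

mr <- newMRexperiment(counts)
mr_css <- cumNorm(mr)
head(normFactors(mr_css))
```

phyloseq also provides phyloseq_to_metagenomeSeq(), which performs a similar conversion; its documentation is worth checking before relying on it in a given workflow.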

sklasek commented 4 years ago

Thanks Domenick. I've closed the issue.

(Sent by email in reply to Domenick J. Braccia's comment of Mon, Jan 27, 2020, 9:35 PM.)