Closed ghost closed 6 years ago
What do you get when you do sampclass(xset)? If this does not show anything strange I can't help: annotateDiffreport is a function from the CAMERA package, not from xcms.
sampclass(xset)
[1] KPCI2 KPCI2 KPCI2 KPCI2 KPCI2 KPCI2
Levels: KPCI2
KPCI2 is the name of a package and also the name of the folder with the files that I used for the analysis.
These are the initial lines for the code above:
filepath <- "E:/Data/Kundai/RStudio/XCMS/MZXML/KPCI2"
files <- list.files(filepath, pattern = "mzXML", recursive = TRUE, full.names = TRUE)
xset <- xcmsSet(files)
To me it seems a little strange that you're performing a differential abundance analysis on samples from a single group/sample class. Usually you want to compare signals in samples from one group to those of another group. That might be a reason for the error, but I haven't checked it.
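The single-class situation described above can be detected before running any group comparison. The following is a minimal base-R sketch; `sample_classes` is a hypothetical stand-in for what `sampclass(xset)` returned in this thread, so no xcms object is needed to run it.

```r
# Toy stand-in for the output of sampclass(xset): six samples, one class.
sample_classes <- factor(rep("KPCI2", 6))

# Count samples per class.
print(table(sample_classes))

# A comparison between groups needs at least two distinct classes.
has_two_groups <- nlevels(sample_classes) >= 2
if (!has_two_groups) {
  message("Only one sample class found: ", levels(sample_classes))
}
```

If the check fails, the grouping has to be assigned explicitly before a differential analysis makes sense.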
I have managed to zero in on my problem. Would you be in a position to assist me in correcting my code so that I am able to generate a diff report? It seems as if the OPLS tutorial was made using the old methods from XCMS.
Pasted below is the code from the XCMS Bioconductor page, which I can easily follow. Starting from the readMSData function, how can one generate a diff report?
cdfs <- dir(system.file("cdf", package = "faahKO"), full.names = TRUE, recursive = TRUE)
pd <- data.frame(sample_name = sub(basename(cdfs), pattern = ".CDF", replacement = "", fixed = TRUE), sample_group = c(rep("KO", 6), rep("WT", 6)), stringsAsFactors = FALSE)
raw_data <- readMSData(files = cdfs, pdata = new("NAnnotatedDataFrame", pd), mode = "onDisk")
The new user interface does not provide a function to generate a diff report. Differential abundance analysis is not specific to data processed in xcms. I recommend using e.g. the limma package for the differential abundance analysis, or a simple t.test to perform the test. There is no need to reinvent the wheel by implementing such functionality in xcms when plenty of excellent options are already available.
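As an illustration of the "simple t.test" route, here is a minimal sketch that tests each feature of an intensity matrix between two groups. The matrix `intensities`, the feature names, and the group labels are simulated assumptions, not data from this thread; with real data the matrix would hold one row per feature and one column per sample.

```r
set.seed(1)

# Hypothetical design: 6 KO and 6 WT samples.
groups <- factor(rep(c("KO", "WT"), each = 6))

# Simulated feature-by-sample intensity matrix (5 features, 12 samples).
intensities <- matrix(
  rnorm(5 * 12, mean = 1000, sd = 50),
  nrow = 5,
  dimnames = list(paste0("FT", 1:5), paste0(groups, "_", 1:6))
)

# Make the first feature clearly different in the WT group.
intensities["FT1", groups == "WT"] <- intensities["FT1", groups == "WT"] + 500

# Welch t-test per feature (row), then Benjamini-Hochberg adjustment.
p_values <- apply(intensities, 1, function(x) t.test(x ~ groups)$p.value)
p_adjusted <- p.adjust(p_values, method = "BH")

print(data.frame(p_value = p_values, p_adjusted = p_adjusted))
```

For larger designs or moderated statistics, limma, as recommended above, is the more appropriate tool; this sketch only covers the two-group case.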
You can still use the old user interface to perform the analysis. Just ensure that you assign the sample groups to the xcmsSet with sampclass.
Really sorry to bother you, but I am having problems understanding what's going on. I understand your reasons for deprecating the old functions, but unfortunately the ROPLS tutorial was made using the old functions. The only reason I need a diff report is the way the function arranges the data, which makes it extremely easy to conduct statistical analyses including OPLS, PCA, etc. I have asked Etienne A. Thevenot whether he could update his methods to suit the new xcms functionality, but he has not responded. Is there a way of generating this diff report, or a way of specifying the sample groups? My current code does not work. I need the dataMatrix, sampleMetadata and variableMetadata for statistical analysis, and I do not know of a way to extract these values in the new xcms.
filepath <- "E:/Data/Kundai/RStudio/XCMS/MZXML/KPCI2"
files <- list.files(filepath, pattern = "mzXML", recursive = TRUE, full.names = TRUE)
pd <- data.frame(sample_name = sub(basename(files), pattern = ".mzXML", replacement = "", fixed = TRUE), sample_group = c(rep("Eta6", 3), rep("Eta8", 3)), stringsAsFactors = FALSE)
raw_data <- readMSData(files = files, pdata = new("NAnnotatedDataFrame", pd), mode = "onDisk")
You can still use the xcmsSet function and diffreport on the xcmsSet. The functions are not (yet) deprecated or defunct. The only thing you have to ensure is that sampclass corresponds to your sample grouping (i.e. the sample_group you define).
You can set the sampclass in your xcmsSet object with e.g. sampclass(xset) <- sample_group. Just have a look at the help page for sampclass (i.e. ?sampclass). It should be described there.
Thank you for your help. I really appreciate you taking time to help me out. I have managed to process the data as of now.
Really sorry to bother you @jotsetung , but it seems as if you are the only one who can help me and I really appreciate it, I hope I am not abusing it. Anyway, I have managed to make a data matrix but the problem is that it still contains characters and I do not know how to remove them. I asked Etienne Thevenot and he told me that the problem was from XCMS instead of ROPLS but was going to align the new XCMS with ROPLS tutorial.
Pasted below is my code
filepath <- "E:/Data/Kundai/RStudio/XCMS/MZXML/KPCI2"
files <- list.files(filepath, pattern = "mzXML", recursive = TRUE, full.names = TRUE)
pd <- data.frame(sample_name = sub(basename(files), pattern = ".mzXML", replacement = "", fixed = TRUE), sample_group = c(rep("Eta6", 3), rep("Eta8", 3)), stringsAsFactors = FALSE)
raw_data <- readMSData(files = files, pdata = new("NAnnotatedDataFrame", pd), mode = "onDisk")
classes <- rep(c("Eta6", "Eta8"), each = 3)
xs <- xcmsSet(files, sclass = classes, profmethod = "bin")
xset <- group(xs)
xset2 <- retcor(xset, family = "symmetric", plottype = "mdevden")
xset2 <- group(xset2, bw = 10)
xset3 <- fillPeaks(xset2)
diffreport <- annotateDiffreport(xset2, quick=TRUE)
sampleVc <- grep("^eta6|^eta8", colnames(diffreport), value = TRUE)
dataMatrix <- t(as.matrix(diffreport[, sampleVc]))
dimnames(dataMatrix) <- list(sampleVc, diffreport[, "name"])
sampleMetadata <- data.frame(row.names = sampleVc,genotypeFc = substr(sampleVc, 1, 2))
variableMetadata <- diffreport[, !(colnames(diffreport) %in% c("name", sampleVc))]
rownames(variableMetadata) <- diffreport[, "name"]
library(ropls)
opls(dataMatrix)
Error: 'x' matrix must be of 'numeric' mode
Your dataMatrix comes from diffreport, which also includes the "name" column, which is character. You need to subset diffreport, or use groupval() to get the intensity matrix. Yours, Steffen
I blame Android for the brevity and typos
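The subsetting suggested above can be sketched on a toy data frame. The column names and values below are invented for illustration and only mimic the general shape of a diff report (one character "name" column, some statistics, one intensity column per sample); they are not the real output of `diffreport` or `annotateDiffreport`.

```r
# Toy stand-in for a diff report; all names and values are hypothetical.
report <- data.frame(
  name     = c("M100T50", "M200T60", "M300T70"),
  pvalue   = c(0.01, 0.20, 0.03),
  Eta6_1   = c(1200, 340, 980),
  Eta8_1   = c(2300, 310, 450),
  isotopes = c("[1][M]+", "", ""),
  stringsAsFactors = FALSE
)

# Keep only the sample intensity columns; drop = FALSE keeps a data frame
# even if a single column is selected.
sample_cols <- c("Eta6_1", "Eta8_1")
data_matrix <- t(as.matrix(report[, sample_cols, drop = FALSE]))

# Samples are now rows; label the columns with the feature names.
colnames(data_matrix) <- report$name

# A purely numeric selection yields a numeric matrix.
stopifnot(is.numeric(data_matrix))
print(data_matrix)
```

Because only numeric columns enter `as.matrix()`, the result stays numeric; including any character column would turn the whole matrix into character mode.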
@sneumann Thank you for your timely response. The problem is subsetting would only work if I knew which columns to keep and which ones to discard. Since I do not know which columns I need later on, I cannot use subset. I have tried the following pieces of code and they have not worked:
dataMatrix <- diffreport[, -c("name")]
Error in -c("name") : invalid argument to unary operator
dataMatrix[, c("name")] <- list(NULL)
Error in `[<-`(`*tmp*`, , c("name"), value = list(NULL)) : subscript out of bounds
dataMatrix$name <- NULL
Warning message: In dataMatrix$name <- NULL : Coercing LHS to a list
opls(dataMatrix)
Error in (function (classes, fdef, mtable) : unable to find an inherited method for function ‘opls’ for signature ‘"list"’
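The attempts above fail for reasons that are general to R: negative indexing only works with numeric positions, not with column names, and `$<-` applied to a matrix coerces it to a list. A minimal sketch of dropping a column by name, using a hypothetical toy data frame in place of the real diff report:

```r
# Toy stand-in for a diff report; names and values are hypothetical.
report <- data.frame(
  name   = c("M100T50", "M200T60", "M300T70"),
  Eta6_1 = c(1200, 340, 980),
  Eta8_1 = c(2300, 310, 450),
  stringsAsFactors = FALSE
)

# Option 1: keep every column except "name".
keep <- setdiff(colnames(report), "name")
numeric_part <- report[, keep, drop = FALSE]

# Option 2: the same selection with a logical index.
numeric_part2 <- report[, colnames(report) != "name", drop = FALSE]

# Convert to a matrix only after the character column is gone.
data_matrix <- as.matrix(numeric_part)
print(mode(data_matrix))
```

Both options operate on the data frame, before any conversion with `as.matrix()`; once a character column has been included in the conversion, every cell of the matrix is already a string.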
I removed the name column from the diffreport and I am still having the error. I have taken a look at the dataMatrix and it consists of one row with as many columns as there are rows in the diffreport. Is the problem arising from the fact that one cell consists of two values separated by a forward slash which makes it non-numeric ?
diffreport$name <- NULL
dataMatrix <- t(as.matrix(diffreport[, sampleVc]))
library(ropls)
opls(dataMatrix)
Error: 'x' matrix must be of 'numeric' mode
How can I fix the problem ?
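One way to narrow this down is to inspect the class of each selected column before building the matrix. The sketch below uses an invented data frame in which one sample column holds strings with a forward slash, mirroring the suspicion raised above; whether the real diff report looks like this is an assumption that only the actual data can confirm.

```r
# Toy data; the slash-separated strings are a hypothetical example.
report <- data.frame(
  name   = c("M100T50", "M200T60"),
  Eta6_1 = c(1200, 340),
  Eta8_1 = c("2300/2310", "310/305"),
  stringsAsFactors = FALSE
)
sample_cols <- c("Eta6_1", "Eta8_1")

# Class of every selected column.
col_classes <- vapply(
  report[, sample_cols, drop = FALSE],
  function(x) class(x)[1],
  character(1)
)
print(col_classes)

# A single character column makes the whole matrix character.
mixed <- as.matrix(report[, sample_cols, drop = FALSE])
print(mode(mixed))

# Columns that need cleaning or exclusion before calling opls().
offending <- names(col_classes)[col_classes != "numeric"]
print(offending)
```

Running the same `vapply()` call on the real selection, and checking `length(sampleVc)`, shows whether the pattern matched the intended columns and which of them are not numeric.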
I am trying to create a data matrix with labels as instructed in the latest version of the "ropls: PCA, PLS(-DA) and OPLS(-DA) for multivariate analysis and feature selection of omics data", which uses XCMS to group and align peaks before correcting retention times.
Here is a link to the tutorial: https://www.bioconductor.org/packages/release/bioc/vignettes/ropls/inst/doc/ropls-vignette.pdf
Here is my code and the errors I have received: xset <- xcmsSet(files)
Session info
Matrix products: default
locale: [1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
[5] LC_TIME=German_Germany.1252
attached base packages:
[1] grid stats4 parallel stats graphics grDevices utils datasets
[9] methods base
other attached packages:
[1] CAMERA_1.34.0 ropls_1.10.0 Rgraphviz_2.22.0
[4] graph_1.56.0 pander_0.6.1 RColorBrewer_1.1-2
[7] limma_3.34.1 yamss_1.4.0 SummarizedExperiment_1.8.0
[10] DelayedArray_0.4.1 matrixStats_0.52.2 GenomicRanges_1.30.0
[13] GenomeInfoDb_1.14.0 IRanges_2.12.0 S4Vectors_0.16.0
[16] xcms_3.0.0 MSnbase_2.4.0 ProtGenerics_1.10.0
[19] BiocParallel_1.12.0 Biobase_2.38.0 BiocGenerics_0.24.0
[22] mzR_2.12.0 Rcpp_0.12.13 MassSpecWavelet_1.44.0
[25] waveslim_1.7.5 BiocInstaller_1.28.0
loaded via a namespace (and not attached):
[1] bitops_1.0-6 EBImage_4.20.0 doParallel_1.0.11
[4] rprojroot_1.2 tools_3.4.1 backports_1.1.1
[7] affyio_1.48.0 rpart_4.1-11 Hmisc_4.0-3
[10] lazyeval_0.2.1 colorspace_1.3-2 nnet_7.3-12
[13] gridExtra_2.3 compiler_3.4.1 preprocessCore_1.40.0
[16] htmlTable_1.9 checkmate_1.8.5 scales_0.5.0
[19] affy_1.56.0 RBGL_1.54.0 stringr_1.2.0
[22] digest_0.6.12 tiff_0.1-5 foreign_0.8-69
[25] fftwtools_0.9-8 rmarkdown_1.7 XVector_0.18.0
[28] pkgconfig_2.0.1 base64enc_0.1-3 jpeg_0.1-8
[31] htmltools_0.3.6 htmlwidgets_0.9 rlang_0.1.4
[34] impute_1.52.0 mzID_1.16.0 acepack_1.4.1
[37] RCurl_1.95-4.8 magrittr_1.5 GenomeInfoDbData_0.99.1
[40] Formula_1.2-2 MALDIquant_1.17 Matrix_1.2-10
[43] munsell_0.4.3 abind_1.4-5 vsn_3.46.0
[46] stringi_1.1.5 MASS_7.3-47 zlibbioc_1.24.0
[49] plyr_1.8.4 lattice_0.20-35 splines_3.4.1
[52] multtest_2.34.0 locfit_1.5-9.1 knitr_1.17
[55] igraph_1.1.2 codetools_0.2-15 XML_3.98-1.9
[58] evaluate_0.10.1 latticeExtra_0.6-28 pcaMethods_1.70.0
[61] data.table_1.10.4-3 png_0.1-7 foreach_1.4.3
[64] gtable_0.2.0 RANN_2.5.1 ggplot2_2.2.1
[67] rsconnect_0.8.5 survival_2.41-3 tibble_1.3.4
[70] snow_0.4-2 iterators_1.0.8 cluster_2.0.6