sneumann / xcms

This is the git repository matching the Bioconductor package xcms: LC/MS and GC/MS Data Analysis

Error in .local(object, ...) : Incorrect Class Labels #239

Closed ghost closed 6 years ago

ghost commented 6 years ago

I am trying to create a data matrix with labels as instructed in the latest version of the "ropls: PCA, PLS(-DA) and OPLS(-DA) for multivariate analysis and feature selection of omics data" vignette, which uses XCMS to group and align peaks before correcting retention times.

Here is a link to the tutorial: https://www.bioconductor.org/packages/release/bioc/vignettes/ropls/inst/doc/ropls-vignette.pdf

Here is my code and the errors I have received:

xset <- xcmsSet(files)
xset <- group(xset)
Processing 7180 mz slices ... OK
xset2 <- retcor(xset, family = "symmetric", plottype = "mdevden")
Performing retention time correction using 1951 peak groups.
Error in plot.new() : figure margins too large
xset2 <- group(xset2, bw = 10)
Error in group(xset2, bw = 10) : object 'xset2' not found
xset2 <- retcor(xset, family = "symmetric", plottype = "mdevden")
Performing retention time correction using 1951 peak groups.
Error in plot.new() : figure margins too large
xset2 <- retcor(xset, family = "symmetric", plottype = "mdevden")
Performing retention time correction using 1951 peak groups.
xset2 <- group(xset2, bw = 10)
Processing 7180 mz slices ... OK
xset3 <- fillPeaks(xset2)
Loading required package: xcms
Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel

library(CAMERA)
Warning message:
package 'CAMERA' was built under R version 3.4.2

diffreport <- annotateDiffreport(xset3, quick=TRUE)
Error in .local(object, ...) : Incorrect Class Labels

diffreport <- annotateDiffreport(xset3)
Error in .local(object, ...) : Incorrect Class Labels

Session info

sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
[5] LC_TIME=German_Germany.1252

attached base packages:
[1] grid stats4 parallel stats graphics grDevices utils datasets
[9] methods base

other attached packages:
[1] CAMERA_1.34.0 ropls_1.10.0 Rgraphviz_2.22.0
[4] graph_1.56.0 pander_0.6.1 RColorBrewer_1.1-2
[7] limma_3.34.1 yamss_1.4.0 SummarizedExperiment_1.8.0
[10] DelayedArray_0.4.1 matrixStats_0.52.2 GenomicRanges_1.30.0
[13] GenomeInfoDb_1.14.0 IRanges_2.12.0 S4Vectors_0.16.0
[16] xcms_3.0.0 MSnbase_2.4.0 ProtGenerics_1.10.0
[19] BiocParallel_1.12.0 Biobase_2.38.0 BiocGenerics_0.24.0
[22] mzR_2.12.0 Rcpp_0.12.13 MassSpecWavelet_1.44.0
[25] waveslim_1.7.5 BiocInstaller_1.28.0

loaded via a namespace (and not attached):
[1] bitops_1.0-6 EBImage_4.20.0 doParallel_1.0.11
[4] rprojroot_1.2 tools_3.4.1 backports_1.1.1
[7] affyio_1.48.0 rpart_4.1-11 Hmisc_4.0-3
[10] lazyeval_0.2.1 colorspace_1.3-2 nnet_7.3-12
[13] gridExtra_2.3 compiler_3.4.1 preprocessCore_1.40.0
[16] htmlTable_1.9 checkmate_1.8.5 scales_0.5.0
[19] affy_1.56.0 RBGL_1.54.0 stringr_1.2.0
[22] digest_0.6.12 tiff_0.1-5 foreign_0.8-69
[25] fftwtools_0.9-8 rmarkdown_1.7 XVector_0.18.0
[28] pkgconfig_2.0.1 base64enc_0.1-3 jpeg_0.1-8
[31] htmltools_0.3.6 htmlwidgets_0.9 rlang_0.1.4
[34] impute_1.52.0 mzID_1.16.0 acepack_1.4.1
[37] RCurl_1.95-4.8 magrittr_1.5 GenomeInfoDbData_0.99.1
[40] Formula_1.2-2 MALDIquant_1.17 Matrix_1.2-10
[43] munsell_0.4.3 abind_1.4-5 vsn_3.46.0
[46] stringi_1.1.5 MASS_7.3-47 zlibbioc_1.24.0
[49] plyr_1.8.4 lattice_0.20-35 splines_3.4.1
[52] multtest_2.34.0 locfit_1.5-9.1 knitr_1.17
[55] igraph_1.1.2 codetools_0.2-15 XML_3.98-1.9
[58] evaluate_0.10.1 latticeExtra_0.6-28 pcaMethods_1.70.0
[61] data.table_1.10.4-3 png_0.1-7 foreach_1.4.3
[64] gtable_0.2.0 RANN_2.5.1 ggplot2_2.2.1
[67] rsconnect_0.8.5 survival_2.41-3 tibble_1.3.4
[70] snow_0.4-2 iterators_1.0.8 cluster_2.0.6

jorainer commented 6 years ago

What do you get when you do sampclass(xset)?

If this does not show anything strange I can't help - annotateDiffreport is a function from the CAMERA package - not from xcms.

ghost commented 6 years ago

sampclass(xset)
[1] KPCI2 KPCI2 KPCI2 KPCI2 KPCI2 KPCI2
Levels: KPCI2

KPCI2 is the name of a package and also the name of the folder with the files that I used for the analysis.

These are the initial lines for the code above:

filepath <- "E:/Data/Kundai/RStudio/XCMS/MZXML/KPCI2"
files <- list.files(filepath, pattern = "mzXML", recursive = TRUE, full.names = TRUE)
xset <- xcmsSet(files)

jorainer commented 6 years ago

To me it seems a little strange that you're performing a differential abundance analysis on samples from a single group/sample class. Usually you want to compare signals in samples from one group to those of another group. That might be a reason for the error, but I haven't checked it.
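One way to test that hypothesis before calling the report function is to count the distinct class labels. The sketch below uses only base R; `has_two_classes` is a hypothetical helper, and the two label vectors are stand-ins for whatever `sampclass(xset)` returns.

```r
# Hypothetical helper: TRUE when at least two distinct sample classes exist.
has_two_classes <- function(classes) {
  nlevels(droplevels(factor(classes))) >= 2
}

# Labels as reported in this thread: all six samples share the class "KPCI2".
single <- factor(rep("KPCI2", 6))
# Labels for a two-group design (three samples per group).
paired <- factor(rep(c("Eta6", "Eta8"), each = 3))

has_two_classes(single)  # FALSE
has_two_classes(paired)  # TRUE
```

If the check returns FALSE, there is nothing to compare, so a differential report has no second group to work with.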

ghost commented 6 years ago

I have managed to zero in on my problem. Would you be in a position to help me correct my code so that I can generate a diff report? It seems the OPLS tutorial was written using the old XCMS methods.

Pasted below is the code from the XCMS Bioconductor page, which I can easily follow. Starting from the readMSData function, how can one generate a diff report?

cdfs <- dir(system.file("cdf", package = "faahKO"), full.names = TRUE, recursive = TRUE)

## Create a phenodata data.frame
pd <- data.frame(sample_name = sub(basename(cdfs), pattern = ".CDF", replacement = "", fixed = TRUE),
                 sample_group = c(rep("KO", 6), rep("WT", 6)),
                 stringsAsFactors = FALSE)

raw_data <- readMSData(files = cdfs, pdata = new("NAnnotatedDataFrame", pd), mode = "onDisk")

jorainer commented 6 years ago

The new user interface does not provide a function to generate a diff report. Differential abundance analysis is not specific to data processed in xcms. I recommend using e.g. the limma package for the differential abundance analysis, or a simple t.test. There is no need to reinvent the wheel by implementing such functionality in xcms when plenty of excellent options are already available.

You can still use the old user interface to perform the analysis. Just ensure that you specify the sample group assignment with sampclass on the xcmsSet.
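As a rough illustration of the "simple t.test" route, the sketch below runs one Welch t-test per feature on a feature-by-sample intensity matrix. The matrix, the feature names and the group sizes are made up for the example; with real data the matrix would come from the processed xcms result.

```r
# Toy feature-by-sample intensity matrix (simulated numbers, not real data).
set.seed(1)
groups <- factor(rep(c("KO", "WT"), each = 6))
intensities <- matrix(rnorm(5 * 12, mean = 1000, sd = 50), nrow = 5,
                      dimnames = list(paste0("FT", 1:5), paste0("s", 1:12)))
# Make the first feature clearly higher in the WT samples.
intensities["FT1", groups == "WT"] <- intensities["FT1", groups == "WT"] + 500

# One Welch t-test per feature (row), comparing the two groups.
pvals <- apply(intensities, 1, function(x) t.test(x ~ groups)$p.value)
# Adjust for multiple testing (Benjamini-Hochberg).
padj <- p.adjust(pvals, method = "BH")
round(padj, 4)
```

For more than a handful of samples or a more complex design, limma's moderated statistics are usually the better fit; the loop above is only meant to show the shape of the analysis.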

ghost commented 6 years ago

Really sorry to bother you, but I am having problems understanding what's going on. I understand your reasons for deprecating the old functions, but unfortunately the ROPLS tutorial was made using the old functions. The only reason I need a diff report is the way the function arranges the data, which makes it extremely easy to conduct statistical analyses including OPLS, PCA, etc. I have asked Etienne A. Thevenot whether he could update his methods to suit the new xcms functionality, but he has not responded. Is there a way of generating this diff report, or a way of specifying the sample groups? My current code does not work. I need the dataMatrix, sampleMetadata and variableMetadata for statistical analysis, and I know of no way of extracting these in the new xcms.

filepath <- "E:/Data/Kundai/RStudio/XCMS/MZXML/KPCI2"
files <- list.files(filepath, pattern = "mzXML", recursive = TRUE, full.names = TRUE)
pd <- data.frame(sample_name = sub(basename(files), pattern = ".mzXML", replacement = "", fixed = TRUE),
                 sample_group = c(rep("Eta6", 3), rep("Eta8", 3)),
                 stringsAsFactors = FALSE)
raw_data <- readMSData(files = files, pdata = new("NAnnotatedDataFrame", pd), mode = "onDisk")

jorainer commented 6 years ago

You can still use the xcmsSet function and the diffreport on the xcmsSet. The functions are not (yet) deprecated or defunct. The only thing you have to ensure is that sampclass corresponds to your sample grouping (i.e. the sample_group you define). You can set the sampclass in your xcmsSet object with e.g. sampclass(xset) <- sample_group. Just have a look at the help page for sampclass (i.e. ?sampclass). It should be described there.
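A minimal sketch of preparing the labels for that assignment follows. The file names are hypothetical placeholders (in the thread they come from list.files()), and the final assignment is left as a comment because it needs an existing xcmsSet object called xset.

```r
# Hypothetical file names; the real ones would come from list.files().
files <- c("Eta6_1.mzXML", "Eta6_2.mzXML", "Eta6_3.mzXML",
           "Eta8_1.mzXML", "Eta8_2.mzXML", "Eta8_3.mzXML")
sample_group <- factor(rep(c("Eta6", "Eta8"), each = 3))

# Sanity checks: one label per file, and more than one class to compare.
stopifnot(length(sample_group) == length(files),
          nlevels(sample_group) >= 2)
table(sample_group)

# With an existing xcmsSet called xset (not created here), the labels
# would then be assigned as suggested above:
# sampclass(xset) <- sample_group
```

The order of the labels has to match the order of the files in the xcmsSet, so it is worth printing both side by side once.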

ghost commented 6 years ago

Thank you for your help. I really appreciate you taking time to help me out. I have managed to process the data as of now.

ghost commented 6 years ago

Really sorry to bother you @jotsetung, but it seems you are the only one who can help me, and I really appreciate it; I hope I am not abusing it. Anyway, I have managed to make a data matrix, but the problem is that it still contains characters and I do not know how to remove them. I asked Etienne Thevenot, and he told me that the problem came from XCMS rather than ROPLS, but that he was going to align the ROPLS tutorial with the new XCMS.

Pasted below is my code

# Initial data processing via XCMS
filepath <- "E:/Data/Kundai/RStudio/XCMS/MZXML/KPCI2"
files <- list.files(filepath, pattern = "mzXML", recursive = TRUE, full.names = TRUE)
pd <- data.frame(sample_name = sub(basename(files), pattern = ".mzXML", replacement = "", fixed = TRUE),
                 sample_group = c(rep("Eta6", 3), rep("Eta8", 3)),
                 stringsAsFactors = FALSE)
raw_data <- readMSData(files = files, pdata = new("NAnnotatedDataFrame", pd), mode = "onDisk")
classes <- rep(c("Eta6", "Eta8"), each = 3)
xs <- xcmsSet(files, sclass = classes, profmethod = "bin")
xset <- group(xs)
xset2 <- retcor(xset, family = "symmetric", plottype = "mdevden")
xset2 <- group(xset2, bw = 10)
xset3 <- fillPeaks(xset2)
diffreport <- annotateDiffreport(xset2, quick=TRUE)
sampleVc <- grep("^eta6|^eta8", colnames(diffreport), value = TRUE)
dataMatrix <- t(as.matrix(diffreport[, sampleVc]))
dimnames(dataMatrix) <- list(sampleVc, diffreport[, "name"])
sampleMetadata <- data.frame(row.names = sampleVc, genotypeFc = substr(sampleVc, 1, 2))
variableMetadata <- diffreport[, !(colnames(diffreport) %in% c("name", sampleVc))]
rownames(variableMetadata) <- diffreport[, "name"]

# ROPLS data processing
library(ropls)
opls(dataMatrix)

Error: 'x' matrix must be of 'numeric' mode

sneumann commented 6 years ago

Your dataMatrix comes from diffreport, which also includes the "name" column, which is character. You need to subset diffreport, or use groupval() to get the intensity matrix. Yours, Steffen
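The subsetting route can be sketched with a toy stand-in for the diffreport. All column names and values below are invented for illustration; a real diffreport has more columns. The groupval() route on the xcmsSet is the alternative mentioned above and is not shown here.

```r
# Toy stand-in for a diffreport: one character column plus sample columns.
report <- data.frame(
  name = c("M100T60", "M200T120", "M300T180"),
  pvalue = c(0.01, 0.20, 0.03),
  eta6_1 = c(10, 20, 30), eta6_2 = c(11, 21, 31),
  eta8_1 = c(50, 20, 5),  eta8_2 = c(52, 19, 6),
  stringsAsFactors = FALSE
)

# Keep only the sample columns; drop = FALSE keeps a data.frame even if
# only a single column matches.
sample_cols <- grep("^eta6|^eta8", colnames(report), value = TRUE)
dataMatrix <- t(as.matrix(report[, sample_cols, drop = FALSE]))
colnames(dataMatrix) <- report$name

# The matrix handed to opls() has to be numeric.
stopifnot(is.numeric(dataMatrix))
dim(dataMatrix)  # 4 samples x 3 features
```

Because only numeric columns are selected before as.matrix() is called, the result stays numeric; including any character column would turn the whole matrix into character mode.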


ghost commented 6 years ago

@sneumann Thank you for your timely response. The problem is subsetting would only work if I knew which columns to keep and which ones to discard. Since I do not know which columns I need later on, I cannot use subset. I have tried the following pieces of code and they have not worked:

dataMatrix <- diffreport[, -c("name")]
Error in -c("name") : invalid argument to unary operator

dataMatrix[, c("name")] <- list(NULL)
Error in `[<-`(`*tmp*`, , c("name"), value = list(NULL)) : subscript out of bounds

dataMatrix$name <- NULL
Warning message:
In dataMatrix$name <- NULL : Coercing LHS to a list
opls(dataMatrix)
Error in (function (classes, fdef, mtable) : unable to find an inherited method for function ‘opls’ for signature ‘"list"’
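The three attempts above fail for reasons that are general to R rather than specific to xcms or ropls. The sketch below reproduces the situation on a toy data.frame (invented values) and shows idioms that do drop a column by name.

```r
# Toy data.frame standing in for the diffreport (invented values).
report <- data.frame(name = c("a", "b"), s1 = c(1, 2), s2 = c(3, 4),
                     stringsAsFactors = FALSE)

# Unary minus is not defined for character vectors, so -c("name") fails.
# Drop a column by name with setdiff() on the column names instead:
no_name <- report[, setdiff(colnames(report), "name"), drop = FALSE]

# Assigning NULL via `$` removes a column from a data.frame ...
report2 <- report
report2$name <- NULL

# ... but a matrix has no `$` columns, so R coerces it to a list (with a
# warning), which is why opls() then received an object of class "list".
# For a matrix, index by column name instead:
m <- as.matrix(report)                        # character matrix
m_no_name <- m[, colnames(m) != "name", drop = FALSE]
mode(m_no_name) <- "numeric"                  # convert the remaining strings
```

Dropping the column while the object is still a data.frame is usually simpler, because the remaining columns keep their numeric type and no conversion step is needed.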

ghost commented 6 years ago

I removed the name column from the diffreport and I am still getting the error. I have taken a look at the dataMatrix, and it consists of one row with as many columns as there are rows in the diffreport. Is the problem that one cell consists of two values separated by a forward slash, which makes it non-numeric?

diffreport$name <- NULL
dataMatrix <- t(as.matrix(diffreport[, sampleVc]))
library(ropls)
opls(dataMatrix)
Error: 'x' matrix must be of 'numeric' mode

How can I fix the problem?
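A few base R checks can narrow this down. The sketch below uses a toy report with invented values and capitalised sample column names; whether a case mismatch between the grep pattern and the real column names is the cause here cannot be told from the thread, so treat it as one hypothesis to rule out.

```r
# Toy report (invented values); feature names contain a slash on purpose.
report <- data.frame(
  name = c("100.1/60", "200.2/120", "300.3/180"),
  Eta6_1 = c(10, 20, 30), Eta6_2 = c(11, 21, 31),
  Eta8_1 = c(50, 20, 5),  Eta8_2 = c(52, 19, 6),
  stringsAsFactors = FALSE
)

# Step 1: which columns does the pattern actually select?
strict <- grep("^eta6|^eta8", colnames(report), value = TRUE)
relaxed <- grep("^eta6|^eta8", colnames(report), value = TRUE,
                ignore.case = TRUE)
length(strict)   # 0: the lower-case pattern misses "Eta6_1" etc.
length(relaxed)  # 4

# Step 2: are the selected columns numeric?
vapply(report[, relaxed, drop = FALSE], is.numeric, logical(1))

# Step 3: check shape and mode of the final matrix before calling opls().
dataMatrix <- t(as.matrix(report[, relaxed, drop = FALSE]))
colnames(dataMatrix) <- report$name
dim(dataMatrix)
mode(dataMatrix)
```

In this toy case the slash only appears in the column labels of the matrix, which does not affect its mode; a matrix ends up non-numeric when one of the selected columns itself holds text.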