workflow4metabolomics / tools-metabolomics

Galaxy tools for metabolomics maintained by Workflow4Metabolomics
https://workflow4metabolomics.org/
GNU General Public License v3.0

[camera_annotate] Error in EIC generation #154

Open melpetera opened 5 years ago

melpetera commented 5 years ago

Hi, we received feedback about an error obtained while running the CAMERA.annotate Galaxy module. The data seem fine: the previous fillChromPeaks step ran smoothly, and CAMERA.annotate itself runs without problems when no EIC plots are requested. But with the following configuration (requesting 50 EIC plots), the user got an error.


Galaxy Tool Error Report

from https://galaxy.workflow4metabolomics.org/

Detailed Job Information

Job environment and execution information is available at the job info page.

Job ID 784882 (5f747013a8f34131)
Tool ID toolshed.g2.bx.psu.edu/repos/lecorguille/camera_annotate/abims_CAMERA_annotateDiffreport/2.2.3
Tool Version 2.2.3
Job PID or DRM id 7486961
Job Tool Version  

Job Execution and Failure Information

Command Line

LC_ALL=C Rscript /work/project/w4m/galaxy4metabolomics/shed_tools/toolshed.g2.bx.psu.edu/repos/lecorguille/camera_annotate/a2c49996603e/camera_annotate/CAMERA.r xfunction annotatediff image '/work/project/w4m/galaxy4metabolomics/database/files/001/660/dataset_1660635.dat' nSlaves ${GALAXY_SLOTS:-1} variableMetadataOutput '/work/project/w4m/galaxy4metabolomics/database/files/001/664/dataset_1664039.dat' dataMatrixOutput '/work/project/w4m/galaxy4metabolomics/database/files/001/664/dataset_1664040.dat' sigma 6 perfwhm 0.6 ppm 15 mzabs 0.015 maxcharge 2 maxiso 3 minfrac 0.5 quick FALSE xsetRdataOutput '/work/project/w4m/galaxy4metabolomics/database/files/001/664/dataset_1664041.dat' cor_eic_th 0.75 graphMethod hcs pval 0.05 calcCiS TRUE calcIso TRUE calcCaS TRUE polarity negative max_peaks 100 multiplier 2 runDiffreport TRUE eicmax 50 eicwidth 200 value into sortpval FALSE h 480 w 640 mzdec 2 convertRTMinute TRUE numDigitsMZ 5 numDigitsRT 2 intval into

stderr

Fatal error: Exit code 1 ()
arguments 'minimized' and 'invisible' are for Windows only
Note: you might want to set/adjust the 'sampclass' of the returned xcmSet object before proceeding with the analysis.
Loading required package: Rmpi

Attaching package: 'snow'

The following objects are masked from 'package:BiocGenerics':

clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, clusterSplit, parApply, parCapply,
parLapply, parRapply, parSapply

The following objects are masked from 'package:parallel':

clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, clusterSplit, makeCluster, parApply,
parCapply, parLapply, parRapply, parSapply, splitIndices,
stopCluster

Error in getEIC(object, rtrange = eicwidth * 1.1, sampleidx = ceic, groupidx = tsidx[seq(length = eicmax)]) :
25 of the specified 'rtrange' are completely outside of the retention time range of 'object' which is (3.42860080325114, 654.085215660528). The first was: (1042.91060558926, 1262.91060558926!
Calls: annotatediff -> diffreport -> getEIC
In addition: Warning messages:
1: In xcmsRaw(filepaths(object@xcmsSet)[f], profstep = 0) : There are identical scantimes.
2: Use of 'xcmsClusterApply' is deprecated! Use 'BPPARAM' arguments instead.
3: In data.row.names(row.names, rowsi, i) : some row.names duplicated: 5,29,40,44,45,46,47,48,50,51,52,53,55,56,59,61,62,63,68,86,87,145,282,295,305,306,326,482,500,501,502,503,527,528,582,641,645,646,647,675,676,681,682,683,689,698,701,702,703,704,705,706,767,829,830,833,834,841,842,843,844,845,848,850,852,854,855,948,949,951,952,955,956,957,958,962,964,965,967,976,978,983,984,1074,1077,1094,1116,1123,1126,1131,1177,1199,1269,1282,1320,1334,1385,1401,1416,1419,1420,1421,1423,1424,1425,1495,1496,1497,1498,1503,1514,1519,1520,1521,1558,1559,1560,1561,1571,1577,1582,1611,1621,1623,1624,1639,1650,1682,1683,1684,1685,1686,1687,1689,1691,1701,1707,1708,1711,1712,1713,1716,1720,1744,1745,1761,1863,1979,1980,2021,2022,2024,2027,2028,2029,2039,2040,2041,2042,2043,2044,2045,2046,2049,2050,2139,2140,2153,2161,2241,2243,2244,2247,2255,2256,2281,2333,2399,2418,2422,2423,2427,2429,2430,2501,2570,2588,2621,2622,2623,2635,2677,2680,2744,2745,2756,2757,2780,2781,2782,2848,2855,2875,2877,2879,2881,2884,2945,2983,2992,3002,3004,3009,3013,3014,301 [... truncated]
4: In getEIC(object, rtrange = eicwidth * 1.1, sampleidx = ceic, groupidx = tsidx[seq(length = eicmax)]) : NA values in xcmsSet. Use fillPeaks() on the object to fill-in missing peak values. Note however that this will also insert intensities of 0 for peaks that can not be filled in.
Execution halted

stdout

SESSION INFO R version 3.4.1 (2017-06-30) Main packages: batch 1.1.4 multtest 2.28.0 CAMERA 1.34.0 xcms 3.0.0 MSnbase 2.4.0 ProtGenerics 1.10.0 mzR 2.12.0 Rcpp 0.12.17 BiocParallel 1.12.0 Biobase 2.38.0 BiocGenerics 0.24.0
Other loaded packages: lattice 0.20.35 digest 0.6.16 foreach 1.4.4 plyr 1.8.4 backports 1.1.2 acepack 1.4.1 mzID 1.16.0 stats4 3.4.1 ggplot2 3.0.0 BiocInstaller 1.28.0 pillar 1.3.0 zlibbioc 1.24.0 rlang 0.2.1 lazyeval 0.2.1 data.table 1.10.4 S4Vectors 0.16.0 rpart 4.1.13 Matrix 1.2.14 checkmate 1.8.5 preprocessCore 1.40.0 splines 3.4.1 stringr 1.3.1 foreign 0.8.71 htmlwidgets 1.2 igraph 1.2.2 munsell 0.5.0 compiler 3.4.1 pkgconfig 2.0.2 base64enc 0.1.3 pcaMethods 1.70.0 htmltools 0.3.6 nnet 7.3.12 tibble 1.4.2 gridExtra 2.3 htmlTable 1.9 RANN 2.6 Hmisc 4.0.3 IRanges 2.12.0 codetools 0.2.15 XML 3.98.1.16 crayon 1.3.4 MASS 7.3.50 grid 3.4.1 MassSpecWavelet 1.44.0 RBGL 1.54.0 gtable 0.2.0 affy 1.56.0 magrittr 1.5 scales 1.0.0 graph 1.56.0 stringi 1.2.4 impute 1.52.0 affyio 1.48.0 doParallel 1.0.11 limma 3.34.9 latticeExtra 0.6.28 Formula 1.2.1 RColorBrewer 1.1.2 iterators 1.0.10 tools 3.4.1 survival 2.42.6 colorspace 1.3.2 cluster 2.0.7.1 vsn 3.46.0 MALDIquant 1.18 knitr 1.20

ARGUMENTS INFO xfunction annotatediff image /work/project/w4m/galaxy4metabolomics/database/files/001/660/dataset_1660635.dat nSlaves 4 variableMetadataOutput /work/project/w4m/galaxy4metabolomics/database/files/001/664/dataset_1664039.dat dataMatrixOutput /work/project/w4m/galaxy4metabolomics/database/files/001/664/dataset_1664040.dat sigma 6 perfwhm 0.6 ppm 15 mzabs 0.015 maxcharge 2 maxiso 3 minfrac 0.5 quick FALSE xsetRdataOutput /work/project/w4m/galaxy4metabolomics/database/files/001/664/dataset_1664041.dat cor_eic_th 0.75 graphMethod hcs pval 0.05 calcCiS TRUE calcIso TRUE calcCaS TRUE polarity negative max_peaks 100 multiplier 2 runDiffreport TRUE eicmax 50 eicwidth 200 value into sortpval FALSE h 480 w 640 mzdec 2 convertRTMinute TRUE numDigitsMZ 5 numDigitsRT 2 intval into

INFILE PROCESSING INFO

ARGUMENTS PROCESSING INFO files_root_directory .

MAIN PROCESSING INFO Starting snow cluster with 4 local sockets. Run cleanParallel after processing to remove the spawned slave processes! Start grouping after retention time. Created 329 pseudospectra. Generating peak matrix! Run isotope peak annotation % finished: 10 20 30 40 50 60 70 80 90 100
Found isotopes: 2957 Start grouping after correlation. Generating EIC's .. Warning: Found NA peaks in selected sample.

Calculating peak correlations in 329 Groups... % finished: 10 20 30 40 50 60 70 80 90 100

Calculating peak correlations across samples. % finished: 10 20 30 40 50 60 70 80 90 100

Calculating isotope assignments in 329 Groups... % finished: 10 20 30 40 50 60 70 80 90 100
Calculating graph cross linking in 329 Groups... % finished: 10 20 30 40 50 60 70 80 90 100
New number of ps-groups: 6819 xsAnnotate has now 6819 groups, instead of 329 Generating peak matrix for peak annotation!

Calculating possible adducts in 6819 Groups... Parallel mode: There are 141 tasks. Sending task # 1 Sending task # 2 Sending task # 3 Sending task # 4 [...] (I made it shorter for space reasons) Sending task # 138 Sending task # 139 Sending task # 140 Sending task # 141

Job Information None

Job Traceback None

This is an automated message. Do not reply to this address.
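
For readers skimming the report: the fatal error in the stderr section says that 25 requested EIC windows lie entirely outside the retention time range of the data. The short Python sketch below only redoes that arithmetic with the numbers copied from the message and from the tool arguments (eicwidth 200). The helper name `completely_outside` is hypothetical, and the overlap test is an assumption about what "completely outside" means; it is not the actual xcms check.

```python
def completely_outside(window, rt_range):
    """Return True if the window shares no point with the retention time range."""
    lo, hi = window
    rt_min, rt_max = rt_range
    return hi < rt_min or lo > rt_max


# Values copied from the stderr message of the report.
rt_range = (3.42860080325114, 654.085215660528)
first_window = (1042.91060558926, 1262.91060558926)

# eicwidth comes from the tool arguments; the failing call scales it by 1.1.
eicwidth = 200
expected_width = eicwidth * 1.1
actual_width = first_window[1] - first_window[0]

print("first window outside data range:", completely_outside(first_window, rt_range))
print("window width:", round(actual_width, 6), "expected:", round(expected_width, 6))
```

The window width matches eicwidth scaled by 1.1, so the window itself looks as intended; what does not fit is its position relative to the retention time range reported for the object.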

lecorguille commented 4 years ago
NA values in xcmsSet. Use fillPeaks() on the object to fill-in missing peak values. Note however that this will also insert intensities of 0 for peaks that can not be filled in.

Did you try the option "Replace the remain NA by 0 in the dataMatrix" in xcms fillChromPeaks?

melpetera commented 4 years ago

The thing is, this option only concerns the dataMatrix output. Previously, I guess, fillPeaks put 0 values instead of NA in the xcms object, but now, when fillChromPeaks is not able to fill an NA, it leaves it as NA in the xcms object. This allows iterative use of fillChromPeaks (which is cool), but it is a limitation for CAMERA's EIC generation, which does not want to run if there are NAs in the xcms object.

I guess we should ping the CAMERA R package developers.

yguitton commented 1 year ago

Hi all,

A small bump on this issue, which seems to be linked to the forum message https://community.france-bioinformatique.fr/t/probleme-creation-eic-camera/2722/2?u=yguitton

Has anyone already found a way to deal with this error? Best, Yann

melpetera commented 1 year ago

Since the problem seems to be the NA values, there are in fact at least two different ways to reformulate it:

I highlight this because, in the second case, it might be possible to "just" add a step in the CAMERA Galaxy wrapper that converts NA to 0 in the xcms object when the user asks for EICs. I am not very familiar with these objects, but maybe it is something that could be replaced manually?
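
To make the idea concrete, here is a minimal Python sketch of such a conditional "NA to 0" step, applied to a plain intensity matrix. It is only an illustration of the logic: the function names `replace_na_with_zero` and `prepare_for_eic` are hypothetical, the matrix is made-up example data, and nothing here touches a real xcms object. Where the missing values actually live inside the xcms object, and whether zeroing them is enough for the EIC step, remain open questions in this thread.

```python
import math


def replace_na_with_zero(matrix):
    """Return a copy of the matrix with NaN entries replaced by 0.0."""
    return [
        [0.0 if isinstance(value, float) and math.isnan(value) else value
         for value in row]
        for row in matrix
    ]


def prepare_for_eic(matrix, eic_requested):
    """Hypothetical wrapper step: only replace missing values when EICs are requested."""
    if eic_requested:
        return replace_na_with_zero(matrix)
    return matrix


# Made-up example: two features (rows) measured in two samples (columns).
intensities = [
    [1200.5, float("nan")],
    [float("nan"), 830.0],
]

filled = prepare_for_eic(intensities, eic_requested=True)
untouched = prepare_for_eic(intensities, eic_requested=False)
print(filled)
```

Keeping the replacement conditional leaves the NA values in place for users who do not ask for EICs, so fillChromPeaks can still be applied iteratively as described earlier in the thread.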