workflow4metabolomics / tools-metabolomics

Galaxy tools for metabolomics maintained by Workflow4Metabolomics
https://workflow4metabolomics.org/
GNU General Public License v3.0

Run xcmsSet on each input file individually #2

Closed sneumann closed 7 years ago

sneumann commented 8 years ago

Then Galaxy can parallelize the execution. For backwards compatibility, check at https://github.com/workflow4metabolomics/xcms/blob/master/src/xcms_w4m_script/xcms.r#L122 if a ZIP or an mzML/netCDF/... is provided.

Then you get one xcmsSet per input file. After all N xcmsSets are created, use c(xs1, xs2, xs3, ...) to combine the individual xcmsSets into one big xcmsSet, as the existing node does. Yours, Steffen
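A minimal sketch of that idea in R, not the actual W4M script. `run_one()` and `merge_xsets()` are hypothetical names; the sketch assumes the xcms package, its `xcmsSet()` constructor, and the `c()` method for xcmsSet objects mentioned above.

```r
# Sketch only: run_one() and merge_xsets() are hypothetical helper names,
# not functions from the W4M xcms.r script.

# One Galaxy job per input file: build an xcmsSet from a single raw file.
# Requires the xcms package at call time.
run_one <- function(raw_file, ...) {
  xcms::xcmsSet(files = raw_file, ...)
}

# After all N jobs have finished: fold the individual xcmsSets together with c().
# `combine` is a parameter only so the folding logic can be tried without xcms.
merge_xsets <- function(xsets, combine = c) {
  if (length(xsets) == 0) stop("no xcmsSet objects to merge")
  Reduce(function(acc, xs) combine(acc, xs), xsets)
}
```

With xcms installed, usage would look roughly like `xsets <- lapply(files, run_one)` followed by `xset <- merge_xsets(xsets)`, where `files` is a vector of raw file paths.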

lecorguille commented 8 years ago

We will be able to carry on with new developments thanks to planemo shed_test for all tools in this repo: https://github.com/workflow4metabolomics/xcms/issues/3

lecorguille commented 8 years ago

Let's go! https://github.com/workflow4metabolomics/xcms/pull/13

lecorguille commented 7 years ago

In progress: https://github.com/workflow4metabolomics/xcms/pull/13 Needs to be tested.

lecorguille commented 7 years ago

Available from the testtoolshed since version 2.1.0

    docker run -d -p 8080:80 quay.io/workflow4metabolomics/galaxy-workflow4metabolomics:beta

The CAMERA part will fail. I'm working on that https://github.com/workflow4metabolomics/camera/issues/15

lecorguille commented 7 years ago

Available within the dev galaxy instance

lecorguille commented 7 years ago

Unfortunately, we currently can't import CDF files in the dev instance because it needs to be updated. But can you try with some other format to:

melpetera commented 7 years ago

Can't manage to make the merge work:

Tool execution generated the following error message:

Fatal error: Exit code 1 ()
Warning message:
'loadRcppModules' is deprecated.
Use 'loadModule' instead.
See help("Deprecated") 
Error in peaklist[[i]][, "sample"] : subscript out of bounds
Calls: c -> c.xcmsSet
Execution halted

The tool produced the following additional output:

  XSET MERGING...
QC_1 
QC_2 

sneumann commented 7 years ago

Need more debugging information. The warning comes from the mzR and Rcpp packages, but the merge should (?!) still work. If not, I need to add a sanity check to c.xcmsSet(). In which R file is this c(xs1, xs2, ...) called?

lecorguille commented 7 years ago

@melpetera I observed this behaviour when I merged MM8 and MM14 obtained with the default parameters of xcmsSet. I got this kind of xset:

    XSET OBJECT INFO
An "xcmsSet" object with 1 samples

Time range: Inf--Inf seconds (Inf--Inf minutes)
Mass range: Inf--Inf m/z
Peaks: 0 (about 0 per sample)
Peak Groups: 0 
Sample classes: .

lecorguille commented 7 years ago

@sneumann Here

melpetera commented 7 years ago

Oh right! I didn't notice... I'll try to change my initial xcmsSet parameters.

lecorguille commented 7 years ago

Meanwhile, I should raise an error, because currently this kind of result shows up green (i.e. as a successful job).
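One way such a check could look, as a sketch under assumptions rather than the fix that was actually implemented. `assert_has_peaks()` is a hypothetical helper; `peak_table` stands for the peak matrix of a single xcmsSet (one row per detected peak), for example what `xcms::peaks(xs)` returns.

```r
# Sketch only: assert_has_peaks() is a hypothetical helper, not part of xcms
# or of the W4M scripts.
# `peak_table` is assumed to be the peak matrix of one xcmsSet.
assert_has_peaks <- function(peak_table, sample_name) {
  # NROW() is 0 for NULL and for a matrix without rows.
  if (NROW(peak_table) == 0) {
    stop(sprintf("No peaks detected for sample '%s'; check the xcmsSet parameters",
                 sample_name), call. = FALSE)
  }
  invisible(TRUE)
}
```

When the script is run with Rscript, an uncaught `stop()` halts execution with a non-zero exit status, which is what would let Galaxy flag the job as failed instead of green.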

melpetera commented 7 years ago

Now that my xcmsSet output files are good, the whole workflow works perfectly!

lecorguille commented 7 years ago

ping at least @melpetera @yguitton

After some nocturnal reflections, I think I should simplify the form around the inputs.

Currently:

      <conditional name="inputs">
            <param name="input" type="select" label="Choose your inputs method" >
                <option value="zip_file" selected="true">Zip file from your history containing your chromatograms</option>
                <option value="single_file">mzXML file from your history</option>
            </param>
            <when value="zip_file">
                <param name="zip_file" type="data" format="no_unzip.zip,zip" label="Zip file" />
            </when>
            <when value="single_file">
                <param name="single_file" type="data" format="mzxml,netcdf" label="Single file" />
            </when>
      </conditional>

So users have to choose the input method, zip_file or single_file (mzxml, ...), and then select their input from their history. What do you think about accepting both zip files and individual files in the same param?

<param name="file" type="data" format="no_unzip.zip,zip,mzxml,mzml,netcdf,mzdata" label="Single file" />

If you agree with that, unfortunately, at some point, I will ask you to test the whole workflow again.
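With a single param, the choice between the two code paths would move from the form into the script. A rough sketch of that dispatch in R: `input_kind()` is a hypothetical helper, and `ext` is assumed to be the Galaxy datatype extension of the selected dataset, passed to the script by the tool wrapper. The format names are the ones listed in the proposed param.

```r
# Sketch only: input_kind() is a hypothetical helper. `ext` is assumed to be
# the Galaxy datatype extension of the selected dataset.
input_kind <- function(ext) {
  ext <- tolower(ext)
  zip_formats <- c("no_unzip.zip", "zip")
  single_formats <- c("mzxml", "mzml", "netcdf", "mzdata")
  if (ext %in% zip_formats) {
    "zip_file"
  } else if (ext %in% single_formats) {
    "single_file"
  } else {
    stop(sprintf("Unsupported input format: '%s'", ext), call. = FALSE)
  }
}
```

For example, `input_kind("zip")` selects the zip branch and `input_kind("mzxml")` the single-file branch; any other extension stops with an error.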

lecorguille commented 7 years ago

Pong: @bgruening

bgruening commented 7 years ago

ping! :) I'm definitely in favor of the single input without a choice, so that you put everything in one format attribute. This should be a much better UX and less code :) In the long run I would even try to remove zip from the supported file types.

Stupid question: what happens if the zip contains 10 files?

lecorguille commented 7 years ago

Done: https://github.com/workflow4metabolomics/xcms/pull/44

Available within the dev galaxy instance

lecorguille commented 7 years ago

@yguitton The netCDF and mzData datatypes are now supported within the galaxydev instance. Try again! :)

melpetera commented 7 years ago

Tested on dev galaxy instance:

sneumann commented 7 years ago

Cool! What is the link to the dev galaxy instance?

yguitton commented 7 years ago

OK for xcmsSet
OK for single and multiple mzXML
OK for single & multiple CDF in a zip
OK for single & multiple mzData in a zip

Note: there is still an issue with accented characters if any are present inside mzXML files.

lecorguille commented 7 years ago

@sneumann The URL is https://galaxydev.workflow4metabolomics.org
If you have trouble logging in, let me know :)

lecorguille commented 7 years ago

Validated 👍 Thanks to my testers

lecorguille commented 7 years ago

To really close this thread

Here is a little screencast about how to run xcmsSet in parallel within W4M using single files [link]

jfrancoismartin commented 7 years ago

Sorry for the delay. I made some mistakes in my tests... Everything is ok for me now.