phnmnl / container-galaxy-k8s-runtime

PhenoMeNal runtime for Galaxy running inside a container orchestrator
Apache License 2.0

Add Eco-Metabolomics #210

korseby closed 6 years ago

korseby commented 6 years ago

Hi all,

please add the Eco-Metabolomics section to Galaxy.

PR is against develop.

The PR includes all tools and the workflow. The tools work out of the box, but import modules need to be adjusted to work with the latest ISA-tools.

This is the initial PR. Please list remaining issues.

Best wishes, Kristian

ilveroluca commented 6 years ago

Hi @korseby. The new tools load perfectly. There are some things that I don't quite understand though.

ilveroluca commented 6 years ago

In this example I've generalized (I think!) one of the tools, taking out all references to MTBLS520.

Note that if you want to share the specific analysis you performed for the MTBLS520 study, we can assemble a workflow that uses the generic tools but fixes the parameters, which (given the downloader) would also fix the input dataset and thus reproduce the same output result.

korseby commented 6 years ago

Hi @ilveroluca,

I have updated the naming conventions. I would like to keep the separate mtbls520 importers and extractors until the ISA-Download module works.

Can you have a look again?

Best wishes, Kristian

ilveroluca commented 6 years ago

What about these four tools?

korseby commented 6 years ago

These tools import the MTBLS dataset. I want to keep them until the ISA-Importer works. Currently, I have some problems creating the dataset collections that contain only the study mzML files and only the QC mzML files in pos/neg mode. The same applies to the NMR workflow, by the way.

sneumann commented 6 years ago

So to get the study mzML and the QC mzML files as separate sets, we would need the "official" MTBLS/ISA downloader, the isa-slicer to split by QC/study, and mzML extractors for these two slices. @djcomlab, would that be feasible with the current design of your tools?

djcomlab commented 6 years ago

@sneumann I've started work on ISAslicer2 to build in the more fully featured querying - query by Factor Values, Material Type and assay type (Measurement/Technology types).

Splitting by QC/study I guess would be on Material Type or similar?

ilveroluca commented 6 years ago

@korseby can we make them generic, so that the user can insert the dataset id as a parameter? (i.e., suppose I wanted to try them on an MTBLS dataset different from MTBLS520)
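A minimal sketch of what such a parameter check could look like, assuming study accessions have the form `MTBLS` followed by a positive number. The function name `normalize_study_id` is hypothetical and not part of the existing tools:

```python
import re

# Assumption: a study accession is "MTBLS" followed by a positive integer.
_STUDY_ID = re.compile(r"MTBLS[1-9]\d*")


def normalize_study_id(value):
    """Return the upper-cased study ID, or raise ValueError if it is malformed.

    Hypothetical helper for illustration only; it validates the format of
    the ID and does not check that the study actually exists.
    """
    candidate = value.strip().upper()
    if not _STUDY_ID.fullmatch(candidate):
        raise ValueError(f"not a valid study ID: {value!r}")
    return candidate
```

With a check like this in the wrapper, the dataset id becomes a normal user-supplied input rather than a value baked into the tool name.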

korseby commented 6 years ago

The preferred method is to use ISA-Slicer and/or one of the other ISA tools. However, before I can test the workflow I need the customized MTBLS520 tools, because no other tools provide that functionality and there is no other way to create the required dataset collections (the files are too large, and manual upload does not work except via FTP, which is broken). The files are required for the entire workflow, and currently I cannot replace the above tools with the ISA tools.

You can pass an MTBLS ID other than 520 to mtbls520_01_mtbls_download, and the tool would download the entire study with that ID. The mtbls520_02* tools are rather specific: they extract the study files in pos or neg mode (specified as a parameter) and/or the QC files in pos or neg mode (specified as a parameter and by using a different tool). The entire workflow is a rather specific use case, and it will not be possible to make it more generic before the end of phnmnl, simply because there are no (published) "Eco-Metabolomics" studies on MetaboLights yet.

korseby commented 6 years ago

@djcomlab There are two assays: pos and neg mode. The easiest way would be to parse all files referenced in 'Raw Spectral Data File' and take all files except those that include "blank value" in 'Factor Value[species]' or vice versa. For the workflow, I also need the associated assay file, study file and maf files in the correct polarity.
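The selection rule can be sketched roughly as follows. This is only an illustration, assuming the table is tab-separated and that 'Raw Spectral Data File' and 'Factor Value[species]' appear as columns in the same table. The function name `split_raw_files` and the example file names are made up:

```python
import csv
import io

RAW_COL = "Raw Spectral Data File"
FACTOR_COL = "Factor Value[species]"


def split_raw_files(table, blank_marker="blank value"):
    """Split raw file names into (study_files, blank_files).

    `table` is an open text stream of a tab-separated table. A row counts
    as blank when its species factor value contains `blank_marker`
    (case-insensitive). Rows without a raw file name are skipped, and
    each file name is reported once, in order of first appearance.
    """
    study_files, blank_files = [], []
    for row in csv.DictReader(table, delimiter="\t"):
        raw = (row.get(RAW_COL) or "").strip()
        if not raw:
            continue
        factor = (row.get(FACTOR_COL) or "").strip().lower()
        target = blank_files if blank_marker in factor else study_files
        if raw not in target:
            target.append(raw)
    return study_files, blank_files


# Made-up example rows, for illustration only.
example = (
    "Sample Name\tFactor Value[species]\tRaw Spectral Data File\n"
    "s1\tSpecies A\tpos_s1.mzML\n"
    "qc1\tblank value\tpos_qc1.mzML\n"
    "s2\tSpecies B\tpos_s2.mzML\n"
)
study, blank = split_raw_files(io.StringIO(example))
```

Running this once per assay (pos and neg) would give the two file lists per polarity. Picking up the matching assay, study and maf files is not covered here.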

korseby commented 6 years ago

I updated the version tag of the container for the stable release to v1.1_cv0.1.27.

korseby commented 6 years ago

Hi @ilveroluca, I made the workflow and tools more generic. Please review.

ilveroluca commented 6 years ago

As it stands, it looks like the tools need to be updated to reflect the generalization implemented in these wrappers. For instance, the downloader fails if you try to download anything other than MTBLS520. Also, the tool container should follow our Dockerfile guidelines.

korseby commented 6 years ago

Updated as requested.