galaxyproteomics / tools-galaxyp

Galaxy Tool Shed repositories maintained and developed by the GalaxyP community
MIT License

Omega2 & Sipros3 (Metagenome Assembly & Metaproteome ID/Quant) #88

Open jhervey4 opened 7 years ago

jhervey4 commented 7 years ago

All: I am rather new to this group & would like to introduce open-source applications that I frequently use in our metagenome/metaproteome workflows (primarily on HPC systems):

  1. Omega2: https://bitbucket.org/omicsbio/omega2 & accompanying instructions: http://omega.omicsbio.org/instructions [Purpose: metagenome assembler that uses an overlap-graph approach rather than a de Bruijn graph approach. Works best for Illumina reads.] [Presently undergoing significant development & may be worth introducing at any upcoming events.]
     Omega2 data preprocessing prerequisites:
     a. Sickle: https://github.com/najoshi/sickle
     b. ecc.sh (an error correction component of BBMap): https://sourceforge.net/projects/bbmap/ & http://jgi.doe.gov/data-and-tools/bbtools/

  2. Canu: https://github.com/marbl/canu (a fork of the Celera Assembler designed for long, noisy reads) [Purpose: assembly of Oxford Nanopore Technologies MinION reads; documentation: https://github.com/marbl/canu]

  3. Sipros3: https://github.com/Omics-Bio/Sipros3 [Purpose: Uses OpenMPI/MPI to search very large FASTA files (e.g. those from metagenome assemblies with millions of entries).] [Sipros/ProRata is quite flexible for integrating protein ID with protein quantification, stable isotope probing, and PTM searches: http://sipros.omicsbio.org/ ] [Under significant development & may also be of interest at any upcoming events.]

  4. UniFam: https://github.com/chaij/UniFam [Purpose: Enables large-scale protein annotation with UniProt-based families.]
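
For anyone unfamiliar with the distinction mentioned under item 1, here is a minimal toy sketch of the two graph models. It is not Omega2's actual implementation: the function names, the `min_len` and `k` parameters, and the example reads are all made up for illustration, and a real assembler would also handle sequencing errors, reverse complements, and graph simplification.

```python
from itertools import permutations


def longest_overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def overlap_graph(reads, min_len=3):
    """Overlap graph: nodes are whole reads, edges are suffix/prefix overlaps."""
    graph = {}
    for a, b in permutations(reads, 2):
        k = longest_overlap(a, b, min_len)
        if k:
            graph[(a, b)] = k  # edge a -> b, weighted by overlap length
    return graph


def de_bruijn_edges(reads, k=4):
    """De Bruijn graph: nodes are (k-1)-mers, one edge per distinct k-mer."""
    edges = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            edges.add((kmer[:-1], kmer[1:]))
    return edges


if __name__ == "__main__":
    reads = ["ATGCGT", "GCGTAC", "GTACCA"]  # hypothetical toy reads
    print(overlap_graph(reads))
    print(sorted(de_bruijn_edges(reads)))
```

In the overlap graph each read stays intact as a node, whereas the de Bruijn graph breaks reads into k-mers before building edges.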

If any of these are potentially interesting applications to others in the group, please LMK-- I'd be pleased to be able to field questions on the tools and/or work to get them incorporated into the shed.

Thanks!

bgruening commented 7 years ago

@jhervey4 welcome and thanks for your input! We have already created a conda package for Canu, so we can start with that one. Does anyone want to learn Galaxy tool development?

jhervey4 commented 7 years ago

@bgruening I will only be able to attend the upcoming event on Thursday, December 15th-- looking forward to participating then!

bgruening commented 7 years ago

Yeah! This will be fun!

PratikDJagtap commented 7 years ago

@bgruening or @jj-umn - would it be possible to look at the Sipros tool mentioned above and assess the feasibility of packaging it within Galaxy?

jhervey4 commented 7 years ago

@bgruening or @jj-umn -- there is a version of Sipros_openmp (version 3) which does not use the MPI/OpenMP feature & also comes as a precompiled binary in the download package: http://sipros.omicsbio.org/software/

I'd like to note that this software is presently undergoing significant modification, so it may be prudent (and less complicated) to include only the binary for Sipros_openmp in the toolshed at this time. If someone could please show me how to do this, I would appreciate it.

In the interim, I can work on writing up the list of pre-requisites (as well as a list of end-user tips & tricks for use) for the installation process.

Thanks!!

bgruening commented 7 years ago

@jhervey4 the new way of creating packages is actually to use bioconda. This will give us a package and a Docker container. I can help you create such a package. In conda we recommend building from source so that the package is executable on old HPC environments, but we also have some packages that ship precompiled binaries.