galaxyproject / tools-iuc

Tool Shed repositories maintained by the Intergalactic Utilities Commission
https://galaxyproject.org/iuc
MIT License

IUC Contribution Fest - Improve QIIME Tool Generator and Tools #426

Closed bgruening closed 8 years ago

bgruening commented 8 years ago

As proposed by @lparsons, we will also try to work on QIIME during our 2nd IUC Contribution Fest. This issue will keep track of all QIIME-related development. For more information about the Codefest and the initial discussion, please see https://github.com/galaxyproject/tools-iuc/issues/299

xref: http://dev.list.galaxyproject.org/QIIME-tools-for-Galaxy-WIP-call-for-collaborators-td4668035.html

ping @lparsons and @pjbriggs

lleroi commented 8 years ago

Hi everyone,

I set up the 454 qiime workflow on a private Galaxy instance. I used the qiime-galaxy scripts to generate the wrappers. For the scripts used in the 454 workflow, I removed the tgz input/output option and chose which options can be specified in the XML wrappers.

For the pick_otus.py wrapper, I modified how the clustering options are displayed (e.g. swarm options are displayed if swarm is selected for the clustering, uclust options if uclust is selected, etc.). I selected 6 of the 12 clustering methods (those most interesting for our team): uclust, uclust_ref, swarm, usearch, usearch_ref and sumaclust.
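
To illustrate the idea, here is a minimal Python sketch that builds a Galaxy `<conditional>` block with one `<when>` section per clustering method, so that only the options for the selected method are shown. It is not the actual wrapper: the parameter names (`clustering`, `otu_picking_method`) and the per-method options are placeholders, and the real pick_otus.py options would have to be filled in.

```python
import xml.etree.ElementTree as ET

# The six clustering methods named in this comment.
METHODS = ["uclust", "uclust_ref", "swarm", "usearch", "usearch_ref", "sumaclust"]


def build_conditional(methods, params_by_method):
    """Return a Galaxy <conditional> element with one <when> per method."""
    # "clustering" and "otu_picking_method" are hypothetical names.
    cond = ET.Element("conditional", name="clustering")
    select = ET.SubElement(
        cond, "param", name="otu_picking_method", type="select",
        label="OTU picking method",
    )
    for method in methods:
        ET.SubElement(select, "option", value=method).text = method
    # Each <when> holds only the parameters relevant to that method.
    for method in methods:
        when = ET.SubElement(cond, "when", value=method)
        for attrs in params_by_method.get(method, []):
            ET.SubElement(when, "param", **attrs)
    return cond


# Placeholder options only; these are NOT real pick_otus.py options.
placeholder_params = {
    m: [{"name": f"{m}_placeholder", "type": "text",
         "label": f"Placeholder option for {m}"}]
    for m in METHODS
}

conditional = build_conditional(METHODS, placeholder_params)
ET.indent(conditional)  # pretty-print; requires Python 3.9+
print(ET.tostring(conditional, encoding="unicode"))
```

Generating the block from a method-to-options mapping keeps the select list and the `<when>` sections in sync when methods are added or dropped.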

What do you think about it?

If that's OK with you, I will make a fork, create a branch, commit the changes and make a pull request.

Laura

bgruening commented 8 years ago

@lleroi I think this is a good start. I will create an empty branch. Can you create your PR against this one here: https://github.com/galaxyproject/tools-iuc/pull/431

I guess we can work all together on this branch then. Thanks a lot!

lparsons commented 8 years ago

Good start. Before we get too far, do we want to:

  1. Update individual wrappers, starting from the output of the Qiime Galaxy tool generation scripts
  2. Update the Qiime Galaxy tool generation scripts

Personally, I prefer option 1 as I think it's more straightforward and will lead more quickly to some useful tools.

In general, the tools created by the Qiime Galaxy scripts package up all the output into tar files, which are not very usable in Galaxy. One option that I've started working on is to split out the most relevant output files from scripts and package up the rest. Another option would be to write wrappers that take the tar files as input. I'm less in favor of that, though it has some appeal since the qiime tools tend to expect the user to run everything in the same directory, reading and writing output to various subdirectories.
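
As a rough sketch of the first option (split out the relevant files, package up the rest), something like the following could run after a script finishes. The function name, the file names and the directory layout are all made up for illustration and are not taken from any existing wrapper.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def split_outputs(output_dir, keep, dest_dir, archive_path):
    """Copy files listed in `keep` to dest_dir; tar everything else.

    `keep` holds paths relative to output_dir. archive_path should live
    outside output_dir so the archive does not end up inside itself.
    """
    output_dir = Path(output_dir)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    keep = set(keep)
    # List the files before the archive is created.
    files = sorted(p for p in output_dir.rglob("*") if p.is_file())
    kept = []
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in files:
            rel = path.relative_to(output_dir).as_posix()
            if rel in keep:
                target = dest_dir / path.name
                shutil.copy2(path, target)
                kept.append(target)
            else:
                tar.add(path, arcname=rel)
    return kept


# Demo with hypothetical file names in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "out"
    (out / "step1").mkdir(parents=True)
    (out / "otu_table.biom").write_text("{}")
    (out / "log.txt").write_text("log")
    (out / "step1" / "intermediate.txt").write_text("tmp")

    archive = Path(tmp) / "other_outputs.tgz"
    kept = split_outputs(out, ["otu_table.biom"], Path(tmp) / "history", archive)
    kept_names = [p.name for p in kept]
    with tarfile.open(archive) as tar:
        archived = tar.getnames()

print(kept_names)
print(archived)
```

The kept files would become individual history datasets, and the archive would be a single extra output for anyone who needs the intermediate files.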

bgruening commented 8 years ago

@lparsons I trust your decision here and accept whatever comes first :)

lparsons commented 8 years ago

I'd like to get an Illumina qiime workflow working in Galaxy.

I've started on this at https://github.com/lparsons/galaxy_tools/blob/qiime/tools/qiime1.9.0/. As I see it, this requires the following things:

The updates to the wrappers have so far involved changing output file handling so that individual output files appear in the history, adding tests and test data, and general cleanup (macros, citations, etc.) to make sure they all pass planemo linting and that the tests run.

lparsons commented 8 years ago

Great, I have a few things that I need to take care of first, but when I get back to this, I'll prepare a PR against this branch so we can start coordinating.

I briefly checked in on IRC. Is there a Google Hangout as well?

blankenberg commented 8 years ago

@lparsons hangout at https://hangouts.google.com/hangouts/_/iems2ffkpd4s7cyjoq4mi3hgwma

lparsons commented 8 years ago

Initial work in my fork: https://github.com/lparsons/tools-iuc/tree/illumina-workflow-tools. Let me know if it would help to submit a PR now, otherwise I'll continue working here.

lparsons commented 8 years ago

One piece of work that is somewhat isolated, and would be very useful, is to get an initial implementation of biom tools (e.g. summarize-table). See above list.

bgruening commented 8 years ago

@lparsons if you don't mind, please submit a PR so we can coordinate the effort here for tomorrow. Regarding biom tools: I know @yvanlebras is already using them in his workflow, so we can probably include his wrapper.

lparsons commented 8 years ago

@bgruening Don't mind at all. PR submitted. Will look into @yvanlebras biom wrappers tomorrow, thanks for the tip.

pjbriggs commented 8 years ago

@lparsons I've started looking at updating your pick_open_reference_otus.py wrapper, along the lines of what it looks like you've already done for count_seqs.py. Please let me know if you're already working on this.

lparsons commented 8 years ago

@pjbriggs I barely started with that yesterday and got pulled away. It'd be great if you could work on that and incorporate the use of biom datatype for the output. I'll see what's going on with biom tools from @yvanlebras. Thanks!

pjbriggs commented 8 years ago

@lparsons I've made a little progress with the pick_open_reference_otus.py wrapper but have run out of time today to do any more. So I've submitted a PR in case you have time to look at it and make suggestions/criticisms (in particular any advice on which outputs should be captured in the history). I'll try and do more tomorrow, but hope this is useful for now.

bgruening commented 8 years ago

@lleroi @pjbriggs @lparsons I guess we need to adapt all tools to use macros and run a few sed commands on top of that. The generator's output is not very clean.
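
A sed-style pass of that kind could look roughly like the Python sketch below. It assumes a shared `macros.xml` that defines a `requirements` macro, and it replaces the inline `<requirements>` block in a generated wrapper with an `<expand>`. The sample wrapper text is invented for the example and simplified; it only covers this one substitution.

```python
import re


def use_requirements_macro(xml_text):
    """Replace an inline <requirements> block with a macro expansion."""
    # Swap the inline block for the (assumed) shared "requirements" macro.
    xml_text = re.sub(
        r"<requirements>.*?</requirements>",
        '<expand macro="requirements"/>',
        xml_text,
        flags=re.DOTALL,
    )
    # Import macros.xml right after the opening <tool ...> tag, once.
    if "<macros>" not in xml_text:
        xml_text = re.sub(
            r"(<tool\b[^>]*>)",
            r"\1\n    <macros>\n        <import>macros.xml</import>\n    </macros>",
            xml_text,
            count=1,
        )
    return xml_text


# Invented, simplified wrapper text for illustration only.
sample = """<tool id="count_seqs" name="Count sequences" version="1.9.0">
    <requirements>
        <requirement type="package" version="1.9.0">qiime</requirement>
    </requirements>
    <command>count_seqs.py</command>
</tool>"""

converted = use_requirements_macro(sample)
print(converted)
```

Text substitution like this only handles the uniform parts of the generated wrappers; anything that depends on a tool's specific inputs and outputs still needs editing by hand.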

lparsons commented 8 years ago

There are a LOT of tools (and not all of them are that useful). Also, many of the tools have very complicated output (many files, etc.) so it's a bit more than using sed. That's why I decided to focus on a workflow first, to get something useful. Then, using those tools as a template, start updating wrappers for other tools.

bgruening commented 8 years ago

Qiime is worked on in https://github.com/galaxyproject/tools-iuc/pull/431