gogetdata / ggd-recipes

conda recipes for genomic data
MIT License

Mapper indexes, jinja templating and travis matrix #36

Open johanneskoester opened 6 years ago

johanneskoester commented 6 years ago

Hey guys, I have just thought a bit about the mapper index recipes. I think we should define them like this:

meta.yaml

package:
  name: "bwa-index-{{ GENOME_BUILD }}"
  version: "1.0.0"

build:
  noarch: generic

requirements:
  host:
    - bwa
    - {{ GENOME_BUILD }}-sequence
  run:
    - bwa

build.sh

bwa index $ggd_hg38_sequence

conda_build_config.yaml

bwa:
  - 0.7
  - 0.6
pin_run_as_build:
    bwa:
      max_pin: x.x

The last file (a conda-build 3 feature) lets us automatically build for different BWA versions, and it pins the index package to the matching version, so that people cannot end up with an index that does not match their BWA version.
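To make the effect of `max_pin: x.x` concrete, here is a minimal Python sketch of the idea, not conda-build's actual implementation. The function name `run_pin` is hypothetical, it assumes purely numeric version components, and the exact constraint string conda-build renders may differ in form.

```python
def run_pin(package: str, build_version: str, max_pin: str = "x.x") -> str:
    """Approximate the run requirement implied by pinning to the build version.

    max_pin "x.x" keeps the first two version components fixed, so the
    exclusive upper bound is the next value of the second component.
    """
    keep = len(max_pin.split("."))          # number of components held fixed
    parts = build_version.split(".")[:keep]
    parts[-1] = str(int(parts[-1]) + 1)     # bump the last kept component
    upper = ".".join(parts)
    return f"{package} >={build_version},<{upper}"


# An index built against bwa 0.7.17 could only be installed next to a 0.7.x bwa.
print(run_pin("bwa", "0.7.17"))
```

Under these assumptions, an index package built with a 0.7 release would refuse to install alongside bwa 0.6 or 0.8, which is the mismatch the pin is meant to prevent.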

Finally, if we add GENOME_BUILD as a matrix to .travis.yml, we get builds for all the different genomes for free:

env:
  matrix:
    - GENOME_BUILD=hg38
    - GENOME_BUILD=hg19
    - GENOME_BUILD=mm10

This will also help a lot to reduce redundancy here, because we don't necessarily need special recipes for all genomes. In line with this, I also suggest that the tree be extended by recipes/generic. The bwa-index recipe above would then sit in recipes/generic/bwa-index and automatically yield packages like bwa-index-hg38 and bwa-index-mm10, each built for both bwa 0.6 and 0.7. All with just one recipe to maintain.
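As a rough illustration of what the combined matrix would produce, the following Python sketch enumerates the genome builds from .travis.yml against the BWA versions from conda_build_config.yaml. It only models the naming scheme proposed above. The helper `expand_matrix` is hypothetical and nothing here calls conda-build itself.

```python
from itertools import product

# Values taken from the proposed .travis.yml and conda_build_config.yaml.
genome_builds = ["hg38", "hg19", "mm10"]
bwa_versions = ["0.7", "0.6"]


def expand_matrix(genomes, versions):
    """Return (package name, bwa version) for every genome/version pair."""
    return [(f"bwa-index-{genome}", bwa) for genome, bwa in product(genomes, versions)]


variants = expand_matrix(genome_builds, bwa_versions)
for name, bwa in variants:
    print(f"{name} (built against bwa {bwa})")
```

Three genome builds times two BWA versions gives six packages from the single generic recipe.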

arq5x commented 6 years ago

This seems reasonable to me. Any feedback, @mikecormier @jbelyeu @brentp ?

jbelyeu commented 6 years ago

I believe Mike is actually planning to implement this as soon as he gets some bugs ironed out of the test integration. It does look like a good idea to me for simplification of the structure.

apeltzer commented 6 years ago

Looks reasonable to me too - great that you're working on it! Happy to test things if there is need! @jbelyeu