NBISweden / aMeta

Ancient microbiome snakemake workflow
MIT License

Move module configuration to a separate config file #41

Closed (clami66 closed 2 years ago)

clami66 commented 2 years ago

As I adapt the workflow to Kebnekaise, I can see that the config file is becoming quite long because of the envmodules section:

envmodules:
  FastQC_BeforeTrimming:
    - FastQC/0.11.9-Java-11
  FastQC_AfterTrimming:
    - FastQC/0.11.9-Java-11
  MultiQC_BeforeTrimming:
    - GCC/10.2.0
    - OpenMPI/4.0.5
    - MultiQC/1.10.1
  MultiQC_AfterTrimming:
    - GCC/10.2.0
    - OpenMPI/4.0.5
    - MultiQC/1.10.1
  MultiQC:
    - GCC/10.2.0
    - OpenMPI/4.0.5
    - MultiQC/1.10.1
  Cutadapt_Adapter_Trimming:
    - GCCcore/10.2.0
    - cutadapt/3.4
  Bowtie2_Index:
    - GCC/10.2.0
    - Bowtie2/2.4.2
  Bowtie2_Pathogenome_Alignment:
    - GCC/10.2.0
    - Bowtie2/2.4.2
    - SAMtools/1.12
  MapDamage:
    - GCCcore/10.2.0
    - parallel/20210322
    - GCC/10.2.0
    - OpenMPI/4.0.5
    - R/4.0.4
    - mapDamage/2.2.1-R-4.0.4
    - SAMtools/1.12
  KrakenUniq:
    - GCC/10.2.0
    - KrakenUniq/0.6
  Filter_KrakenUniq_Output:
    - GCC/10.2.0
    - OpenMPI/4.0.5
    - R/4.0.4
  KrakenUniq2Krona:
    - GCC/10.2.0
    - OpenMPI/4.0.5
    - R/4.0.4
    - GCCcore/10.2.0
    - KronaTools/2.8
  KrakenUniq_AbundanceMatrix:
    - GCC/10.2.0
    - OpenMPI/4.0.5
    - R/4.0.4
  Build_Malt_DB:
    - GCC/10.2.0
    - seqtk/1.3
  Malt_AbundanceMatrix:
    - GCC/10.2.0
    - OpenMPI/4.0.5
    - R/4.0.4
  Authentication:
    - GCC/10.2.0
    - OpenMPI/4.0.5
    - R/4.0.4
    - GCC/10.2.0
    - seqtk/1.3
    - SAMtools/1.12

I believe this part of the configuration should be moved to another YAML file, so that the same project-specific config file can be used in either environment (e.g. UPPMAX or Kebnekaise) without changes, while the new system-specific YAML file can be written once and then "forgotten".

percyfal commented 2 years ago

Yes, this is a good idea! I'd also suggest writing a separate modules schema file and loading the modules config only if that file exists.
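
To make the suggestion concrete, here is a rough sketch in plain Python of "load only if the file exists, then check its shape". The function names, the file path and the use of JSON are assumptions made for illustration (JSON keeps the sketch within the standard library); in the actual workflow this would presumably be a `configfile:` statement guarded by an existence check, plus `snakemake.utils.validate` against the new schema file.

```python
import json
from pathlib import Path


def validate_envmodules(section):
    """Check that the section maps rule names to lists of module names.

    Hypothetical stand-in for a schema check; raises ValueError on bad input.
    """
    if not isinstance(section, dict):
        raise ValueError("envmodules must map rule names to module lists")
    for rule, modules in section.items():
        if not isinstance(modules, list) or not all(
            isinstance(m, str) for m in modules
        ):
            raise ValueError(f"envmodules[{rule!r}] must be a list of module names")


def load_optional_envmodules(config, path):
    """Add the system-specific modules config to `config` if `path` exists.

    If the file is missing, `config` is returned unchanged, so the workflow
    can still run on systems that do not use environment modules.
    """
    path = Path(path)
    if not path.exists():
        return config
    # Assumption: JSON instead of YAML, only to avoid a third-party parser.
    extra = json.loads(path.read_text())
    validate_envmodules(extra.get("envmodules", {}))
    config.update(extra)
    return config
```

With this shape, the project config never has to mention modules at all; a cluster that needs them only drops the extra file in place.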

percyfal commented 2 years ago

@clami66 For reference, here is a summary of our Slack discussion. Since Snakemake supports multiple configuration files, all that is needed is an additional configfile statement in common.smk; the end result is a configuration merged from both files.
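
As an illustration of what "merged" means here, the sketch below mimics the merge in plain Python. It is not Snakemake's own code: `deep_merge` is a hypothetical helper, and the `samplesheet` and `options` keys are made up for the example. As far as I understand, Snakemake updates the config recursively in a similar way, with files loaded later overriding earlier ones.

```python
import copy


def deep_merge(base, override):
    """Recursively merge `override` into `base`; `override` wins on conflict."""
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            # Both sides are mappings: merge them key by key.
            deep_merge(base[key], value)
        else:
            # Scalars, lists and new keys are simply replaced or added.
            base[key] = value
    return base


# Project-specific settings (hypothetical keys), shared across clusters.
project_config = {"samplesheet": "samples.tsv", "options": {"threads": 4}}

# System-specific settings, written once per cluster.
system_config = {"envmodules": {"KrakenUniq": ["GCC/10.2.0", "KrakenUniq/0.6"]}}

# Copy first so the project config itself is left untouched.
config = deep_merge(copy.deepcopy(project_config), system_config)
```

Rules can then keep reading `config["envmodules"]` exactly as before, regardless of which file the section came from.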