loosolab / TOBIAS_snakemake

Snakemake pipeline for running TOBIAS analysis
MIT License

Not an issue but requesting help to add additional tools in snakemake yaml file #15

Open c2b2pss opened 3 months ago

c2b2pss commented 3 months ago

Hi,

I am attaching my snakemake config (yaml).

  1. When TOBIAS_snakemake runs, how many of the standard TOBIAS tools does it run? (By "tools" I mean those listed in the Wiki, excluding the "additional" ones.) It looks like most are run except for "create network". Is that correct?
  2. How would I add a rule/tool to run via the yaml? I am not sure of the exact name to include in the yaml. Would it go under "Default module parameters"?

Thanks!


#-------------------------------------------------------------------------#
#-------------------------- TOBIAS input data ----------------------------#
#-------------------------------------------------------------------------#

data:
  WT: [/home/user/TOBIAS_snakemake/bam/LNCAP_WT*.bam]  #list of .bam-files
  PR: [/home/user/TOBIAS_snakemake/bam/LNCAP_PR*.bam]  #list of .bam-files
  CR: [/home/user/TOBIAS_snakemake/bam/LNCAP_CR*.bam]

run_info:
  organism: human                           #mouse/human/zebrafish (used for macs to set "--gsize"; alternatively, set --gsize in macs parameters below)
  fasta: /home/user/Desktop/TFEA_output/hg38.fa      #.fasta-file containing organism genome. NOTE: must be uncompressed .fa or bgzip compressed compatible with samtools
  blacklist: /home/user/Downloads/hg38-blacklist.v2.bed            #.bed-file containing blacklisted regions
  gtf: /home/user/TOBIAS/gencode.v43.annotation.gtf    #.gtf-file for annotation of peaks. NOTE: must be uncompressed .gtf
  motifs: /home/user/Downloads/HOCOMOCOv11_FULL_HUMAN_mono_jaspar_format.txt          #motifs (directory with files or individual files in MEME/JASPAR/PFM format)
  output: /home/user/TOBIAS_snakemake/LNCAP_ALL                      #output directory 
  #peaks: data/merged_peaks_annotated.bed   #optional; pre-calculated annotated peaks
  #peaks_header: data/merged_peaks_annotated_header.txt #optional; header for pre-calculated annotated peaks

#Flags for parts of pipeline to include/exclude (all are True by default)
flags:
  plot_comparison: True #True/False
  plot_correction: True
  plot_venn: True
  coverage: True
  wilson: True

#-------------------------------------------------------------------------#
#----------------------- Default module parameters -----------------------#
#-------------------------------------------------------------------------#

macs: "-f BAMPE --shift -100 --extsize 200 --broad"
atacorrect: ""
footprinting: ""
bindetect: "--time-series"

mohobein commented 3 months ago

Hey,

  1. This workflow shows the TOBIAS modules used in the pipeline. There are a few helper tools that are also not used here, but otherwise, your assumption is correct.
  2. The file you showed is the Snakemake config file, which tells Snakemake which input files to use and which parts of the pipeline to run. It does not define the pipeline's structure, so adding a key to the yaml will not add a tool on its own. To change the pipeline itself and add a new rule, you would need to modify one of the .snake files in the snakefiles directory. Rules added there will be executed if they are integrated correctly. If you have never worked with Snakemake before, I'd recommend working through a tutorial first to understand the concepts and syntax. After that, you can add a rule that runs another tool or performs another processing step as needed.
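
To make the second point more concrete, here is a minimal sketch of what such an extra rule could look like, using "create network" as the example. It is not taken from the repository. The input paths, the TF names, the marker file, and the `create_network` config key are hypothetical placeholders. The `--TFBS` and `--origin` options are as I recall them from the TOBIAS wiki, so please check them against `TOBIAS CreateNetwork --help` before relying on them.

```snakemake
# Sketch of an additional rule for one of the .snake files.
# Everything marked "hypothetical" is an assumption for illustration only.

rule create_network:
    input:
        # hypothetical: per-TF binding-site files produced earlier in the pipeline
        tfbs = expand("{outdir}/network_input/{tf}.bed",
                      outdir=config["run_info"]["output"],
                      tf=["TF_A", "TF_B"]),
        # hypothetical: user-supplied motif-to-gene mapping file
        origin = "data/motif2gene_mapping.txt"
    output:
        # touch() creates an empty marker file once the shell command succeeds
        touch(config["run_info"]["output"] + "/network/create_network.done")
    params:
        # hypothetical config key for extra command-line options,
        # analogous to the macs/bindetect strings in the config
        extra = config.get("create_network", "")
    shell:
        "TOBIAS CreateNetwork --TFBS {input.tfbs} --origin {input.origin} {params.extra}"
```

Snakemake only runs a rule when something requests its output, so the marker file would also have to be listed among the inputs of the pipeline's final target rule (or named as a target on the command line). With the sketch above, a line such as `create_network: ""` under "Default module parameters" would then only pass extra options to the rule. The rule itself still has to exist in a .snake file.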