lovettse closed this issue 5 years ago
This is an excellent feature request! 👍
I think the generation of a final report will also address Issue #9, in that we can add a single "final_report" rule that will run all of the dependent rules required to populate the final report with data. That will make the snakemake workflow execution command significantly shorter.
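To illustrate why a single aggregate target shortens the command, here is a minimal Python sketch of how requesting one target pulls in everything it depends on. This is not Snakemake code, and every rule name except "final_report" is a hypothetical placeholder rather than a rule from this repository.

```python
# Hypothetical dependency table: only "final_report" comes from this thread,
# the other rule names are placeholders for illustration.
RULE_DEPS = {
    "final_report": ["read_filtering", "assembly", "annotation"],
    "annotation": ["assembly"],
    "assembly": ["read_filtering"],
    "read_filtering": [],
}


def execution_order(target, deps=RULE_DEPS):
    """Return the rules needed for `target`, dependencies first."""
    order = []
    seen = set()

    def visit(rule):
        if rule in seen:
            return
        seen.add(rule)
        for dep in deps[rule]:
            visit(dep)
        # A rule is appended only after all of its dependencies.
        order.append(rule)

    visit(target)
    return order


if __name__ == "__main__":
    print(execution_order("final_report"))
```

Asking for the one target is enough to schedule the rest, which is the effect a "final_report" rule would have on the command line.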
Changes needed for version 1.2
snakemake --use-singularity post_processing_move_samples_dir_workflow
workflows/data/<samplename> directory circumventing the primary .gitignore directive to ignore the data directory, while adding a lower-level .gitignore to ignore large files
Candidate .gitignore for a completed workflow. This will allow you to commit the files needed but ignore most of the stuff from prokka and assembler intermediates.
*metaspades/
*megahit/
*.err
*.faa
*.ffn
*.fna
*.fsa
*.gbk
*.gff
*.sqn
*.tbl
*.gz
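A rough way to sanity-check which files the candidate patterns above would skip, sketched in Python. It only approximates gitignore matching (no negation, no anchored patterns, no `**`), and the example paths in the usage note are hypothetical.

```python
from fnmatch import fnmatchcase

# Patterns copied from the candidate .gitignore above.
PATTERNS = [
    "*metaspades/", "*megahit/", "*.err", "*.faa", "*.ffn", "*.fna",
    "*.fsa", "*.gbk", "*.gff", "*.sqn", "*.tbl", "*.gz",
]


def is_ignored(rel_path, patterns=PATTERNS):
    """Approximate check: would the file at `rel_path` be ignored?"""
    parts = rel_path.split("/")
    parent_dirs = parts[:-1]
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory-only pattern: compare against each parent directory name.
            candidates = parent_dirs
            name_pattern = pattern[:-1]
        else:
            # Otherwise a match on any path component ignores the file.
            candidates = parts
            name_pattern = pattern
        if any(fnmatchcase(name, name_pattern) for name in candidates):
            return True
    return False
```

For example, `is_ignored("sample_metaspades/contigs.fasta")` and `is_ignored("prokka/sample.gff")` are both true, while `is_ignored("0-summary-report.html")` is false, so the report itself can still be committed.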
You can view the updated report by either:
workflows/data/SRR606249_subset10_1_reads_finished/0-summary-report.html. A few notes:
snakemake --use-singularity post_processing_move_samples_dir_workflow. In this case, the file lives at workflows/data/SRR606249_subset10_1_reads_finished/0-summary-report.Rmd
params$id. (More info...). E.g.:
params:
  id: "SRR606249_subset10_1_reads"
post_processing_move_samples_dir_workflow workflow, could create the Rmarkdown source, replacing the parameter above with the sample ID, and then issuing the rmarkdown::render() function to create the final report.
cc @cgrahlm @kternus
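The idea of generating the Rmarkdown source per sample could be sketched roughly as below. The template body and title are hypothetical (the real report source is not shown in this thread), and the helper names are made up for illustration. The sketch only builds the Rscript command, it does not run it.

```python
from pathlib import Path

# Hypothetical minimal template; only the params/id header mirrors the
# example above.
RMD_TEMPLATE = """---
title: "Summary report"
params:
  id: "{sample_id}"
---

Sample: `r params$id`
"""


def write_report_source(sample_id, out_dir):
    """Write an Rmd file with the sample ID substituted into the params header."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    rmd_path = out_dir / "0-summary-report.Rmd"
    rmd_path.write_text(RMD_TEMPLATE.format(sample_id=sample_id))
    return rmd_path


def render_command(rmd_path):
    """Build (but do not run) an Rscript call to rmarkdown::render()."""
    return ["Rscript", "-e", f'rmarkdown::render("{Path(rmd_path).as_posix()}")']
```

A workflow rule could call `write_report_source()` with the sample ID and then pass the result of `render_command()` to `subprocess.run()`, assuming R and the rmarkdown package are available in the container.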
Note the root level .gitignore and the .gitignore in the workflows/data/SRR606249_subset10_1_reads_finished directory.
This part of the root-level .gitignore allows you to ignore everything in the data directory except the SRR606249_subset10_1_reads_finished directory, where the example dataset was run and where the final report goes.
# Ignore data/ dirs
workflows/read_filtering/data
workflows/taxonomic_classification/data
# Don't blacklist workflows/data recursively
!workflows/data/
# Ignore everything under workflows/data
workflows/data/*
# Except this particular directory and everything under it
!workflows/data/SRR606249_subset10_1_reads_finished/
!workflows/data/SRR606249_subset10_1_reads_finished/*
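To see how the ordering of these rules plays out, here is a simplified Python model of gitignore evaluation: the last matching rule wins, and nothing under an ignored directory can be re-included. It handles only anchored patterns like the ones above (no `**`, no nested .gitignore files), so treat it as a sketch rather than a replacement for `git check-ignore`. The function names and the example paths in the usage note are hypothetical.

```python
from fnmatch import fnmatchcase

# Rules copied from the root-level .gitignore excerpt above (comments dropped).
RULES = [
    "workflows/read_filtering/data",
    "workflows/taxonomic_classification/data",
    "!workflows/data/",
    "workflows/data/*",
    "!workflows/data/SRR606249_subset10_1_reads_finished/",
    "!workflows/data/SRR606249_subset10_1_reads_finished/*",
]


def _verdict(parts, is_dir, rules):
    """Return True/False from the last rule matching `parts`, or None if none match."""
    verdict = None
    for rule in rules:
        negated = rule.startswith("!")
        pattern = rule[1:] if negated else rule
        dir_only = pattern.endswith("/")
        pattern_parts = pattern.rstrip("/").split("/")
        if dir_only and not is_dir:
            continue
        if len(pattern_parts) != len(parts):
            continue
        # "*" is matched per path component, so it never crosses a "/".
        if all(fnmatchcase(p, pp) for p, pp in zip(parts, pattern_parts)):
            verdict = not negated
    return verdict


def is_ignored_by_root_rules(file_path, rules=RULES):
    """Walk from the top directory down; stop at the first ignored level."""
    parts = file_path.split("/")
    for depth in range(1, len(parts) + 1):
        is_dir = depth < len(parts)
        if _verdict(parts[:depth], is_dir, rules):
            return True
    return False
```

Under this model, `workflows/data/other_sample/reads.fq` is ignored because its parent directory matches `workflows/data/*`, while `workflows/data/SRR606249_subset10_1_reads_finished/0-summary-report.html` is kept because the later negated rules re-include that directory and its contents.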
The .gitignore in the workflows/data/SRR606249_subset10_1_reads_finished directory ensures you don't commit gigabytes of data, committing only the files needed for report generation and other files of negligible size.
A final report, maybe just an HTML document with links to the various outputs and a brief explanation of what they are, would be very helpful.