Closed hudja closed 1 year ago
I have many different snakemake workflows, but I am using the same profile for all of them, so they all are logging into the same folder.
Could you please help me better understand how you are organizing your projects? The logs are written to the current working directory. Are you executing all of these Snakefiles from the same directory? How do A.snakefile and B.snakefile relate to each other? Are they subsequent steps or are they processing completely different data?
Hi,
I have a master project folder with multiple sub-folders: QCed data, phased data, imputed data, etc. This is biobank-level data and each step takes up to a week to complete, so I prefer to run the different snakefiles manually rather than put everything in a single pipeline. I start by QCing my data with, say, qc.snakefile; after it completes I continue with phase.snakefile; after that completes I proceed with impute.snakefile, and so on. Each of these steps generates the input for the next step. My snakefiles do not invoke each other upon completion; instead, I check that everything is fine and start the next step manually. The data is also processed for multiple genome builds and for individual chromosomes separately, which adds more log files. Each of the snakefiles has 10-15 steps, so currently my logging folder has 50-100 rule sub-folders. It would be really helpful to log the rules from each snakefile into their own log/ folder, but since the majority of my rules use the same resources, I use a single default profile for all of them, and therefore all logs go into the same log/ folder. My config.yaml is located in the same log folder (log/config.yaml).
The perfect solution would be something like:
log/qc/rule1, ..., log/qc/rule10
log/phase/rule1, ..., log/phase/rule10
log/impute/rule1, ..., log/impute/rule10
log/config.yaml
So that I can run it with snakemake -s snakefile --profile log/config.yaml.
Of course, I can use different profile folders and config.yaml files for each snakefile, but the whole pipeline is made up of as many as 10 different snakefiles, so maintaining 10 different profile folders is less convenient.
Alternatively, I can rename all the rules in all my snakefiles and add the respective prefixes to them, e.g. rule qc_rule1, rule phase_rule1, etc., but I thought there might be a simpler workaround, so that I do not need to change my code too much: an additional variable that can be used, whether defined in the profile, in the snakefile itself, or as a --config log_folder=X option. I hope I did not confuse you too much!
Thank you!
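The --config log_folder=X idea from the message above can be sketched in plain Python. Values passed with --config end up in the global config dictionary inside a Snakefile; here an ordinary dict stands in for it. The helper name resolve_log_folder and the fallback sub-folder "default" are hypothetical and chosen only for illustration.

```python
from pathlib import PurePosixPath


def resolve_log_folder(config, root="log", default="default"):
    """Return the log sub-folder for the current run.

    `config` stands in for Snakemake's global `config` dict
    (populated by --config key=value). The key `log_folder` is the
    name proposed in the thread; `default` is a hypothetical fallback
    used when the option was not given on the command line.
    """
    sub = str(config.get("log_folder", default))
    return str(PurePosixPath(root) / sub)


# With --config log_folder=qc
print(resolve_log_folder({"log_folder": "qc"}))
# When the option is forgotten, the fallback is used
print(resolve_log_folder({}))
```

The fallback branch illustrates the weakness of this approach: if the option is omitted, logs from every snakefile end up in the same sub-folder again.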
Got it. Check out the example I created, shared-logs
$ snakemake --profile shared/ --snakefile qc.snakefile
$ snakemake --profile shared/ --snakefile impute.snakefile
$ snakemake --profile shared/ --snakefile phase.snakefile
$ ls logs/
impute.snakefile phase.snakefile qc.snakefile
If you really wanted to remove the file extension .snakefile, you could pipe it to sed, tr, or some other command-line tool.
It may be possible to use --config or --envvars, but I didn't investigate these options since they would require you to remember to include the option every time you run snakemake, which is easy to forget. With workflow.main_snakefile, the name of the Snakefile is always included in the path for the log files.
@hudja Did you get a chance to try out my shared-logs example? Does this work well for your use case? Please let me know if you have any problems implementing it.
Yes, sorry for not answering. I thought I could not reply on a closed issue. It is working perfectly, as expected! Thank you very much!
Hi,
Thank you for the great documentation, it is really helpful! I have one question.
I have many different snakemake workflows, but I am using the same profile for all of them, so they all log into the same folder. Is it possible to add a sub-folder to the log folder based on the snakefile name, so that it is easier to see which rule folder belongs to which snakefile? Something like:
--output=logs/{SNAKEFILE_NAME}/{rule}/{rule}-{wildcards}-%j.out
so that my logs folder will be organized like: logs/A.snakefile/rule1/ logs/B.snakefile/rule1/
Thanks!
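To make the requested layout concrete, here is a small sketch that expands the template from the question for a given snakefile, rule, and set of wildcards. The function slurm_output_path and the example wildcard chrom are hypothetical, and the "key=value" joined by commas rendering of wildcards is an assumption about how they appear in the file name. The %j placeholder is left untouched because Slurm replaces it with the job ID.

```python
def slurm_output_path(snakefile_name, rule, wildcards):
    """Expand logs/{SNAKEFILE_NAME}/{rule}/{rule}-{wildcards}-%j.out.

    `wildcards` is a plain dict here; rendering it as comma-separated
    key=value pairs is an assumption for illustration. `%j` is kept
    literally so that Slurm can substitute the job ID later.
    """
    template = "logs/{snakefile}/{rule}/{rule}-{wildcards}-%j.out"
    rendered = ",".join(f"{key}={value}" for key, value in wildcards.items())
    return template.format(snakefile=snakefile_name, rule=rule, wildcards=rendered)


# The same rule name in two snakefiles lands in two different folders
print(slurm_output_path("A.snakefile", "rule1", {"chrom": "21"}))
print(slurm_output_path("B.snakefile", "rule1", {"chrom": "21"}))
```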