Hi WJ,
Glad to hear it works :)
The CLI for `bgcflow run` is just a simple wrapper of the snakemake CLI, so you can always use `snakemake` directly with any parameter available in the snakemake documentation.
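As a minimal sketch of that equivalence, the wrapper essentially assembles a plain snakemake command line. This is an illustration only: whether the wrapper adds `--use-conda` is my assumption, while `--snakefile`, `--cores`, and `--dry-run` are standard snakemake options.

```shell
# Sketch: roughly what `bgcflow run -c 8 -n` translates to underneath.
# Whether the real wrapper passes --use-conda is an assumption; the other
# flags are standard snakemake CLI options.
bgcflow_run_sketch() {
    local cores="${1:-8}" snakefile="${2:-workflow/Snakefile}"
    # Print the command instead of running it; swap `echo` for the real call.
    echo snakemake --snakefile "$snakefile" --use-conda --cores "$cores" --dry-run
}

bgcflow_run_sketch 8
```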
If you prefer to use the `bgcflow_wrapper` CLI, you can check which parameters are available using the help command, such as:
```
$ bgcflow run --help
Usage: bgcflow run [OPTIONS]

  A snakemake CLI wrapper to run BGCFlow. Automatically run panoptes.

Options:
  -d, --bgcflow_dir TEXT  Location of BGCFlow directory. (DEFAULT: Current
                          working directory.)
  --workflow TEXT         Select which snakefile to run. Available
                          subworkflows: {BGC | Database | Report | Metabase |
                          lsagbc | ppanggolin}. (DEFAULT: workflow/Snakefile)
  --monitor-off           Turn off Panoptes monitoring workflow. (DEFAULT:
                          False)
  --wms-monitor TEXT      Panoptes address. (DEFAULT: http://127.0.0.1:5000)
  -c, --cores INTEGER     Use at most N CPU cores/jobs in parallel. (DEFAULT:
                          8)
  -n, --dryrun            Test run.
  --unlock                Remove a lock on the snakemake working directory.
  --until TEXT            Runs the pipeline until it reaches the specified
                          rules or files.
  --profile TEXT          Path to a directory containing snakemake profile.
  -t, --touch             Touch output files (mark them up to date without
                          really changing them).
  -h, --help              Show this message and exit.
```
Note that the current `bgcflow_wrapper` package is using an older snakemake version, and we are currently working on the update.
> Also, there is another question. After `bgcflow build report`, the code `# use conda or mamba` / `mamba env create -f bgcflow_notes.yaml # or r_notebook.yaml` doesn't work
I hope this means the command `bgcflow build report` works and you want to manually edit the notebook templates?

By snakemake convention, the environment files can be found in `workflow/envs/<environment name>.yaml`. Therefore, you can create the conda environment using `mamba env create -f workflow/envs/bgcflow_notes.yaml`.

You can also reuse the conda environments already built by snakemake by checking the snakemake log; they can be found in the `.snakemake/conda` folder.
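For example, you can list those prebuilt environments and activate one by its path (the hash-named directory below is illustrative; the name will differ on your machine):

```shell
# The environments snakemake built live under .snakemake/conda, named by hash.
# List them (the directory only exists after snakemake has built at least one env):
ls -d .snakemake/conda/*/ 2>/dev/null || true

# Activate one directly by its path (hash below is illustrative; yours will differ):
# conda activate .snakemake/conda/b39a961a250810ddef5ab2698703b6ab_
```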
PS: If you find any misleading or wrong instructions in the WIKI, please do let us know so we can correct them.
Hi,
Thanks very much for the help.
I got another problem when the workflow installs the env for roary, shown below:

```
Output: Channels:
LibMambaUnsatisfiableError: Encountered problems while solving:
```

I can use roary in an individual conda env, but when I export the yaml to replace the yaml in the workflow, it generates new errors. So may I ask how to modify the yaml to solve this?
Thanks for the help!
Best Regards, Jay
Hi Jay,
Unfortunately, I cannot reproduce the error for creating the roary environment, and the test seems to work fine.

From the message `package r-ggplot2-3.3.6-r42h6115d3f_0 is excluded by strict repo priority`, it seems that you have your conda channel priority set to `strict`.

Can you check your conda channel priority and set it to `flexible` to see if it solves the problem?
Detailed instructions are available in the wiki.
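For reference, setting it via `conda config --set channel_priority flexible` simply writes this line into your `~/.condarc` (assuming the default config location):

```yaml
# ~/.condarc
channel_priority: flexible
```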
After setting the priority to flexible, you should see this warning message nagging you about it while running the snakemake jobs, which is fine:

```
Assuming unrestricted shared filesystem usage.
Building DAG of jobs...
Your conda installation is not configured to use strict channel priorities. This is however crucial for having robust and correct environments (for details, see https://conda-forge.org/docs/user/tipsandtricks.html). Please consider to configure strict priorities by executing 'conda config --set channel_priority strict'.
Creating conda environment workflow/envs/roary.yaml...
Downloading and installing remote packages.
Environment for /datadrive_cemist/test/workflow/rules/../envs/roary.yaml created (location: .snakemake/conda/b39a961a250810ddef5ab2698703b6ab_)
Using shell: /usr/bin/bash
Provided cores: 4
Rules claiming more threads will be scaled down.
Singularity containers: ignored
```
We will probably replace roary with other, newer pangenome builders. Hopefully there will be support for using singularity containers in the future for better reproducibility.
Hi, thanks for reaching out and using BGCFlow :)
> I just wish to know how to use a custom input directory
To use a custom input directory, you will need to set up two things:

1. the project configuration (`project_config.yaml`)
2. the sample table (`samples.csv`)

Please find the example here: config.zip

The attachment shows how the project structure looks, what the project configuration (`project_config.yaml`) looks like, and finally how to add the custom sample in `samples.csv`, together with the message you should get once it runs.
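As a rough illustration only (the column names here are assumptions based on my recollection of the BGCFlow sample template; the example in config.zip is authoritative), a custom entry in `samples.csv` might look like:

```csv
genome_id,source,organism,genus,species,strain
my_genome_1,custom,Streptomyces sp. XYZ,Streptomyces,sp.,XYZ
```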
> Why can this workflow still run after I delete all the input files?
I would assume that you didn't change anything in the example template configuration and only deleted the default input files located in `data/raw`. If this is the case, then the project can still run because all the samples are being fetched online from NCBI. You can check whether this is the case from the `samples.csv`.
Thank you again for the question, we will be sure to add this to the FAQ section and improve the WIKI.