Open yjk-bertrand opened 3 months ago
Hi Yann,
Thanks for your interest. miniprot-boundary-scorer is already available as part of the GALBA container image (see https://github.com/Gaius-Augustus/GALBA?tab=readme-ov-file#singularity-image). You can invoke it with:
singularity exec galba.sif miniprot_boundary_scorer
Would using it like that work for you?
If not, I'll look into making a bioconda recipe.
Best, Tomas
Hi Tomáš, Thanks for your prompt answer. Since you are offering, I will gladly take you up on making a bioconda recipe. In the context of our Snakemake pipeline, it is not convenient to pull that Docker image everywhere the pipeline needs to run. Cheers, Yann
It is super easy to pull and run containers in snakemake. I do it all the time. Writing from my phone, I can send you an example rule from my computer next week. Really easy!
In that case, we can give it a try. Looking forward to your tips. Thanks!
This is a random rule from one of my snakefiles. You see how I specify the container source under "singularity". Everything executed in the shell will be run with that container. miniprot-boundary-scorer resides in the GALBA container, not in the braker container as shown here. (This particular rule requires snakemake-executor-plugin-slurm because I execute it via SLURM, but you may not need that part.)
rule run_sam_to_bam:
    input:
        fastqdump_lst = "data/checkpoints_dataprep/{taxon}_B06_rnaseq_for_fastqdump.lst",
        remove_bad_done = "data/checkpoints_dataprep/{taxon}_B06_remove_bad_libraries.done"
    output:
        done = "data/checkpoints_dataprep/{taxon}_B06_sam2bam.done"
    params:
        taxon=lambda wildcards: wildcards.taxon,
        threads = config['SLURM_ARGS']['cpus_per_task']
    wildcard_constraints:
        taxon="[^_]+"
    singularity:
        "docker://teambraker/braker3:latest"
    threads: int(config['SLURM_ARGS']['cpus_per_task'])
    resources:
        mem_mb=int(config['SLURM_ARGS']['mem_of_node']),
        runtime=int(config['SLURM_ARGS']['max_runtime'])
    shell:
        """
        export APPTAINER_BIND="${{PWD}}:${{PWD}}"; \
        log="data/checkpoints_dataprep/{params.taxon}_B06_sam2bam.log"
        echo "" > $log
        readarray -t lines < <(cat {input.fastqdump_lst})
        for line in "${{lines[@]}}"; do
            # Replace the first space with an underscore in the species name part of the line
            modified_line=$(echo "$line" | sed 's/\\([^\\t]*\\) /\\1_/')
            species=$(echo "$modified_line" | cut -f1)
            sra_ids=$(echo "$modified_line" | cut -f2)
            IFS=',' read -r -a sra_array <<< "$sra_ids"
            for sra_id in "${{sra_array[@]}}"; do
                if [ ! -f "data/species/$species/hisat2/${{sra_id}}.bam" ] && [ -f data/species/$species/hisat2/${{sra_id}}.sam ]; then
                    echo "samtools view --threads {params.threads} -bS data/species/$species/hisat2/${{sra_id}}.sam > data/species/$species/hisat2/${{sra_id}}.bam" &>> $log
                    samtools view --threads {params.threads} -bS data/species/$species/hisat2/${{sra_id}}.sam > data/species/$species/hisat2/${{sra_id}}.bam 2>> $log
                else
                    echo "data/species/$species/hisat2/${{sra_id}}.bam already exists" &>> $log
                fi
            done
        done
        touch {output.done}
        """
Importantly, Singularity needs bind mounts to access your data. I configure the bindings like this:
Create a file: ~/profile/apptainer/config.v8+.yaml
Add the following content to the file (adapt to your own working directory):
use-singularity: True
singularity-args: "\"--bind /home/xy/git/braker-snake:/home/xy/git/braker-snake --bind /home/xy/ncbi:/home/xy/ncbi\""
You have to adapt to your own directories, of course.
To run the snakefile in the end, include the option --use-apptainer.
Additional information: when you run a Snakemake workflow like this, pulling the container takes time on the first run. Subsequent runs reuse the already pulled containers.
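The bind arguments in the profile above follow a fixed pattern: each host directory is mounted at the same path inside the container. As a minimal sketch, the string could be generated from a list of directories as shown below. The helper name `bind_args` is hypothetical and not part of Snakemake or Apptainer, and requiring absolute paths is a choice made for this example only.

```python
from pathlib import PurePosixPath


def bind_args(directories):
    """Build '--bind src:dst' arguments that mount each directory at the same path.

    Hypothetical helper for illustration only: it formats a string and does
    not check that the directories exist on the host.
    """
    parts = []
    for directory in directories:
        path = PurePosixPath(directory)
        # Design choice of this sketch: reject relative paths so the mount
        # point inside the container is unambiguous.
        if not path.is_absolute():
            raise ValueError(f"bind path must be absolute: {directory}")
        parts.append(f"--bind {path}:{path}")
    return " ".join(parts)


if __name__ == "__main__":
    # Example directories taken from the profile shown above.
    print(bind_args(["/home/xy/git/braker-snake", "/home/xy/ncbi"]))
```

The resulting string is what goes inside the escaped quotes of the `singularity-args` entry in the profile.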
Hi @KatharinaHoff and @tomasbruna,
I am Simón and I work with Yann in the development of the pipeline.
I appreciate the tips about singularity usage. They may come in handy in the future.
Our pipeline currently relies only on pip (and Python packages) and conda/mamba as dependencies. I would be reluctant to add Singularity as an extra dependency if we can avoid it. I prefer to put a bit of extra effort on the developers' shoulders to make things easier for the user.
Making a bioconda recipe is just as easy. I am willing to help to set it up if you prefer. From what I can see from the guidelines (which complement the instructions of the previous link), the really important thing we are missing is a stable URL (i.e. a tarball).
@tomasbruna, could you please make a GitHub release of miniprot-boundary-scorer?
The second thing that is not clear to me is the dependencies. Is it just make and a C++ compatible compiler? Or are there other libraries needed that I missed?
Thank you for the help.
Hi @sivico26,
I've added the release. I've removed the test folder from the release tarball, so it's pretty compact.
If you could set the recipe with that, it would be great.
> Is it just make and the C++ compatible compiler?
Correct, there are no other dependencies.
Happy to inform you that the PR I made to bioconda for the package was merged earlier today. Hence, I think the package for miniprot-boundary-scorer will be available soon in bioconda.
Very cool, thanks!
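With both routes discussed in this thread (a conda-installed binary and the GALBA container), a pipeline could pick whichever is available at run time. The following is a minimal sketch under stated assumptions: the function name `scorer_command` is hypothetical, the image path `galba.sif` is the one used in the invocation earlier in the thread, and it is assumed (not confirmed here) that the bioconda package installs an executable with the same name as in the container, `miniprot_boundary_scorer`.

```python
import shutil


def scorer_command(sif_image="galba.sif", binary="miniprot_boundary_scorer"):
    """Return the argv prefix for running the scorer.

    Prefers a binary found on PATH (e.g. from an activated conda
    environment) and otherwise falls back to
    `singularity exec <image> <binary>`, the invocation suggested earlier
    in this thread. The default image path is an assumption; adapt it to
    wherever the GALBA image was saved.
    """
    found = shutil.which(binary)
    if found is not None:
        return [found]
    return ["singularity", "exec", sif_image, binary]


if __name__ == "__main__":
    # Tool-specific arguments would be appended to this list by the caller.
    print(scorer_command())
```

The returned list can be extended with the tool's own arguments and passed to `subprocess.run`, so the rest of the pipeline does not need to know which installation route is in use.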
Hello, Thank you for providing miniprot-boundary-scorer to the community, the tool is really nice. We wish to include it in our Snakemake pipeline. Having to compile the software means that we would need to create a container, which is not part of our initial plan. Would it be possible to make a bioconda recipe? Cheers, Yann