Open hgingras opened 3 months ago
Hi, you can run the command as nextflow run epi2me-labs/wf-transcriptomes instead of nextflow run main.nf to ensure it runs with the correct version of the container. Let me know if you still get the same errors.
Hi sarahjeeeze,
When running : nextflow run epi2me-labs/wf-transcriptomes --help
I get this error:
There is insufficient memory for the Java Runtime Environment to continue. Native memory allocation (malloc) failed to allocate 24 bytes for AllocateHeap
Our environment is an HPC system where memory is limited on the login node. Also, we only have internet access on the login node, so I cannot fetch the workflow from a compute node.
So I am limited to downloading the workflow this way:
wget https://github.com/epi2me-labs/wf-transcriptomes/archive/refs/tags/v1.1.1.tar.gz
tar -xvf v1.1.1.tar.gz
In this version (the last one), in the nextflow.config there is this specification: container_sha = "shae7c9f184996a384e99be68e790f0612f0c732867"
This is the image I loaded doing so:
module load StdEnv/2023 apptainer/1.2.4 nextflow/23.10.0
mkdir -p /scratch/$USER/apptainer/{cache,tmp}
export APPTAINER_CACHEDIR="/scratch/$USER/apptainer/cache"
export APPTAINER_TMPDIR="/scratch/$USER/apptainer/tmp"
apptainer pull docker://ontresearch/wf-transcriptomes:shae7c9f184996a384e99be68e790f0612f0c732867
As mentioned in my previous message, when I look inside this .sif image with the commands below, I see that the fastcat version is an old one that does not have the --histograms option that is specified in the fastcat process in the lib/ingress.nf file.
apptainer run wf-transcriptomes_shae7c9f184996a384e99be68e790f0612f0c732867.sif
Apptainer> pwd
/home/epi2melabs/conda/bin
Apptainer> fastcat -V
0.10.2
I got the same result with the image wf-transcriptomes_latest.sif.
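The interactive check above can also be scripted. The following is a minimal sketch, not part of the workflow: the helper names version_ge and check_fastcat are hypothetical, it assumes GNU sort (for the -V version ordering) and that fastcat -V prints only the version number, as in the session above.

```shell
# version_ge A B: succeeds when version A >= version B.
# Assumption: GNU coreutils `sort -V` is available for version ordering.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_fastcat SIF MIN: report whether the fastcat inside image SIF
# is at least version MIN (hypothetical helper, for illustration only).
check_fastcat() {
    found="$(apptainer exec "$1" fastcat -V)" || return 2
    if version_ge "$found" "$2"; then
        echo "fastcat $found is at least $2"
    else
        echo "fastcat $found is older than $2"
        return 1
    fi
}

# Usage:
#   check_fastcat wf-transcriptomes_latest.sif 0.16.0
```

Running the same check against each image the workflow uses shows which one actually provides a fastcat with --histograms.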
I tried to run an older version, wf-transcriptomes-1.0.0, where lib/ingress.nf does not use the --histograms option, but then I got "csvtk: command not found". I do not see csvtk in the conda environment of the image.
Could you have a look on your side at the version of fastcat that is available in the last .sif image that you provide?
It should be upgraded to 0.16.0.
Could you also add csvtk in the conda environment?
Best regards,
Helene
Hi,
I ran into a similar issue where fastcat did not have the --histograms option. I also copied the git repo and ran the main.nf file. I use docker instead of apptainer, and I run the pipeline on a local LSF cluster instead of slurm.
For running nextflow workflows I create a profile that can utilize the LSF cluster. I set the default docker image to ontresearch/wf-transcriptomes:${params.wf.container_sha} and then ran into the --histograms issue you faced.
I was able to solve this problem by utilizing the labels for each step.
I added this to the nextflow.config:
...
profiles {
    // the "standard" profile is used implicitely by nextflow
    // if no other profile is given on the CLI
    compute1_lsf {
        process.executor = 'lsf'
        process.queue = 'general'
        process {
            withLabel:isoforms {
                clusterOptions = "-G compute-mylab -a 'docker(ontresearch/wf-transcriptomes:${params.wf.container_sha})'"
            }
            withLabel:wf_common {
                clusterOptions = "-G compute-mylab -a 'docker(ontresearch/wf-common:${params.wf.common_sha})'"
            }
        }
    }
...
Then I added -profile compute1_lsf to the nextflow run command.
Not sure how to do that for your slurm cluster or utilizing apptainers. Just wanted to hopefully offer a solution!
Thanks for sharing. I ended up using local mode, setting up the requirements in a Python virtual environment and installing the other modules myself. The only module I could not set up was jaffal. I wish I had received a reply explaining more about the docker image and the versions of the different modules. Have a good one!
@apaul7, Thank you for sharing your case. Could you tell me what's the difference between default and your profile? I could not find any particular changes between your profile and default configuration.
Hi,
I've added the git diff from 999fb4e using git diff nextflow.config:
index 4eb4c73..8e5874b 100644
--- a/nextflow.config
+++ b/nextflow.config
@@ -140,6 +140,18 @@ process {
 profiles {
     // the "standard" profile is used implicitely by nextflow
     // if no other profile is given on the CLI
+    compute1_lsf {
+        process.executor = 'lsf'
+        process.queue = 'general'
+        process {
+            withLabel:isoforms {
+                clusterOptions = "-G compute-mylab -a 'docker(ontresearch/wf-transcriptomes:${params.wf.container_sha})'"
+            }
+            withLabel:wf_common {
+                clusterOptions = "-G compute-mylab -a 'docker(ontresearch/wf-common:${params.wf.common_sha})'"
+            }
+        }
+    }
     standard {
         docker {
             enabled = true
When submitting jobs to my cluster via bsub, you need to provide a docker image using the application (-a) option. This compute1_lsf profile allows nextflow to use a different docker image depending on the label of the individual step.
Hope this helps! -Alex
@hgingras
When you were doing this:
In this version (the last one), in the nextflow.config there is this specification: container_sha = "shae7c9f184996a384e99be68e790f0612f0c732867"
This is the image I loaded doing so:
you would have needed to pull another container image as well. The workflow uses two images (see nextflow.config); it is the wf-common image that is used to run the steps involving fastcat (I'm not entirely sure why fastcat is installed in the wf-transcriptomes image as well, it might be historical).
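As a rough sketch of pulling both images on a node with internet access: the helpers below read the two tags from nextflow.config and print the matching docker:// URIs. The function names config_value and image_uris are hypothetical, and the sketch assumes that common_sha is written in nextflow.config in the same key = "value" form as the container_sha line quoted above. It only pulls the images; placing them where Nextflow expects to find them is not covered.

```shell
# config_value KEY FILE: print the quoted value of the first `KEY = "value"` line.
# Assumption: both SHAs appear in nextflow.config in this simple form.
config_value() {
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*\"\([^\"]*\)\".*/\1/p" "$2" | head -n 1
}

# image_uris FILE: print docker:// URIs for the two images the workflow uses.
image_uris() {
    printf 'docker://ontresearch/wf-transcriptomes:%s\n' "$(config_value container_sha "$1")"
    printf 'docker://ontresearch/wf-common:%s\n' "$(config_value common_sha "$1")"
}

# Usage (on a node with internet access):
#   image_uris nextflow.config | while read -r uri; do apptainer pull "$uri"; done
```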
Operating System
Other Linux (please specify below)
Other Linux
NAME="Rocky Linux"
Workflow Version
v1.1.1
Workflow Execution
Other (please describe)
EPI2ME Version
No response
CLI command run
#!/bin/bash
#SBATCH --account=def-user
#SBATCH --cpus-per-task=16
#SBATCH --mem=32G
#SBATCH --time=0-01:00
module load StdEnv/2023 apptainer/1.2.4 nextflow/23.10.0
export NXF_SINGULARITY_CACHEDIR="/scratch/$USER/apptainer/cache"
export APPTAINER_TMPDIR="/scratch/$USER/apptainer/tmp"
export APPTAINER_BIND="/lustre05,/lustre06,/lustre07,/scratch,/project"
nextflow run main.nf \
    --fastq /home/helene/scratch/Ticket/wf-transcriptomes/wf-transcriptomes-1.1.1-no-1/differential_expression/differential_expression_fastq \
    --de_analysis --ref_genome /home/helene/scratch/Ticket/wf-transcriptomes/wf-transcriptomes-1.1.1-no-1/differential_expression/hg38_chr20.fa \
    --transcriptome-source reference-guided \
    --ref_annotation /home/helene/scratch/Ticket/wf-transcriptomes/wf-transcriptomes-1.1.1-no-1/differential_expression/gencode.v22.annotation.chr20.gtf \
    --direct_rna --minimap2_index_opts '-k 15' --sample_sheet /home/helene/scratch/Ticket/wf-transcriptomes/wf-transcriptomes-1.1.1-no-1/differential_expression/sample_sheet.csv \
    --jaffal_refBase /home/helene/scratch/Ticket/wf-transcriptomes/wf-transcriptomes-1.1.1-no-1/differential_expression/chr20/ --jaffal_genome hg38_chr20 --jaffal_annotation genCode22 \
    --out_dir Test-7 \
    -profile singularity
Workflow Execution - CLI Execution Profile
singularity
What happened?
I have 2 errors to report.
1st error: fastcat: unrecognized option '--histograms' see output log
Fastcat v0.16.0 has the --histograms option. In the wf-transcriptomes_latest.sif image that I am using when looking for the fastcat version installed I get version 0.10.2:
apptainer run wf-transcriptomes_latest.sif
Apptainer> pwd
/home/epi2melabs/conda/bin
Apptainer> fastcat -V
0.10.2
With the "fastcat --help" command I do not see the '--histograms' option, nor do I see it when I install fastcat version 0.10.2 from source. It is present in version 0.16.0 at least.
Is it possible to update the image with version 0.16.0 for fastcat?
The error is in lib/ingress.nf file:
process fastcat {
    label "ingress"
    label "wf_common"
    cpus 3
    memory "2 GB"
    input:
        tuple val(meta), path("input")
        val extra_args
    output:
        tuple val(meta), path("seqs.fastq.gz"), path("fastcat_stats")
    script:
        String out = "seqs.fastq.gz"
        String fastcat_stats_outdir = "fastcat_stats"
        """
        mkdir $fastcat_stats_outdir
        fastcat \
            -s ${meta["alias"]} \
            -r >(bgzip -c > $fastcat_stats_outdir/per-read-stats.tsv.gz) \
            -f $fastcat_stats_outdir/per-file-stats.tsv \
            --histograms histograms \
            $extra_args \
            input \
            | bgzip > $out
        mv histograms/* $fastcat_stats_outdir
        # extract the run IDs from the per-read stats
2nd error: Not in output log.
This error happened when I removed the --histograms option from the lib/ingress.nf file to see what was going on: csvtk: command not found
In the image wf-transcriptomes_latest.sif:
apptainer run wf-transcriptomes_latest.sif
Apptainer> pwd
/home/epi2melabs/conda/bin
Apptainer> ls
I do not see csvtk in the list, so it does not appear to be installed. https://github.com/shenwei356/csvtk
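A non-interactive variant of this check could look like the sketch below. The helper name tools_missing_in is hypothetical; it assumes the image provides sh and that the tools of interest would be on PATH inside the container.

```shell
# tools_missing_in SIF NAME...: print each NAME that cannot be found on PATH
# inside the image SIF (hypothetical helper, for illustration only).
tools_missing_in() {
    sif="$1"
    shift
    for tool in "$@"; do
        # `command -v` exits non-zero when the tool is not found.
        apptainer exec "$sif" sh -c 'command -v "$1"' sh "$tool" >/dev/null 2>&1 \
            || printf '%s\n' "$tool"
    done
}

# Usage:
#   tools_missing_in wf-transcriptomes_latest.sif fastcat csvtk bgzip
```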
Thanks for your help.
Relevant log output
Application activity log entry
No response
Were you able to successfully run the latest version of the workflow with the demo data?
no
Other demo data information