epi2me-labs / wf-single-cell

Error executing process > 'pipeline:process_bams:combine_bams_and_tags (1)' #81

Closed ktpolanski closed 6 months ago

ktpolanski commented 8 months ago

Operating System

Other Linux (please specify below)

Other Linux

No response

Workflow Version

v1.1.0

Workflow Execution

Command line

EPI2ME Version

No response

CLI command run

~/nextflow-23.12.0-edge-all run epi2me-labs/wf-single-cell \
    --fastq fastq/ \
    --kit_name multiome \
    --kit_version v1 \
    --expected_cells 5000 \
    --ref_genome_dir /home/ubuntu/cellranger/GRCh38-2020-A/ \
    --sample $SAMPLE \
    -c openstack.cfg \
    --max_threads 20 \
    -profile standard \
    -resume

Workflow Execution - CLI Execution Profile

standard (default)

What happened?

I am rerunning a sample that has previously worked fine on older versions of the workflow, most recently -r prerelease under v1.0.3 finishing on March 7th. I am encountering an error which I have previously not seen. I checked the git blame and it seems the part of the workflow that is causing the explosion was recently modified.

It seems the samtools part of the command runs fine. There is a tags folder with 20 symlinked TSVs, each 452MB in size. There is a chr_tags folder but it's empty.

Relevant log output

ERROR ~ Error executing process > 'pipeline:process_bams:combine_bams_and_tags (1)'

Caused by:
  Process `pipeline:process_bams:combine_bams_and_tags (1)` terminated with an error exit status (255)

Command executed:

  samtools merge -@ 7 --write-index -o "PAQ62150.tagged.sorted.bam##idx##PAQ62150.tagged.sorted.bam.bai" bams/*.bam

  mkdir chr_tags
  # merge the tags TSVs, keep header from first
  csvtk concat -tT tags/*         | csvtk split -tl -f chr -o chr_tags/
  # Strip appended source filename ("stdin-"") from the split TSVs
  for file in chr_tags/*; do mv "${file}" "${file//stdin-//}"; done

Command exit status:
  255

Command output:
  (empty)

Command error:
  WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
  [ERRO] xopen: no content

Application activity log entry

No response

Were you able to successfully run the latest version of the workflow with the demo data?

yes

Other demo data information

The demo data does clear the step.

ddiez commented 8 months ago

I got the same error, with the same exit status (255) and the same command error ([ERRO] xopen: no content), and the chr_tags folder is also empty. I tried running the csvtk commands manually on the existing files and they produced the expected output in chr_tags.

ktpolanski commented 8 months ago

I pulled out a few of the many thousands of input FASTQ files, ran the workflow on just those, and the process cleared the step. Seems to be something about input size?

ktpolanski commented 7 months ago

I gave -profile singularity a try and the process just cleared. This is not the same exact compute environment as what I encountered this on, as Singularity just refused to behave there (#87). But hey, progress!

cjw85 commented 7 months ago

We've found there's a bug in the version of csvtk that's being used in the workflow (https://github.com/shenwei356/csvtk/issues/259). We've replaced the use of csvtk in our development branch.
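
For anyone stuck on v1.1.0 who wants to reproduce the concatenate-and-split step by hand without csvtk, here is a minimal sketch using awk. It is not the change made in the development branch. It assumes the files in tags/ are tab-separated, share the same header, and contain a column named chr (the column used by the csvtk split command in the log above); the function name split_tags_by_chr is made up for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of wf-single-cell): split the tag TSVs found
# in directory $1 by their `chr` column, writing one <chr>.tsv per value to $2.
# Each output file gets the header taken from the first input file.
split_tags_by_chr() {
    local in_dir="$1" out_dir="$2"
    mkdir -p "$out_dir"
    awk -F'\t' -v out="$out_dir" '
        FNR == 1 {                      # header line of each input file
            if (NR == 1) {              # first file only: remember header, locate chr
                header = $0
                for (i = 1; i <= NF; i++) if ($i == "chr") col = i
                if (!col) { print "no chr column found" > "/dev/stderr"; exit 1 }
            }
            next                        # never copy a header line as data
        }
        {
            f = out "/" $col ".tsv"
            if (!(f in seen)) { print header > f; seen[f] = 1 }
            print > f                   # awk keeps f open, so this appends
        }
    ' "$in_dir"/*
}
```

Called as `split_tags_by_chr tags chr_tags`, this writes files such as chr_tags/chr1.tsv directly, so there is no "stdin-" prefix to strip afterwards. One output file stays open per distinct chr value, which may matter for awk implementations with a low open-file limit when the reference has many contigs.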

ddiez commented 7 months ago

@cjw85 Thanks. I can confirm that the latest prerelease version solves this issue.

cjw85 commented 7 months ago

We have not made any updates to the code since this issue was reported.

ddiez commented 7 months ago

You mean the "development branch" is not the prerelease one? Well, then for whatever reason the latest version in prerelease did not stop at that point anymore.

cjw85 commented 7 months ago

My apologies, yes, the prerelease branch on GitHub tracks our internal mainline dev branch. It does contain changes to the 'pipeline:process_bams:combine_bams_and_tags' stage of the workflow.

ktpolanski commented 7 months ago

I'm not fully following.

At the time of encountering the issue, I had 20 symlinked TSVs in the input folder, each 452MB in size. So nothing seems like it was empty. I moved to Singularity and somehow the problem went away, despite me not switching to prerelease.
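
A quick way to double-check that claim from inside the task's work directory is sketched below. The helper name find_empty_inputs is made up; it lists every entry in a directory that is zero bytes or a symlink whose target is missing, and prints nothing when all inputs have content.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print each entry in directory $1 that is empty or is a
# symlink pointing at a missing target. No output means every input has content.
find_empty_inputs() {
    local f
    for f in "$1"/*; do
        # Skip the literal pattern left behind when the directory has no entries.
        [ -e "$f" ] || [ -L "$f" ] || continue
        # -s follows symlinks: true only if the target exists and is non-empty.
        [ -s "$f" ] || printf '%s\n' "$f"
    done
}
```

Running `find_empty_inputs tags` and getting no output would be consistent with what I saw: the inputs were fine and the "no content" message came from somewhere else.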

cghmyway commented 7 months ago

~/nextflow-23.12.0-edge-all run epi2me-labs/wf-single-cell \
    --fastq fastq/ \
    --kit_name multiome \
    --kit_version v1 \
    --expected_cells 5000 \
    --ref_genome_dir /home/ubuntu/cellranger/GRCh38-2020-A/ \
    --sample $SAMPLE \
    -c openstack.cfg \
    --max_threads 20 \
    -profile standard \
    -resume

We ran the same command with wf-single-cell v1.1.0 and got the same error, so the problem does not seem to be resolved in that version. Using the '-profile singularity' option instead resulted in a different error.

ktpolanski commented 7 months ago

Try -r prerelease, as per devs above that should circumvent the problematic process.

I still don't get how me switching to singularity helped, but somehow it did.

cjw85 commented 6 months ago

The fix for this issue is now included in v2.0.0.