fritzsedlazeck / Sniffles

Structural variation caller using third generation sequencing

sniffles.worker (3082238): Worker 2 received error: list index out of range #520

Closed bbimber closed 1 week ago

bbimber commented 2 weeks ago

Hello,

I installed the develop branch (sniffles 2.5) and am trying to call/merge 59 PacBio CCS and CLR datasets. I understand the dev branch is unstable, but I was told you are about to release 2.5, which has some enhanced calling of large deletions. If my error is related to being on the unstable branch, I can wait.

I first called each CRAM individually to generate snf files. I then ran sniffles as follows to merge them, where fileList.tsv has one line per snf file:

$SNIFFLES \
    --no-progress \
    --input fileList.tsv \
    --allow-overwrite \
    --vcf $VCF
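
For readers reproducing this setup, a minimal sketch of how a file list like fileList.tsv could be generated, assuming all per-sample snf files sit in one directory and share the `.snf` extension. The function name `write_snf_list` and the directory layout are illustrative assumptions, not part of the original report.

```python
from pathlib import Path


def write_snf_list(snf_dir: str, out_tsv: str) -> int:
    """Write one .snf path per line to out_tsv and return how many were written.

    Assumption: every per-sample file lives directly in snf_dir and ends in .snf.
    """
    # Collect the paths before opening the output so the listing is stable.
    paths = sorted(Path(snf_dir).glob("*.snf"))
    with open(out_tsv, "w") as handle:
        for path in paths:
            handle.write(f"{path}\n")
    return len(paths)
```

Checking that the returned count matches the expected number of samples (59 here) is a cheap sanity check before starting a long merge.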

This gives the following error at the end of the log:

2024-11-09 08:37:28,236 INFO sniffles.main (3082238):
2024-11-09 08:38:04,326 ERROR sniffles.main (3082238): Unhandled error while running sniffles.
Traceback (most recent call last):
  File "/home/users/bimber/.local/bin/sniffles", line 538, in <module>
    Sniffles2_Main(processes)
  File "/home/users/bimber/.local/bin/sniffles", line 471, in Sniffles2_Main
    t.result.emit(vcf_out=vcf_out, snf_out=snf_out, **rkwargs)
    ^^^^^^^^^^^^^
AttributeError: 'ErrorResult' object has no attribute 'emit'

When I looked earlier in the log, I saw this, which seems like it might be the actual problem (although maybe sniffles should die earlier if this happens):

2024-11-09 08:10:40,820 INFO sniffles.worker (3082238): Dispatched task #5509 to worker 3 (1304  tasks left)
2024-11-09 08:10:41,933 ERROR sniffles.worker (3082243): Error in worker process
Traceback (most recent call last):
  File "/home/users/bimber/.local/lib/python3.11/site-packages/sniffles/parallel.py", line 620, in run_worker
    result = task.execute(self)
             ^^^^^^^^^^^^^^^^^^
  File "/home/users/bimber/.local/lib/python3.11/site-packages/sniffles/parallel.py", line 449, in execute
    result.store_calls(calls)
  File "/home/users/bimber/.local/lib/python3.11/site-packages/sniffles/result.py", line 157, in store_calls
    while svcalls[offset].pos < self._highest_position_call:
          ~~~~~~~^^^^^^^^
IndexError: list index out of range
2024-11-09 08:10:41,933 ERROR sniffles.worker (3082238): Worker 2 received error: list index out of range

Is this a known issue? Thanks for any help or ideas. The full log is attached:

snifflesMerge.txt
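
To illustrate the "die earlier" idea, here is a minimal sketch of checking for failed tasks before any result is emitted. The classes `ErrorResult` and `CallResult` and the function `emit_all` are hypothetical stand-ins written for this example; they are not the Sniffles implementation, and the only thing taken from the traceback is that an error result has no `emit` method.

```python
class ErrorResult:
    """Hypothetical stand-in for a result that carries a worker error."""

    def __init__(self, message):
        self.message = message


class CallResult:
    """Hypothetical stand-in for a successful result."""

    def __init__(self, calls):
        self.calls = calls

    def emit(self, out):
        out.extend(self.calls)


def emit_all(results, out):
    """Emit every result, but refuse to start if any task failed."""
    errors = [r.message for r in results if isinstance(r, ErrorResult)]
    if errors:
        # Report the worker's own error instead of a later AttributeError.
        raise RuntimeError(f"{len(errors)} task(s) failed; first error: {errors[0]}")
    for result in results:
        result.emit(out)
```

With a check like this, the run would stop with the original worker message ("list index out of range") rather than the unrelated-looking `AttributeError` at the end of the log.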

hermannromanek commented 2 weeks ago

Hi @bbimber

Thanks for the report and for testing the new version! I'll have a look and try to fix it asap.

Hermann

bbimber commented 2 weeks ago

@hermannromanek, yes you are right in your PR about that check already existing.

I forked your repo and was going to add some debugging. Do you have any suggestion on either checks or additional logging? I can easily re-run this on the problematic dataset: https://github.com/fritzsedlazeck/Sniffles/blob/debb998dc759fc76b2001d2c31c2bfe449e9a3c8/src/sniffles/result.py#L159

bbimber commented 2 weeks ago

Would this be encountered if len(svcalls) == 1? In this case, sorting is irrelevant anyway, right?

hermannromanek commented 2 weeks ago

Yes, I'm also thinking the problem is offset running out of bounds. Can you try running the version in branch https://github.com/fritzsedlazeck/Sniffles/tree/issue520 that I just pushed?

Although I'm still trying to construct test cases to reproduce it, this should fix it.

Thanks, Hermann
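
For context, a minimal sketch of how a loop like the one in the traceback can run out of bounds, and one way to guard it. This is an illustration only, not the change made in the issue520 branch; `Call` and `first_in_order_index` are hypothetical names, and the loop body (incrementing `offset`) is an assumption based on the single line shown in the traceback.

```python
from collections import namedtuple

# Hypothetical stand-in for an SV call; only the position matters here.
Call = namedtuple("Call", ["pos"])


def first_in_order_index(svcalls, highest_position):
    """Return the index of the first call at or beyond highest_position.

    Returns len(svcalls) when every call lies before highest_position,
    which is the case that raises IndexError without the bounds check.
    """
    offset = 0
    # Check the bound first so svcalls[offset] is never evaluated out of range.
    while offset < len(svcalls) and svcalls[offset].pos < highest_position:
        offset += 1
    return offset
```

The `len(svcalls) == 1` case raised above is one instance: if that single call lies before the highest position seen so far, an unguarded loop steps to index 1 and fails.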

bbimber commented 2 weeks ago

@hermannromanek: thank you for the fast fix - that did work.

I noticed one thing: the job left a huge number of files with names like "result-59198-6730-unsorted.part.vcf". Should these be deleted?

Here is the tail of the log, and it seems like sniffles2 finished normally. I don't see the words 'error' or 'exception' anywhere in the log, or anything else that looks like an error:

2024-11-10 11:34:28,926 INFO sniffles.worker (3596699): Worker 9 done (code 0).
2024-11-10 11:34:28,926 INFO sniffles.worker (3596699): Worker 10 done (code 0).
2024-11-10 11:34:28,926 INFO sniffles.worker (3596699): Worker 11 done (code 0).
2024-11-10 11:34:28,926 INFO sniffles.main (3596699): Took 5199.70s.
2024-11-10 11:34:28,926 INFO sniffles.main (3596699): 
2024-11-10 11:36:09,818 INFO sniffles.main (3596699): Wrote 656835 called SVs to ./merge/PacBio.59.sniffles2.vcf (multi-sample, sorted)

hermannromanek commented 2 weeks ago

The merge for big input data sets doesn't yet support sorting, so SVs that are far out of order are written to those extra files. So those contain actual merged variants. We currently run this pipeline with the --no-sort option and sort afterwards using bcftools, something we'll have to add to the release notes.

"Big input dataset" is controlled by the argument --combine-max-inmemory-results, which defaults to 20, so any merge of more than this number of files will exhibit this behaviour.

I will also add a warning explaining this when running with sorting enabled on a number of input files that does not support it.
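
The workaround described above can be sketched as follows. This assumes `sniffles` and `bcftools` are both on the PATH; the helper names `exceeds_inmemory_limit` and `merge_commands` and the output file names are illustrative, and the default of 20 is taken from the comment above. The code only builds the command lines and does not run them.

```python
def exceeds_inmemory_limit(n_inputs: int, max_inmemory: int = 20) -> bool:
    """True when the merge has more inputs than --combine-max-inmemory-results."""
    return n_inputs > max_inmemory


def merge_commands(file_list, merged_vcf, sorted_vcf, n_inputs):
    """Build the command lines for a merge, adding an external sort if needed."""
    sniffles = ["sniffles", "--input", file_list, "--vcf", merged_vcf]
    if not exceeds_inmemory_limit(n_inputs):
        return [sniffles]
    # Large merge: skip the built-in sort and sort afterwards with bcftools.
    sniffles.append("--no-sort")
    bcftools = ["bcftools", "sort", "-o", sorted_vcf, merged_vcf]
    return [sniffles, bcftools]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` in order. For the 59-sample merge in this thread the function returns both commands, since 59 is above the default limit.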

bbimber commented 2 weeks ago

@hermannromanek: ok, I can understand that. Thanks for the investigation. A couple comments:

lfpaulin commented 1 week ago

Hi @bbimber, we will implement sorting for large datasets sometime in the future.

bbimber commented 1 week ago

OK. Honestly, it's really not that big a problem so long as the tool is clear about what it does or does not do. Low priority from our perspective.