kcleal / dysgu

Toolkit for calling structural variants using short or long reads
MIT License

Dysgu filter IndexError: string index out of range #69

Closed: tbenavi1 closed this issue 9 months ago

tbenavi1 commented 9 months ago

Hello, I ran the dysgu filter command and got the following error:

Traceback (most recent call last):
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/bin/dysgu", line 8, in <module>
    sys.exit(cli())
             ^^^^^
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/lib/python3.11/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/lib/python3.11/site-packages/dysgu/main.py", line 467, in filter_normal
    filter_normals.run_filtering(ctx.obj)
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/lib/python3.11/site-packages/dysgu/filter_normals.py", line 907, in run_filtering
    good = process_intra(r, posB, bams, infile, bam_is_paired_end, support_fraction, pad=pad, sample=sample_name, keep_all=keep_all)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/lib/python3.11/site-packages/dysgu/filter_normals.py", line 829, in process_intra
    if cached and matching_soft_clips(r, cached, is_paired_end):
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/lib/python3.11/site-packages/dysgu/filter_normals.py", line 594, in matching_soft_clips
    break_seqs = BreakSeqs(r)
                 ^^^^^^^^^^^^
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/lib/python3.11/site-packages/dysgu/filter_normals.py", line 287, in __init__
    left_clip, right_clip = get_contig_clipped_bases(r.info["CONTIGA"])
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/lib/python3.11/site-packages/dysgu/filter_normals.py", line 264, in get_contig_clipped_bases
    while cont[i].islower():
          ~~~~^^^
IndexError: string index out of range

Any advice? Let me know if you need further information from me. Thanks.

kcleal commented 9 months ago

Hi @tbenavi1, thanks for reporting this. It looks like a simple index error, which should be easy for me to fix. I will take a look today. Would you mind sharing the upstream commands you used for calling and filtering, so I can check whether anything else is involved? Also, could you confirm which version of dysgu you are using? Thanks!
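
For anyone following along, here is a minimal sketch of how this kind of error can arise and how a bounds check avoids it. This is not dysgu's actual code: the function names are hypothetical, and it assumes (based only on the `.islower()` check in the traceback) that clipped bases in a contig are written in lowercase. An unguarded loop runs off the end of the string when the contig is empty or entirely lowercase.

```python
def naive_left_clip(contig: str) -> int:
    """Unguarded loop in the style of the traceback (hypothetical reconstruction)."""
    i = 0
    # Raises IndexError once i reaches len(contig) without meeting an uppercase base.
    while contig[i].islower():
        i += 1
    return i


def count_clipped_bases(contig: str) -> tuple[int, int]:
    """Return (left, right) counts of leading and trailing lowercase bases."""
    n = len(contig)
    left = 0
    # Check the bound before indexing so empty or all-lowercase input is safe.
    while left < n and contig[left].islower():
        left += 1
    right = 0
    # Stop before reaching the left clip so no base is counted twice.
    while right < n - left and contig[n - 1 - right].islower():
        right += 1
    return left, right
```

With this guard, an all-lowercase contig such as "acgt" is reported as four left-clipped bases and zero right-clipped bases instead of raising an exception.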

kcleal commented 9 months ago

I've added a fix to the dysgu_dev branch on GitHub: https://github.com/kcleal/dysgu/tree/dysgu_dev. You will have to build dysgu from source while I finish some other changes for version 1.6.1. This can be done by cloning the dysgu_dev branch and running the INSTALL.sh script.

tbenavi1 commented 9 months ago

Thank you. I will take a look and see if this fixes my issue.

tbenavi1 commented 9 months ago

Hello, I installed the dev branch (commit 04eb5d1e7795ab2e27e1dd9a9e52baf2afee7cf7), but I still have the issue. I ran with the same file and got:

Traceback (most recent call last):
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/bin/dysgu", line 33, in <module>
    sys.exit(load_entry_point('dysgu==1.6.1', 'console_scripts', 'dysgu')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/lib/python3.11/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/lib/python3.11/site-packages/dysgu/main.py", line 467, in filter_normal
    filter_normals.run_filtering(ctx.obj)
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/lib/python3.11/site-packages/dysgu/filter_normals.py", line 907, in run_filtering
    good = process_intra(r, posB, bams, infile, bam_is_paired_end, support_fraction, pad=pad, sample=sample_name, keep_all=keep_all)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/lib/python3.11/site-packages/dysgu/filter_normals.py", line 829, in process_intra
    if cached and matching_soft_clips(r, cached, is_paired_end):
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/lib/python3.11/site-packages/dysgu/filter_normals.py", line 594, in matching_soft_clips
    break_seqs = BreakSeqs(r)
                 ^^^^^^^^^^^^
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/lib/python3.11/site-packages/dysgu/filter_normals.py", line 287, in __init__
    left_clip, right_clip = get_contig_clipped_bases(r.info["CONTIGA"])
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/lib/python3.11/site-packages/dysgu/filter_normals.py", line 264, in get_contig_clipped_bases
    while cont[i].islower():
          ~~~~^^^
IndexError: string index out of range

For reference, the filter command I ran was

dysgu filter --normal-vcf results/dysgu/NHA_p1/NHA_p1.svs.vcf -p 32 results/dysgu/NHA_E6E7_D12_PDL50/NHA_E6E7_D12_PDL50.svs.vcf results/BAMS/NHA_p1/T2T.NHA_p1.sorted.bam > results/dysgu/NHA_E6E7_D12_PDL50/NHA_E6E7_D12_PDL50.somatic.vcf

And the call command I ran was:

dysgu call --mode nanopore --mq 20 --overwrite -p 32 {input.ref} {params.tmp} {input.bam} > results/dysgu/NHA_E6E7_D12_PDL50/NHA_E6E7_D12_PDL50.svs.vcf

kcleal commented 9 months ago

Hi @tbenavi1, I think you might still be using the master branch by accident. Your traceback still shows the old code at line 264 of filter_normals.py, and that line has been changed on the dysgu_dev branch: https://github.com/kcleal/dysgu/blob/dysgu_dev/dysgu/filter_normals.py#L264
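
One way to check which copy of the code Python actually imports is to print the file path and the source line in question from inside the same environment. The sketch below uses only the standard library. The helper name `installed_source_line` is hypothetical, and the module name and line number are simply those shown in the traceback.

```python
import importlib.util
import linecache


def installed_source_line(module_name: str, lineno: int):
    """Return (path, source line) for the copy of module_name Python would import."""
    try:
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError:
        # Raised when the parent package of a dotted name is not installed.
        spec = None
    if spec is None or spec.origin is None:
        return None, None
    # linecache returns an empty string if the line or file cannot be read.
    return spec.origin, linecache.getline(spec.origin, lineno).rstrip("\n")


if __name__ == "__main__":
    path, line = installed_source_line("dysgu.filter_normals", 264)
    print(path)  # which site-packages copy is being used
    print(line)  # compare with line 264 on the dysgu_dev branch
```

If the printed line still reads `while cont[i].islower():`, the environment is most likely still importing the older installation rather than the freshly built one.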

kcleal commented 9 months ago

Version 1.6.1 should be available from PyPI tomorrow, if you don't mind waiting.