cortes-ciriano-lab / savana

Somatic structural variant caller for long-read data
Apache License 2.0

Field index out of range #41

Closed jakelee0711 closed 1 month ago

jakelee0711 commented 1 month ago

Hi colleagues,

This is Jake Lee at Memorial Sloan Kettering Cancer Center. First of all, thank you so much for developing this wonderful tool. I would love to try the recently added copy number feature, so I made a couple of attempts over the weekend after reinstalling SAVANA, and I would like to troubleshoot the errors with your guidance if possible. Previously I used version 1.0.5, which performed well in calling SVs; I am now using version 1.2.0. For the phased_vcf, I used a VCF file generated by my DeepVariant-Margin-WhatsHap pipeline. My reference genome is chm13v2.0 (T2T).

First, I tried to run "savana cna" with the previous breakpoint VCF file from version 1.0.5 ("somatic.classified.breakpoint.vcf"), but I got the error message below:

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/data1/shahs3/users/leej39/environment/miniconda3/envs/savana_env/lib/python3.9/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/data1/shahs3/users/leej39/environment/miniconda3/envs/savana_env/lib/python3.9/multiprocessing/pool.py", line 51, in starmapstar
    return list(itertools.starmap(args[0], args[1]))
  File "/data1/shahs3/users/leej39/environment/miniconda3/envs/savana_env/lib/python3.9/site-packages/savana/read_counter.py", line 59, in binned_read_counting
    blacklist = float(bin[4])
  File "pybedtools/cbedtools.pyx", line 524, in pybedtools.cbedtools.Interval.__getitem__
IndexError: field index out of range
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/data1/shahs3/users/leej39/environment/miniconda3/envs/savana_env/bin/savana", line 8, in <module>
    sys.exit(main())
  File "/data1/shahs3/users/leej39/environment/miniconda3/envs/savana_env/lib/python3.9/site-packages/savana/savana.py", line 496, in main
    args.func(args)
  File "/data1/shahs3/users/leej39/environment/miniconda3/envs/savana_env/lib/python3.9/site-packages/savana/savana.py", line 201, in savana_cna
    read_counts_path = read_counter.count_reads(outdir, args.tumour, args.normal, args.panel_of_normals, args.sample, bin_annotations_path, args.readcount_mapq, args.blacklisting, args.bl_threshold, args.bases_filter, args.bases_threshold, args.threads)
  File "/data1/shahs3/users/leej39/environment/miniconda3/envs/savana_env/lib/python3.9/site-packages/savana/read_counter.py", line 176, in count_reads
    countData = [x for xs in list(pool.starmap(binned_read_counting, args_in)) for x in xs]
  File "/data1/shahs3/users/leej39/environment/miniconda3/envs/savana_env/lib/python3.9/multiprocessing/pool.py", line 372, in starmap
    return self._map_async(func, iterable, starmapstar, chunksize).get()
  File "/data1/shahs3/users/leej39/environment/miniconda3/envs/savana_env/lib/python3.9/multiprocessing/pool.py", line 771, in get
    raise self._value
IndexError: field index out of range

I also tried running the new savana 1.2.0 from scratch, calling SVs again and then analyzing CNAs, but it gives me exactly the same error message. As for the output files, I see that the 10kbp_bin_ref...bed file is generated. The last part of that file is shown below.

chrY    62400001    62410000    99.99
chrY    62410001    62420000    99.99
chrY    62420001    62430000    99.99
chrY    62430001    62440000    99.99
chrY    62440001    62450000    99.99
chrY    62450001    62460000    99.99
chrY    62460001    62460029    96.55172413793103
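
The rows above have four columns, while the failing line in the traceback, `blacklist = float(bin[4])`, reads the field at index 4, i.e. a fifth column. Below is a minimal sketch of that mismatch. It uses plain string splitting as a stand-in for pybedtools, assumes the file is tab-separated, and the helper name `fifth_field` is hypothetical; the idea that the fifth column would hold a blacklist value is only inferred from the variable name in the traceback.

```python
# Last two rows of the bin annotation file quoted above.
# Assumption: the real file is tab-separated, as BED files usually are.
bed_lines = [
    "chrY\t62450001\t62460000\t99.99",
    "chrY\t62460001\t62460029\t96.55172413793103",
]


def fifth_field(line):
    """Return the field at index 4 as a float, or None if the row is too short.

    Hypothetical helper: it mirrors the `float(bin[4])` access from the
    traceback, but reports a missing column instead of raising IndexError.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) <= 4:
        return None
    return float(fields[4])


# Count the columns per row and try to read the fifth field of each row.
column_counts = [len(line.split("\t")) for line in bed_lines]
blacklist_values = [fifth_field(line) for line in bed_lines]

print(column_counts)
print(blacklist_values)
```

Every row here has only indices 0 to 3, so an unguarded access to index 4 would fail in the same way as the reported `IndexError`.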

Any thoughts on how I could fix this error would be greatly appreciated!

Best,

Jake June-Koo Lee, MD, PhD

cmsauer commented 1 month ago

Hi Jake, thank you for your comment! Please could you share some more detail on how exactly you are running savana cna? Are you providing a blacklist and which flags have you specified/used? Thank you! Carolin