maiziezhoulab / VolcanoSV

VolcanoSV enables accurate and robust structural variant calling in diploid genomes from single-molecule long read sequencing
MIT License

ValueError: too many values to unpack (expected 2) in volcanosv-vc-small-indel.py #7

Open DayTimeMouse opened 4 days ago

DayTimeMouse commented 4 days ago

Hi,

I have encountered another issue.

path_to_volcanosv=../VolcanoSV
python3 ${path_to_volcanosv}/bin/VolcanoSV-vc/Small_INDEL/volcanosv-vc-small-indel.py \
-i volcanosv_asm_output_tumor \
-o small_indel_output_tumor \
-bam hifi_tumor.bam \
-ref genome.fa \
-t 11 \
-px hifi_tumor

Error:

ValueError: too many values to unpack (expected 2)
 38%|█████████████████████████▋                                         | 144066/374930 [1:15:42<2:01:19, 31.72it/s]

Please see the attached log (log.txt) for details.

The issue might be with coords.split('-').

def count_kmers_in_region(bam_file, region, kmer_size):
    kmers = {}
    chrom, coords = region.split(":")
    start, end = coords.split("-")
    start, end = int(start), int(end)
    x = 0
    dc = get_seq(bam_file, chrom, start, end)
    for qname, seq in dc.items():
        for i in range(len(seq) - kmer_size + 1):
            kmer = seq[i:i+kmer_size]
            if kmer in kmers:
                kmers[kmer] += 1
            else:
                kmers[kmer] = 1
        x+=1
    # print(f"num reads in {region}: {x}")
    # print(len(kmers))

    return kmers

Best regards.

volcano1998 commented 3 days ago

What do the chromosome names in your reference look like? I suspect this is the same issue that caused the large indel step to fail. VolcanoSV expects the reference file to use names like chr1, chr2, etc.
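A quick way to answer the naming question is to list the contig names in the reference and flag any that contain ":" or "-", since those two characters are the separators used when a region string is split. The snippet below is only a sketch: fasta_contig_names and names_breaking_region_split are hypothetical helpers, not part of VolcanoSV, and the header lines are made-up examples standing in for the lines of genome.fa.

```python
from typing import Iterable, List


def fasta_contig_names(lines: Iterable[str]) -> List[str]:
    """Return the contig names (first token after '>') from FASTA lines."""
    names = []
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:  # skip a bare ">" header with no name
                names.append(fields[0])
    return names


def names_breaking_region_split(names: Iterable[str]) -> List[str]:
    """Return names containing ':' or '-', the separators in 'chrom:start-end'."""
    return [name for name in names if ":" in name or "-" in name]


# Made-up example lines; in practice, pass an open file handle for the reference.
example_lines = [
    ">chr1 primary assembly",
    "ACGTACGT",
    ">chr2",
    "ACGTACGT",
    ">HLA-A*01:01:01:01 alternate contig",
    "ACGTACGT",
]

names = fasta_contig_names(example_lines)
flagged = names_breaking_region_split(names)
print("contigs:", names)
print("names that would break a ':' / '-' split:", flagged)
```

If the flagged list is empty for the real reference, the contig names are probably not the cause and the region coordinates themselves are worth checking.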

DayTimeMouse commented 2 days ago

Yes, the format is chr1, chr2...
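The cause is not established in this thread, but for reference, the two-value unpacking in count_kmers_in_region fails whenever a split returns more than two pieces. In plain Python that happens if the contig name contains ":" or "-", or if the start coordinate is negative (for example "chr1:-50-100"). Below is a minimal sketch of a more defensive parser; parse_region is a hypothetical helper, not a function from VolcanoSV, and the region strings are made-up inputs.

```python
from typing import Tuple


def parse_region(region: str) -> Tuple[str, int, int]:
    """Parse 'chrom:start-end', splitting on the LAST ':' so that
    contig names containing ':' or '-' are kept intact."""
    chrom, colon, coords = region.rpartition(":")
    if not colon or not chrom:
        raise ValueError(f"Region {region!r} is not of the form chrom:start-end")
    start_str, dash, end_str = coords.partition("-")
    if not dash:
        raise ValueError(f"Region {region!r} has no '-' between start and end")
    try:
        start, end = int(start_str), int(end_str)
    except ValueError:
        # Reached for empty or non-numeric fields, e.g. a negative start.
        raise ValueError(f"Cannot parse coordinates in region {region!r}") from None
    return chrom, start, end


# The original two-value unpacking fails on a negative start coordinate:
try:
    start, end = "-50-100".split("-")
except ValueError as err:
    print("plain split failed:", err)

print(parse_region("chr1:100-200"))
print(parse_region("HLA-A*01:01:01:01:1-100"))
```

Because the error message includes the offending region string, the exact input that triggers the failure shows up in the log rather than only the generic unpacking error.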