bmvdgeijn / WASP

WASP: allele-specific pipeline for unbiased read mapping and molecular QTL discovery
Apache License 2.0

find_intersecting_snps.py: for pair in product(new_reads[0][group], new_reads[1][group]): IndexError: list index out of range #110

Open jegalle opened 2 years ago

jegalle commented 2 years ago

Hi

I'm currently having issues with find_intersecting_snps.py. I added some print statements to get more information; starting from line 825, the code now looks like this:

print(new_reads)
for group in range(len(new_reads[0])):
    print(group)
    for pair in product(new_reads[0][group], new_reads[1][group]):
        if len(unique_pairs) <= max_seqs:
            unique_pairs.add(pair)
        else:
            return False
return unique_pairs

When running this code, I get the following error somewhere in chr2:

[[{'GGTGAAACCCCGTCTCTACTAAAAATACAAACAATTAGCCAGTCNTCG'}], [{'GGGAGGCTGGGGCAGAGAATTGCTTGAACCCAGGAGGCAGAGGTTGCA', 'GGGAGGCTGGGGCAGAGAATTGCTTGAACCCGGGAGGCAGAGGTTGCA'}]]
0
[[{'CGGGCATGGTGGCACAAGCCTGTAGTCCCAGCTATTCAGGAGGCCGAG'}, {'CGGGCATGGTGGCACAAGCCTGTAGTCCCAGCTATTCAGGAGGCTGAG'}], [{'CAGGAGGCTGAGGTGGGAAGATCACTTGAACCCGGGAGGTTGAGGCCG', 'CAGGAGGCTGAGGTGGGAAGATCACTTGAACCTGGGAGGTTGAGGCTG', 'CAGGAGGCTGAGGTGGGAAGATCACTTGAACCTGGGAGGTTGAGGCCG', 'CAGGAGGCTGAGGTGGGAAGATCACTTGAACCCGGGAGGTTGAGGCTG'}]]
0
1
Traceback (most recent call last):
  File "WASP/mapping/find_intersecting_snps_edited.py", line 1096, in <module>
    samples=samples)
  File "WASP/mapping/find_intersecting_snps_edited.py", line 1069, in main
    samples=samples)
  File "WASP/mapping/find_intersecting_snps_edited.py", line 737, in filter_reads
    max_snps)
  File "WASP/mapping/find_intersecting_snps_edited.py", line 912, in process_paired_read
    max_seqs, pair_snp_idx, pair_snp_read_pos
  File "WASP/mapping/find_intersecting_snps_edited.py", line 828, in read_pair_combos
    for pair in product(new_reads[0][group], new_reads[1][group]):
IndexError: list index out of range
Closing remaining open files:snps/genotypes/haps.h5...donesnps/genotypes/snp_tab.h5...donesnps/genotypes/snp_index.h5...done

Any ideas on what's going on? Thanks! Jeroen
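For reference, the failure can be reproduced outside WASP from the second printed new_reads value: read 1 has two groups while read 2 has only one, so indexing new_reads[1][1] fails on the second loop iteration. The snippet below is a minimal sketch, not the WASP code itself; pair_groups is a hypothetical stand-in for the loop in read_pair_combos, with the max_seqs check left out.

```python
from itertools import product

# Second new_reads value from the debug output above:
# two groups for read 1, but only one group for read 2.
new_reads = [
    [{'CGGGCATGGTGGCACAAGCCTGTAGTCCCAGCTATTCAGGAGGCCGAG'},
     {'CGGGCATGGTGGCACAAGCCTGTAGTCCCAGCTATTCAGGAGGCTGAG'}],
    [{'CAGGAGGCTGAGGTGGGAAGATCACTTGAACCCGGGAGGTTGAGGCCG',
      'CAGGAGGCTGAGGTGGGAAGATCACTTGAACCTGGGAGGTTGAGGCTG',
      'CAGGAGGCTGAGGTGGGAAGATCACTTGAACCTGGGAGGTTGAGGCCG',
      'CAGGAGGCTGAGGTGGGAAGATCACTTGAACCCGGGAGGTTGAGGCTG'}],
]


def pair_groups(new_reads):
    """Hypothetical stand-in for the pairing loop in read_pair_combos."""
    unique_pairs = set()
    # The loop bound comes from read 1 only, so read 2 is assumed
    # to have at least as many groups.
    for group in range(len(new_reads[0])):
        for pair in product(new_reads[0][group], new_reads[1][group]):
            unique_pairs.add(pair)
    return unique_pairs


try:
    pair_groups(new_reads)
    error = None
except IndexError as exc:
    error = str(exc)

print(len(new_reads[0]), len(new_reads[1]), error)
```

The first printed new_reads value has one group per read, which is why the loop gets through it without an error.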

tianwen0003 commented 2 years ago

I found that this error occurs because the function group_reads_by_snps, which is called by read_pair_combos, does not merge the candidate variant-overlapping read 1 sequences into a single set, but instead puts them in separate sets. I added the following code between line 821 and line 822 of find_intersecting_snps.py:

# Merge the separate sets of each mate (0 = read 1, 1 = read 2) into a
# single set, so that both mates end up with exactly one group.
for mate in (0, 1):
    if len(new_reads[mate]) > 1:
        new_reads[mate] = [set().union(*new_reads[mate])]
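
To see the effect of this merge in isolation, here is a small self-contained sketch. The names merge_groups and pair_combos are hypothetical helpers written for illustration and are not part of WASP; the sequences are made-up toy strings, and the max_seqs check is omitted. Whether merging all groups is the right fix upstream has not been confirmed in this thread.

```python
from itertools import product


def merge_groups(groups):
    """Collapse a list of sets into a one-element list holding their union."""
    if len(groups) > 1:
        return [set().union(*groups)]
    return groups


def pair_combos(new_reads):
    """Hypothetical stand-in for read_pair_combos with the merge applied."""
    merged = [merge_groups(new_reads[0]), merge_groups(new_reads[1])]
    pairs = set()
    # After merging, each mate has at most one group, so the index
    # used for read 1 is also valid for read 2.
    for group in range(len(merged[0])):
        pairs.update(product(merged[0][group], merged[1][group]))
    return pairs


# Toy input with the same shape as the failing case:
# two groups for read 1, one group for read 2.
new_reads = [[{"AAC"}, {"AAT"}], [{"GGC", "GGT"}]]
pairs = pair_combos(new_reads)
print(len(pairs))  # 2 read-1 sequences x 2 read-2 sequences = 4 pairs
```

With the merge in place, both mates have a single group, so the loop runs once and pairs every read 1 sequence with every read 2 sequence.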