odelaneau / GLIMPSE

Low Coverage Calling of Genotypes
MIT License

Buffer bug??? #206

Open leedchou opened 5 months ago

leedchou commented 5 months ago

Hi,

I was trying different window and buffer parameters and getting weird results. When I run chunk with window-cm and window-mb less than or equal to 1, whatever buffer-cm and buffer-mb are, the buffer is always about ±1,000,000 bp, based on the window size. Below is an example:

0   chr1    chr1:1-2000069  chr1:1-1000066  1   1000003 33883   33471
1   chr1    chr1:51-3000098 chr1:1000067-2000088    1   1000005 55170   54534
2   chr1    chr1:1000079-4000121    chr1:2000089-3000102    1.00001 1000006 57783   56823
3   chr1    chr1:2000106-5000130    chr1:3000103-4000146    1.00001 1000014 54423   53362
4   chr1    chr1:3000172-6000191    chr1:4000147-5000194    1.00002 1000019 50088   49094
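To make the buffer sizes explicit, here is a minimal sketch that subtracts the imputation region from the buffered region for each chunk. It assumes column 3 is the buffered region and column 4 the imputation region, both written as chr:start-end as in the rows above; `chunk_buffers` is a hypothetical helper name, not part of GLIMPSE.

```shell
# Hypothetical helper (not a GLIMPSE tool): print chunk id, left buffer, right buffer in bp.
# Assumption: column 3 = buffered region, column 4 = imputation region, both chr:start-end.
chunk_buffers() {
    awk '{
        split($3, b, /[-:]/)   # b[2] = buffered start, b[3] = buffered end
        split($4, i, /[-:]/)   # i[2] = imputation start, i[3] = imputation end
        printf "%s\t%d\t%d\n", $1, i[2] - b[2], b[3] - i[3]
    }' "$1"
}
```

Called as `chunk_buffers chunks.txt` on the rows above, this reports roughly 1,000,000 bp on each side for every chunk except the first, which has no left buffer because it starts at position 1.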

Moreover, this causes errors when running ligate, such as: ERROR: Three files overlapping at position: 1005908
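The ligate error looks consistent with the chunk file itself: with buffers as wide as the windows, the buffered region of chunk 0 (ending at 2000069) still reaches past the start of chunk 2 (1000079), so positions in between, including 1005908, are covered by three chunks. A minimal sketch to flag such spans, assuming a single chromosome, chunks sorted by position with increasing starts and ends, and column 3 holding the buffered region; `triple_overlaps` is a hypothetical helper name, not part of GLIMPSE.

```shell
# Hypothetical helper (not a GLIMPSE tool): list spans shared by three consecutive chunks.
# Assumptions: one chromosome, rows sorted by position, column 3 = buffered region chr:start-end.
triple_overlaps() {
    awk '{
        split($3, b, /[-:]/)
        bstart[NR] = b[2] + 0; bend[NR] = b[3] + 0; id[NR] = $1
        # If chunk NR-2 ends at or after chunk NR starts, chunks NR-2, NR-1, NR share that span.
        if (NR > 2 && bend[NR-2] >= bstart[NR])
            printf "chunks %s,%s,%s overlap at %d-%d\n", id[NR-2], id[NR-1], id[NR], bstart[NR], bend[NR-2]
    }' "$1"
}
```

Running `triple_overlaps chunks.txt` before imputation gives a quick way to see whether a given set of window and buffer parameters would produce chunks that ligate is likely to reject.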

Is that a bug?

Best regards, Leed

datngu commented 4 months ago

Hi, I have the same issue. Do you know the solution to avoid this error?

Best, Dat

LouisLeNezet commented 1 month ago

I'm struggling with the same error on a small dataset. This error does not occur in GLIMPSE version 1. Here is some code to reproduce it.

wget https://raw.githubusercontent.com/nf-core/test-datasets/phaseimpute/hum_data/panel/chr21/1000GP.chr21.s.norel.vcf.gz
wget https://raw.githubusercontent.com/nf-core/test-datasets/phaseimpute/hum_data/panel/chr21/1000GP.chr21.s.norel.vcf.gz.csi

GLIMPSE_chunk \
        --input 1000GP.chr21.s.norel.vcf.gz --region chr21:16570000-16610000 \
        --window-size 10000 --window-count 400 --buffer-size 1000 --buffer-count 30 \
        --output test_chunks.txt # Gives +buffer:[chr21:16570070-16592229] and +buffer:[chr21:16588251-16609998]

GLIMPSE2_chunk \
        --input 1000GP.chr21.s.norel.vcf.gz --region chr21:16570000-16610000 \
        --sequential --window-mb 0.01 --window-cm 0.01 --window-count 400 --buffer-mb 0.001 --buffer-cm 0.001 --buffer-count 30 \
        --output test_chunks.txt # Gives a segmentation fault

GLIMPSE2_chunk \
        --input 1000GP.chr21.s.norel.vcf.gz --region chr21:16570000-16610000 \
        --sequential --window-mb 0.01 --window-cm 0.01 --window-count 200 --buffer-mb 0.001 --buffer-cm 0.001 --buffer-count 30 \
        --output test_chunks.txt # Gives +buffer:[chr21:16570070-16609998] and +buffer:[chr21:16570070-16609998]