Open Npaffen opened 1 year ago
Hello, I'm encountering the same error mentioned earlier: "ERROR: Three files overlapping at position: 96027772." using GLIMPSE2 ligate. Is there a possible solution to address this issue?
Hello, I am encountering the same error. I dealt with it by deleting one of the three files that contained the position causing trouble. Obviously not ideal, but it at least got me a file with most of my variants.
I'm also encountering this error: ERROR: Three files overlapping at position: 125160986 during the ligate step. Has there been any progress on this?
Same error here. Any updates on that?
Hi, thanks for reporting this. This is likely due to the chunking: the variant is present in more than two chunks (likely because of the large buffer). You can safely remove the variant from the first or the third file. Please adjust the three chunks in your file so that this does not happen again in subsequent imputation runs.
Will put a check at the chunking level.
Simone
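Until such a check exists in the chunking step, a rough pre-flight check along these lines could flag the problem before imputation. This is only a sketch under stated assumptions: it assumes the chunks file is whitespace-separated with the buffered region (e.g. `chr21:5030578-15573250`) in the third column, as the tutorial scripts appear to use it, and the function names are made up for illustration. It reports every interval that is covered by three or more buffered regions.

```python
from typing import List, Tuple


def parse_region(region: str) -> Tuple[str, int, int]:
    """Split 'chr21:100-200' into ('chr21', 100, 200)."""
    chrom, span = region.rsplit(":", 1)
    start, end = span.split("-")
    return chrom, int(start), int(end)


def buffered_regions_from_chunks(path: str) -> List[str]:
    """Read the buffered regions from a chunks file.

    ASSUMPTION: column 3 holds the region with buffers; adjust the
    index if your chunks file is laid out differently.
    """
    with open(path) as handle:
        return [line.split()[2] for line in handle if line.strip()]


def triple_overlaps(regions: List[str]) -> List[Tuple[str, int, int]]:
    """Return (chrom, start, end) intervals covered by >= 3 regions."""
    events = []
    for region in regions:
        chrom, start, end = parse_region(region)
        events.append((chrom, start, 1))      # coverage rises at start
        events.append((chrom, end + 1, -1))   # and drops after the inclusive end
    events.sort()  # at equal positions, closings (-1) sort before openings (+1)

    hits = []
    depth = 0
    open_at = 0
    for chrom, pos, delta in events:
        before = depth
        depth += delta
        if before < 3 <= depth:
            open_at = pos                     # third region starts covering here
        elif depth < 3 <= before:
            hits.append((chrom, open_at, pos - 1))
    return hits


if __name__ == "__main__":
    # Hypothetical regions, not taken from a real chunks file.
    demo = ["chr21:1-1000", "chr21:800-2000", "chr21:900-3000"]
    for chrom, start, end in triple_overlaps(demo):
        print(f"three or more chunks overlap at {chrom}:{start}-{end}")
```

An empty result means no position is covered by more than two buffered regions, which is the situation ligate seems to expect.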
Same error encountered. I found that the chunk files do indeed overlap if you follow the tutorials, some of them by more than 1 Mb. I was expecting to have two ways to solve this issue:
- In the imputation step, compute on the overlapping (buffered) region but output only non-overlapping results. I found two parameters, '--input-region' and '--output-region', but their descriptions confuse me: '--input-region arg: Imputation region with buffers' and '--output-region arg: Imputation region without buffers'. Can you confirm that these two parameters do what I expect?
- Another way is to collapse the duplicate variants in the ligate step. GLIMPSE_ligate does not provide such an option. I think bcftools 'concat' can achieve this, but I am not sure whether GLIMPSE_ligate has some special behaviour.
Regarding #2: I am not a GLIMPSE author, but to my understanding GLIMPSE ligate takes into account the phasing information when ligating, and bcftools concat doesn't.
bcftools concat (v1.16) has an option:
-l, --ligate Ligate phased VCFs by matching phase at overlapping haplotypes
which I guess does something similar.
Thanks, I was not aware that they added this feature. In that case, I do not know what the advantage of using GLIMPSE ligate is.
[GLIMPSE2] Ligate multiple output files into chromosome-wide files
Files:
Parameters:
Threads : [1]
Read filenames in [GLIMPSE_ligate/list.chr21.txt]
files = 14
Ligating chunks
samples = 1
Cnk 0 [chr21:5030578-14572952] [L=128973]
Buf 0 [chr21:14572990-15573250] [L_isec=23597 / L_tot=23597] [Avg #hets=588] [Switch rate=1] [Avg phaseQ=27.1098]
Cnk 1 [chr21:15573328-18750380] [L=76316]
Buf 1 [chr21:18750416-20268812] [L_isec=37008 / L_tot=37008] [Avg #hets=830] [Switch rate=0] [Avg phaseQ=2.45251]
ERROR: Three files overlapping at position: 18750486
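To see where the failing position sits relative to the segments ligate had already processed, the `Cnk`/`Buf` ranges can be pulled out of the log with a small script. This is a minimal sketch: the regular expression is written against the log lines pasted above and is an assumption about their format, not a documented GLIMPSE2 output specification, and the function names are hypothetical.

```python
import re
from typing import List, Tuple

# Matches e.g. "Buf 1 [chr21:18750416-20268812]" as printed in the log above.
SEGMENT = re.compile(r"(Cnk|Buf) (\d+) \[([^:\]]+):(\d+)-(\d+)\]")


def segments_in_log(log_text: str) -> List[Tuple[str, int, str, int, int]]:
    """Return (kind, index, chrom, start, end) for each Cnk/Buf entry."""
    return [
        (kind, int(idx), chrom, int(start), int(end))
        for kind, idx, chrom, start, end in SEGMENT.findall(log_text)
    ]


def segments_covering(log_text: str, position: int) -> List[str]:
    """List the logged segments whose range contains the given position."""
    return [
        f"{kind} {idx}"
        for kind, idx, _chrom, start, end in segments_in_log(log_text)
        if start <= position <= end
    ]


if __name__ == "__main__":
    log = (
        "Cnk 0 [chr21:5030578-14572952] [L=128973] "
        "Buf 0 [chr21:14572990-15573250] [L_isec=23597 / L_tot=23597] "
        "Cnk 1 [chr21:15573328-18750380] [L=76316] "
        "Buf 1 [chr21:18750416-20268812] [L_isec=37008 / L_tot=37008]"
    )
    print(segments_covering(log, 18750486))
```

For this log the position 18750486 falls only inside `Buf 1`, 70 bp after its start. If `Buf 1` is the overlap between the second and third files, as the labels suggest, then one more file must still cover that position, which would be consistent with the chunking explanation given earlier in the thread.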
I guess this is partly related to this error. I'm running an imputation on a low-coverage sample while also adding phased WGS data of the parents to the reference panel. In this case I tried a test run with the 1KG trio HG02024 (child) and HG02025 and HG02026 (parents). I'm unsure whether this information is related to the problem, but I thought it might be useful to add it.
How can I ligate the phased and imputed GLIMPSE2 chunks in a meaningful way? Do I need to adjust the chunk bins? I already followed the guidelines of the tutorial to obtain these results. When I do not add the parents to the reference panel, the whole GLIMPSE2 pipeline runs without an error!
Feel free to ask for any kind of information or clarification. I'm really impressed by the results so far and look forward to hopefully improving them by using parental data in the imputation process!
Best regards, Nils