statgen / Minimac4

GNU General Public License v3.0

low overlapping between target panel and reference panel #64

Closed Zhenghongc closed 1 year ago

Zhenghongc commented 1 year ago

Hello !!!

I was running genotype imputation on my data with Minimac4 and hit an error: Chunk 1 has less than 0.1% of variants from the GWAS panel overlapping with the reference panel. The variant information was as follows:

Reference Panel : Found 504 samples (1008 haplotypes) and 42503 variants ...
Target/GWAS Panel : Found 4418 samples (8836 haplotypes) and 16 variants ...
16 variants overlap with Reference panel
0 variants imported that exist only in Target/GWAS panel

I think the cause may be that I removed too many SNPs during quality control, or that too few SNPs were detected in this region. Would it be OK to set minRatio very low (even as low as 0) and then filter the imputed data based on the INFO score? Alternatively, when the overlap ratio is this low, can I skip the chunk?
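For reference, the arithmetic behind the error can be checked with a few lines of Python. This is only a sketch: it assumes the ratio is computed as overlapping variants divided by reference variants in the chunk, which is consistent with the log and the error message but is not confirmed in this thread.

```python
# Counts taken from the log quoted in this issue.
reference_variants = 42503
overlapping_variants = 16

# Assumption: ratio = overlapping variants / reference variants in the chunk.
overlap_ratio = overlapping_variants / reference_variants

# The 0.1% threshold quoted in the error message, expressed as a fraction.
threshold = 0.001

print(f"overlap ratio = {overlap_ratio:.4%}")
print("below threshold" if overlap_ratio < threshold else "passes threshold")
```

Under that assumption the chunk sits at roughly a third of the quoted threshold, so the failure is not a borderline case.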

jonathonl commented 1 year ago

If this chunk corresponds to a 20 Mbp region, then I would lean towards excluding the chunk.

If you did lower --minRatio for this chunk, I would compare the distribution of INFO/ER2 among the 16 variants that overlap this chunk with the distribution of INFO/ER2 in other chunks that pass the default --minRatio threshold. If INFO/ER2 for this chunk is considerably lower than for the other chunks, then it should definitely be excluded.
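One way to make that comparison is sketched below, using only the Python standard library. It assumes the imputed output has been converted to uncompressed VCF text and that ER2 appears as a key in the INFO column; the helper names `er2_values` and `summarize` are made up for illustration.

```python
import re
import statistics

# Match "ER2=<value>" as a whole INFO key (so "R2=" alone is not picked up).
ER2_PATTERN = re.compile(r"(?:^|;)ER2=([^;]+)")


def er2_values(vcf_lines):
    """Collect INFO/ER2 values from VCF record lines (header lines are skipped)."""
    values = []
    for line in vcf_lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        match = ER2_PATTERN.search(fields[7])
        if match is None:
            continue  # variant has no ER2 annotation
        try:
            values.append(float(match.group(1)))
        except ValueError:
            pass  # e.g. "." for a missing value
    return values


def summarize(values):
    """Return count, median and mean, or only the count if there are no values."""
    if not values:
        return {"n": 0}
    return {
        "n": len(values),
        "median": statistics.median(values),
        "mean": statistics.fmean(values),
    }
```

Running `summarize(er2_values(open(path)))` once for the low-overlap chunk and once for a chunk that passes the default threshold gives two summaries to compare side by side. With only 16 values in the suspect chunk, the median is likely a more stable comparison than the mean.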

Zhenghongc commented 1 year ago

Thanks for your reply. The chunk size is 20 Mbp, so I think it is better to exclude the chunk. Is there an option in Minimac4 to skip a chunk when its overlap is below minRatio? I am running the imputation in a Nextflow pipeline, so I need the chunk to be skipped rather than having the program exit with an error.


jonathonl commented 1 year ago

I think the only way to automate this currently is to lower --minRatio, parse the log file to determine "bad" regions, and then exclude the "bad" regions with bcftools (or similar). An option to exclude them in Minimac4 would be much better, and I plan on adding this in the future. I don't have an ETA for when that would be added, though.
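A rough sketch of that workaround in Python follows. It assumes one log per chunk (as in a per-chunk Nextflow process), that the log contains the two count lines quoted earlier in this issue in exactly that wording, and that the caller knows each chunk's region string. The function names and file names are hypothetical; the bcftools call relies on the `^` prefix of `--targets` to exclude regions.

```python
import re

# Patterns modelled on the log lines quoted in this issue; other Minimac4
# versions may word these lines differently.
REF_RE = re.compile(
    r"Reference Panel\s*:\s*Found \d+ samples \(\d+ haplotypes\) and (\d+) variants"
)
OVERLAP_RE = re.compile(r"(\d+) variants overlap with Reference panel")


def overlap_ratio(log_text):
    """Return overlapping / reference variant counts, or None if not found."""
    ref = REF_RE.search(log_text)
    overlap = OVERLAP_RE.search(log_text)
    if ref is None or overlap is None:
        return None
    n_ref = int(ref.group(1))
    return int(overlap.group(1)) / n_ref if n_ref else 0.0


def bad_regions(logs_by_region, threshold=0.001):
    """List regions whose parsed overlap ratio falls below the threshold."""
    bad = []
    for region, log_text in logs_by_region.items():
        ratio = overlap_ratio(log_text)
        if ratio is not None and ratio < threshold:
            bad.append(region)
    return bad


def bcftools_exclude_args(regions, in_path, out_path):
    """Build a `bcftools view` argument list that drops the given regions."""
    if not regions:
        raise ValueError("no regions to exclude")
    # A leading "^" on --targets tells bcftools to exclude the listed regions.
    targets = "^" + ",".join(regions)
    return ["bcftools", "view", "-t", targets, "-Oz", "-o", out_path, in_path]
```

The argument list can be passed to `subprocess.run(..., check=True)` once the regions and paths have been reviewed. The threshold of 0.001 mirrors the 0.1% figure in the error message and should be adjusted to whatever cutoff is actually wanted.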



jonathonl commented 10 months ago

A new option, --min-ratio-behavior skip, skips chunks that fall below the --min-ratio threshold (https://github.com/statgen/Minimac4/commit/144874ddad15e5bc4cdba5561c060e426c4bcaf5).
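For a pipeline step, the invocation might be assembled as in the sketch below. Only `--min-ratio-behavior skip` and `--min-ratio` come from this thread; the positional reference and target arguments, the `-o` output flag, and all file names are assumptions that should be checked against `minimac4 --help` for the installed version.

```python
import shlex


def minimac4_skip_command(reference, target, output, min_ratio=None):
    """Build a Minimac4 argument list that skips low-overlap chunks.

    Assumption: reference and target are positional and -o sets the output
    path; verify this layout against the installed version before use.
    """
    cmd = ["minimac4", reference, target, "-o", output,
           "--min-ratio-behavior", "skip"]
    if min_ratio is not None:
        # Optional override of the overlap threshold.
        cmd += ["--min-ratio", str(min_ratio)]
    return cmd


# Hypothetical file names, for illustration only.
cmd = minimac4_skip_command("reference.msav", "target.vcf.gz", "imputed.vcf.gz")
print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell quoting problems when it is later handed to `subprocess.run`.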