mquinodo / AutoMap

Tool to find regions of homozygosity (ROHs) from sequencing data.

Why are some of the ROHs not detected? #18

Closed: Azjer2014 closed this 1 year ago

Azjer2014 commented 1 year ago

I keep running into a problem where AutoMap does not detect all the ROHs, and I can't figure out why. The parameters I chose should allow all the large ROHs to be detected: --DP 10 --percaltlow 0.3 --minsize 5. Do you know what the issue could be?

Thank you!

Azjer2014 commented 1 year ago

I think I figured out the problem. Some regions have low mapping quality, and heterozygous-looking variants (likely false positives) show up in these regions as a result of misalignment. The region that maps to GOLGA6C and GOLGA6D is an example in my dataset. Duplications carrying SNPs and SNVs have the same effect. Your tool is not prepared to deal with these situations and therefore misses some ROHs or splits them.
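One possible workaround, sketched below, is to drop variants that fall inside known problematic intervals before calling ROHs. This is not an AutoMap feature: `mask_variants` is a hypothetical helper, and the coordinates are made-up placeholders, not the real positions of GOLGA6C or GOLGA6D.

```python
from typing import List, Tuple

Variant = Tuple[str, int]      # (chromosome, 1-based position)
Region = Tuple[str, int, int]  # (chromosome, start, end), both ends inclusive


def mask_variants(variants: List[Variant], regions: List[Region]) -> List[Variant]:
    """Return only the variants that lie outside every masked region."""
    kept = []
    for chrom, pos in variants:
        # A variant is masked if any region on the same chromosome contains it.
        in_masked = any(
            c == chrom and start <= pos <= end for c, start, end in regions
        )
        if not in_masked:
            kept.append((chrom, pos))
    return kept


# Placeholder interval standing in for a low-mappability region.
problem_regions = [("chr15", 1000, 2000)]
variants = [("chr15", 500), ("chr15", 1500), ("chr15", 2500), ("chr1", 1500)]
filtered = mask_variants(variants, problem_regions)
# filtered == [("chr15", 500), ("chr15", 2500), ("chr1", 1500)]
```

The linear scan over regions is fine for a short mask list. For a genome-wide mask, an interval tree or sorted-interval lookup would be a more reasonable choice.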

mquinodo commented 1 year ago

Dear Azjer, I suppose that your data is then of insufficient quality. Best, Mathieu

Azjer2014 commented 1 year ago

Hello Mathieu, Thanks for the response. The quality of my data is as good as any other short-read Illumina data. There are always false positives as a result of misalignment to repetitive or low-complexity regions, etc. The problem is that the tool expects perfect data, which is unrealistic. I think a parameter that allows a certain number of heterozygous variants to be overlooked might improve the detection of ROHs.
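To make the suggestion concrete, here is a minimal sketch of a scan that tolerates a fixed number of heterozygous calls per run. It is not AutoMap's algorithm; `find_rohs`, `max_het`, and `min_size` are hypothetical names, and the input is a simplified list of (position, is_het) pairs for one chromosome.

```python
from typing import List, Tuple

Call = Tuple[int, bool]      # (position, is_heterozygous), sorted by position
Roh = Tuple[int, int, int]   # (start, end, heterozygous calls inside the run)


def find_rohs(calls: List[Call], max_het: int = 0, min_size: int = 0) -> List[Roh]:
    """Greedy scan: a run starts and ends on homozygous calls and may
    contain at most `max_het` heterozygous calls in between."""
    rohs = []
    start = None      # position of the first homozygous call in the run
    end = None        # position of the last homozygous call in the run
    hets = 0          # heterozygous calls seen since `start`
    hets_inside = 0   # heterozygous calls located before `end`
    for pos, is_het in calls:
        if not is_het:
            if start is None:
                start, hets = pos, 0
            end, hets_inside = pos, hets
        elif start is not None:
            hets += 1
            if hets > max_het:
                # Too many heterozygous calls: close the run at the last
                # homozygous position and wait for the next homozygous call.
                rohs.append((start, end, hets_inside))
                start = None
    if start is not None:
        rohs.append((start, end, hets_inside))
    # Keep only runs spanning at least `min_size` base pairs.
    return [r for r in rohs if r[1] - r[0] >= min_size]


calls = [(100, False), (200, False), (300, True), (400, False), (500, False),
         (600, True), (700, True), (800, False), (900, False)]
strict = find_rohs(calls, max_het=0)    # [(100, 200, 0), (400, 500, 0), (800, 900, 0)]
tolerant = find_rohs(calls, max_het=1)  # [(100, 500, 1), (800, 900, 0)]
```

With `max_het=0` the single heterozygous call at position 300 splits the first run in two, which is the behaviour described above. Allowing one such call merges the pieces. A fixed count ignores run length, so a fraction of heterozygous calls per run might be a better criterion for large ROHs.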