alachins / raisd

RAiSD: software to detect positive selection based on multiple signatures of a selective sweep and SNP vectors

introduce mappable sites into calculation #16

Open rossibarra opened 4 years ago

rossibarra commented 4 years ago

In many genomes (most plants, for example), there is an abundance of repetitive DNA that is recalcitrant to short-read mapping. This means that if I break my genome down into fixed-width windows, I will have many windows (10 kb, for example) in which far fewer than 10 kb can actually be sequenced because of repetitive elements. When calculating mu^VAR this becomes a problem, because a larger physical distance for a fixed number of SNPs is interpreted as lower diversity, when it could simply be due to repeats reducing the number of bp that can actually be interpreted. It seems like it shouldn't be too hard to allow users to include a mask file that lists unmappable sites, and then to use this when calculating the numerator for mu^VAR.
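To make the concern concrete, here is a minimal Python sketch of how a mask could shrink the physical span attributed to a window. It is not RAiSD's implementation; the function names and the half-open `(start, end)` interval convention are assumptions chosen for illustration only.

```python
def masked_bp(window_start, window_end, masked):
    """Count masked bases inside the half-open window [window_start, window_end).

    `masked` is a list of non-overlapping half-open (start, end) intervals
    (hypothetical mask representation, not RAiSD's file format).
    """
    total = 0
    for start, end in masked:
        overlap = min(end, window_end) - max(start, window_start)
        if overlap > 0:
            total += overlap
    return total


def effective_span(window_start, window_end, masked):
    """Physical span of the window after subtracting unmappable bases."""
    return (window_end - window_start) - masked_bp(window_start, window_end, masked)


# A 10 kb window in which 7 kb is repetitive and cannot be mapped.
mask = [(2_000, 9_000)]
print(effective_span(0, 10_000, mask))  # 3000
```

With the raw span, the window's SNPs appear to be spread over 10 kb; with the masked span, the same SNPs are spread over 3 kb of interpretable sequence, which is the distinction the mask file is meant to capture.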

alachins commented 4 years ago

Thank you for this suggestion. I am currently working on a series of updates that are going to be released early January. I will implement your suggestion as well.

idaios commented 4 years ago

Hi Jeff, this is easy to implement. Would it be better for the user to provide a file with the mappable regions or a file with the unmappable regions?

Pavlos

rossibarra commented 4 years ago

Gah, sorry, not sure how I missed this. Either is easy for users with bedtools.
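The reason either choice works is that mappable and unmappable regions are complements of each other along a chromosome, which is what `bedtools complement` computes from a BED file and a genome file. A minimal pure-Python sketch of the same idea follows; the function name and the half-open `(start, end)` convention are assumptions for illustration, and this is not part of RAiSD or bedtools.

```python
def complement(intervals, chrom_length):
    """Return the parts of [0, chrom_length) not covered by `intervals`.

    `intervals` holds half-open (start, end) pairs in any order; they may
    overlap. Coordinates beyond chrom_length are clamped.
    """
    result = []
    cursor = 0  # first position not yet known to be covered
    for start, end in sorted(intervals):
        start = min(start, chrom_length)
        end = min(end, chrom_length)
        if start > cursor:
            result.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < chrom_length:
        result.append((cursor, chrom_length))
    return result


# Unmappable regions on a 10 kb chromosome -> mappable regions.
unmappable = [(2_000, 9_000)]
print(complement(unmappable, 10_000))  # [(0, 2000), (9000, 10000)]
```

Applying the function twice returns the merged original intervals, so a user holding either file can derive the other.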

akcorut commented 2 years ago

I am hoping to use your tool with a genome that may suffer from this same issue, and I was wondering whether the issue has already been resolved. Is there now an option that deals with this problem?

Thanks, Kivanc

alachins commented 2 years ago

You could use -X to provide a file with unmappable regions that will be excluded from the scan. You can find the format of this file here: https://github.com/alachins/raisd#excluding-regions-from-the-analysis

Best regards, Nikos


-- Nikolaos Alachiotis