szpiech / selscan

Haplotype based scans for selection
GNU General Public License v3.0
109 stars 33 forks source link

--max-extend-nsl & --max-extend #63

Closed Siavash-cloud closed 3 years ago

Siavash-cloud commented 3 years ago

Hello, I want to know, is there any way to calculate the optimized values of "max-extend-nsl" and "max-extend"? Also, you mentioned set "max-extend-nsl" =< 0 for no restriction in the manual of software. When I set it to zero, then I will get empty files (as results) for all of chromosomes. Please consider that I am using a low density SNP panel (41K SNP) data. So, I wonder if I must use any specific option in selscan regarding the density of my SNP data?

Regards, Siavash

szpiech commented 3 years ago

Please see this post https://github.com/szpiech/selscan/issues/60#issuecomment-857721960 for a discussion on --max-extend-nsl which also applies to --max-extend. In short, you need not worry about optimization, but feel free to set it very large.

As for setting it =<0, if you are not getting results, then there is a bug in the program. Can you give more detail?

Siavash-cloud commented 3 years ago

I humbly thank you. Of course, this is what I have done for chromosome 1 in selscan:

./selscan --nsl --max-extend-nsl 0 --trunc-ok --vcf chr_1.recode.vcf --map gen_1.map --pmap --threads 8 --keep-low-freq --out Chr1

Log: selscan v1.3.0 Opening chr_1.recode.vcf... Loading 80 haplotypes and 3207 loci... Opening chr_1.recode.vcf... Loading map data for 3207 loci Starting nSL calculations with alt flag not set. |==============================================================================| Finished.

Please inform me if you need more information. The best, Siavash

Siavash-cloud commented 3 years ago

Sorry for the confusing, I found the problem. it is due to the gaps in my data (lots of "Reached a gap of 230920bp > 200000bp." in my log file). I have another question. How we can provide a suitable value for max-gap option to solve warnings? ./selscan --nsl --max-extend-nsl 0 --keep-low-freq --trunc-ok --vcf chr_1.recode.vcf --map gen_1.map --pmap --threads 8 --out Chr1 v1.3.0 Calculating nSL. Input filename: chr_1.recode.vcf Map filename: gen_1.map Output file: Chr1.nsl.out Threads: 8 Scale parameter: 20000 Max gap parameter: 200000 EHH cutoff value: 0.05 Alt flag set: no WARNING: Reached a gap of 230920bp > 200000bp. Skipping calculation at 1_261070 WARNING: Reached chromosome edge before EHH decayed below 0. WARNING: Reached a gap of 230920bp > 200000bp. Skipping calculation at 1_30150 WARNING: Reached a gap of 230920bp > 200000bp. Skipping calculation at 1_296408 WARNING: Reached a gap of 207939bp > 200000bp. Skipping calculation at 1_755781 WARNING: Reached a gap of 230920bp > 200000bp. Skipping calculation at 1_292214 WARNING: Reached chromosome edge before EHH decayed below 0. WARNING: Reached a gap of 207939bp > 200000bp. Skipping calculation at 1_889838 WARNING: Reached a gap of 230920bp > 200000bp. Skipping calculation at 1_29763 WARNING: Reached a gap of 207939bp > 200000bp. Skipping calculation at 1_889481 WARNING: Reached a gap of 207939bp > 200000bp. Skipping calculation at 1_684531 WARNING: Reached a gap of 207939bp > 200000bp. Skipping calculation at 1_1096558 WARNING: Reached a gap of 207939bp > 200000bp. Skipping calculation at 1_1080402 WARNING: Reached a gap of 207939bp > 200000bp. Skipping calculation at 1_1392777 WARNING: Reached a gap of 207939bp > 200000bp. Skipping calculation at 1_1346011 WARNING: Reached a gap of 207939bp > 200000bp. Skipping calculation at 1_1644004 WARNING: Reached a gap of 207939bp > 200000bp. Skipping calculation at 1_1610099 WARNING: Reached a gap of 207939bp > 200000bp. Skipping calculation at 1_1809193 WARNING: Reached a gap of 207939bp > 200000bp. Skipping calculation at 1_1909371 WARNING: Reached a gap of 207939bp > 200000bp. Skipping calculation at 1_2176819 WARNING: Reached a gap of 207939bp > 200000bp. Skipping calculation at 1_2126664 WARNING: Reached a gap of 207939bp > 200000bp. Skipping calculation at 1_2193530 WARNING: Reached a gap of 246316bp > 200000bp. Skipping calculation at 1_2827599 WARNING: Reached a gap of 207939bp > 200000bp. Skipping calculation at 1_2248907 WARNING: Reached a gap of 246316bp > 200000bp. Skipping calculation at 1_2929129 ...

Also, when I set max-gap to 1,000,000, I get the results but I also get it in my log file:

./selscan --nsl --max-extend-nsl 0 --max-gap 1000000 --keep-low-freq --trunc-ok --vcf chr_1.recode.vcf --map gen_1.map --pmap --threads 8 --out Chr1 v1.3.0 Calculating nSL. Input filename: chr_1.recode.vcf Map filename: gen_1.map Output file: Chr1.nsl.out Threads: 8 Scale parameter: 20000 Max gap parameter: 1000000 EHH cutoff value: 0.05 Alt flag set: no WARNING: Reached chromosome edge before EHH decayed below 0. WARNING: Reached chromosome edge before EHH decayed below 0. WARNING: Reached chromosome edge before EHH decayed below 0. WARNING: Reached chromosome edge before EHH decayed below 0. WARNING: Reached chromosome edge before EHH decayed below 0. WARNING: Reached chromosome edge before EHH decayed below 0. WARNING: Reached chromosome edge before EHH decayed below 0. WARNING: Reached chromosome edge before EHH decayed below 0. WARNING: Reached chromosome edge before EHH decayed below 0. WARNING: Reached chromosome edge before EHH decayed below 0. WARNING: Reached chromosome edge before EHH decayed below 0. WARNING: Reached chromosome edge before EHH decayed below 0. WARNING: Reached chromosome edge before EHH decayed below 0. ...

I wonder if you have any suggestion to solve warnings? The best, Siavash,

szpiech commented 3 years ago

One way to go about making this choice is to examine the distribution of gaps in your data and choose a value such that X% are smaller than that. Be careful about centromeres, you may need to filter your scores post-computation that are near the centromeres if your max gap value is larger than the gaps in your data caused by centromeres.

If you have --trunc-ok set, then you do not need to worry about WARNING: Reached chromosome edge before EHH decayed below 0. You'll still get values reported for the sites that triggered that warning.

Siavash-cloud commented 3 years ago

Thank you!