odelaneau / shapeit4

Segmented HAPlotype Estimation and Imputation Tool
MIT License

Segmentation fault #37

Status: Closed (verne91 closed this issue 3 years ago)

verne91 commented 3 years ago

Hello, I ran into a segmentation fault at the very beginning of a SHAPEIT4 run.

shapeit4 --input 22.snps.vcf.gz --map chr22.b37.gmap.gz --region 22 --use-PS 0.0001 --thread 4 --log 22.phased.log --output 22.phased.vcf

SHAPEIT
  * Author        : Olivier DELANEAU, University of Lausanne
  * Contact       : olivier.delaneau@gmail.com
  * Version       : 4.1.3
  * Run date      : 15/10/2020 - 10:13:30

Files:
  * Input VCF     : [22.snps.vcf.gz]
  * Genetic Map   : [chr22.b37.gmap.gz]
  * Output VCF    : [22.phased.vcf]
  * Output LOG    : [22.phased.log]

Parameters:
  * Seed    : 15052011
  * Threads : 4 threads
  * MCMC    : 15 iterations [5b + 1p + 1b + 1p + 1b + 1p + 5m]
  * PBWT    : Depth of PBWT neighbours to condition on: 4
  * PBWT    : Store indexes at variants [MAC>=2 / MDR<=0.5 / Dist=0.02 cM]
  * HMM     : K is variable / min W is 2.50cM / Ne is 15000
  * HMM     : Recombination rates given by genetic map
  * HMM     : Inform phasing using VCF/PS field / Error rate of PS field is 0.0001
  * HMM     : !AVX2 optimization inactive!
  * IBD2    : length>=3.00cM [N>=150 / MAF>=0.010 / MDR<=0.500]

Initialization:
  * VCF/BCF scanning [N=2 / L=70040 / Reg=22] (0.53s)
  * VCF/BCF parsing [Hom=31.5% / Het=68.2% / Pha=9.031% / Mis=0.3%] (0.53s)
  * GMAP parsing [n=45329] (0.06s)
  * cM interpolation [s=17762 / i=52278] (0.01s)
  * PBWT indexing [l=2769] (0.00s)
  * HAP update (0.00s)
  * H2V transpose (0.00s)
  * IBD2 constraints [#inds=0 / #pairs=0] (0.01s)
  * PBWT phase sweep (0.01s)
  * Build genotype graphs [seg=31841] (0.00s)

Burn-in iteration [1/5]
  * V2H transpose (0.00s)
Segmentation fault (core dumped)

I have only two samples here. If I need a reference panel, which one would you recommend? Could you provide a link to download it?

Thanks!

odelaneau commented 3 years ago

Hi,

Two samples? Yes, you definitely need a reference panel.

If you work with human data:

  1. Go for 1000 Genomes if your samples are of non-European ancestry.
  2. Go for HRC if your samples are of European ancestry.

Otherwise, there are plenty of other population-specific panels, but I am not sure they are easily accessible.

Best,
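
For readers following the advice above, here is a minimal sketch of the original command extended with a reference panel. It assumes the panel is passed through SHAPEIT4's `--reference` option as a phased, indexed VCF/BCF covering the same region. The file name `reference_panel.chr22.bcf` and the helper `build_cmd` are hypothetical placeholders, not something given in this thread. The sketch only prints the command so it can be checked before running.

```shell
# Hypothetical file name: replace with the panel you actually downloaded
# (for example a chromosome 22 file from 1000 Genomes or HRC).
REF=reference_panel.chr22.bcf

# Hypothetical helper: prints the phasing command for a given reference panel.
# All other options are copied from the command in the original report.
build_cmd() {
  echo shapeit4 --input 22.snps.vcf.gz --map chr22.b37.gmap.gz --region 22 \
    --reference "$1" --use-PS 0.0001 --thread 4 \
    --log 22.phased.log --output 22.phased.vcf
}

# Print the command for review; nothing is executed here.
build_cmd "$REF"
```

Once the printed command looks right, it can be run directly in the shell. Whether this resolves the crash for a given data set is not confirmed in the thread; it only applies the recommendation to phase the two samples against a reference panel.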