odelaneau / shapeit5

Segmented HAPlotype Estimation and Imputation Tool
https://odelaneau.github.io/shapeit5/
MIT License

No variant found #88

Open niv280 opened 8 months ago

niv280 commented 8 months ago

Hi, I'm trying to use phase_common_static to phase a relatively small VCF using a reference panel, and I'm getting the "No variant found" error. I compared my VCF with the reference panel, and it seems there is at least one position that appears in both files.

Here is an example from my VCF: image

(I used fill-tags to add AC before running phase_common.)

Here is an example from my ref_panel: image

Any advice on why no variants are found?

RJHFMSTR commented 7 months ago

Hi niv280,

Can you share the shapeit5 command that you used, and the log file?

Thanks, Robin

niv280 commented 7 months ago

Hi, I think I understand the problem now. When I changed the ALT column from NON_REF to a different ALT allele, it worked. So it looks like the tool does not support NON_REF. Does that make sense?

michal-yoles commented 7 months ago

I have the same problem. I am running the command:

phase_common --input output.vcf.gz --output allchr6.phased.vcf --thread 8 --region 6

and get the error:

[SHAPEIT5] phase_common (jointly phase multiple common markers)

  • Author : Olivier DELANEAU, University of Lausanne
  • Contact : olivier.delaneau@gmail.com
  • Version : 5.1.1 / commit = f3ccf35 / release = 2024-02-27
  • Run date : 07/03/2024 - 12:03:05

Files:

  • Input : [output.vcf.gz]
  • Output : [allchr6.phased.vcf]
  • Output format : [bcf]

Parameters:

  • Seed : 15052011
  • Threads : 8 threads
  • MCMC : 15 iterations [5b + 1p + 1b + 1p + 1b + 1p + 5m]
  • PBWT : [window = 4cM / depth = auto / modulo = auto / mac = 5 / missing = 0.1]
  • HMM : [window = 4cM / Ne = 15000 / Constant recombination rate of 1cM per Mb]

Reading genotype data: [W::hts_idx_load3] The index file is older than the data file: output.vcf.gz.tbi

  • VCF/BCF scanning done (0.01s)
    • Variants [#sites=0 / region=6]

ERROR: No variants to be phased!

A snapshot of my VCF, which contains data from 130 individuals: image

What might be the problem, and how can I solve it?

RJHFMSTR commented 7 months ago

Hi niv280,

Your ALT allele must be A, C, T or G for SNPs, or an indel, but not "<NON_REF>" (e.g. as in your ref panel).
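
To find the offending records before running the tool, a short scan of the ALT column can help. The following is a minimal sketch, not part of SHAPEIT5: the function name `unsupported_alt_records` and the rule "an allele is plain if it consists only of A, C, G, T" are assumptions made for illustration.

```python
import re

# Assumption: a "plain" allele is a string of A/C/G/T only. Symbolic alleles
# such as <NON_REF> or <DEL>, and the '*' placeholder, do not match.
PLAIN_ALLELE = re.compile(r"^[ACGT]+$", re.IGNORECASE)


def unsupported_alt_records(vcf_lines):
    """Yield (chrom, pos, alt) for records with at least one non-plain ALT allele."""
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, alt = fields[0], int(fields[1]), fields[4]
        # ALT may list several alleles separated by commas
        if any(not PLAIN_ALLELE.match(allele) for allele in alt.split(",")):
            yield chrom, pos, alt


# Hypothetical records, made up for this example
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr6\t100\t.\tG\tA\t.\t.\t.",
    "chr6\t200\t.\tC\t<NON_REF>\t.\t.\t.",
    "chr6\t300\t.\tT\tTA,<NON_REF>\t.\t.\t.",
]
flagged = list(unsupported_alt_records(example))
print(flagged)
```

For a real file, the same function can be fed the lines of `gzip.open(path, "rt")` from the Python standard library.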

RJHFMSTR commented 7 months ago

Hi michal-yoles,

Try using --region chr6 instead of --region 6.

Best, Robin
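
A log line like `#sites=0 / region=6` can come from a region name that does not match the contig names in the file. The sketch below is one way to check this, under the assumption that the only mismatch is a missing or extra `chr` prefix; `contig_counts` and `suggest_region` are hypothetical helper names, not SHAPEIT5 functions.

```python
from collections import Counter


def contig_counts(vcf_lines):
    """Count records per contig name (first column of each VCF record)."""
    counts = Counter()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        counts[line.split("\t", 1)[0]] += 1
    return counts


def suggest_region(requested, counts):
    """Return a contig name present in the file, or None if nothing matches.

    Assumption: the only difference considered is the 'chr' prefix.
    """
    if requested in counts:
        return requested
    if requested.startswith("chr"):
        alternative = requested[3:]
    else:
        alternative = "chr" + requested
    return alternative if alternative in counts else None


# Hypothetical records, made up for this example
example = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr6\t100\t.\tG\tA\t.\t.\t.",
    "chr6\t200\t.\tC\tT\t.\t.\t.",
]
counts = contig_counts(example)
print(dict(counts))
print(suggest_region("6", counts))
```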

michal-yoles commented 7 months ago

Now I get a different error:

phase_common --input output.vcf.gz --output allchr6.phased.vcf --thread 8 --region chr6

[SHAPEIT5] phase_common (jointly phase multiple common markers)

  • Author : Olivier DELANEAU, University of Lausanne
  • Contact : olivier.delaneau@gmail.com
  • Version : 5.1.1 / commit = f3ccf35 / release = 2024-02-27
  • Run date : 07/03/2024 - 15:23:01

Files:

  • Input : [output.vcf.gz]
  • Output : [allchr6.phased.vcf]
  • Output format : [bcf]

Parameters:

  • Seed : 15052011
  • Threads : 8 threads
  • MCMC : 15 iterations [5b + 1p + 1b + 1p + 1b + 1p + 5m]
  • PBWT : [window = 4cM / depth = auto / modulo = auto / mac = 5 / missing = 0.1]
  • HMM : [window = 4cM / Ne = 15000 / Constant recombination rate of 1cM per Mb]

Reading genotype data:

ERROR: AC field is needed in file

But my VCF does contain AC (I added it after getting this error message earlier; it can be seen in the previous screenshot).

RJHFMSTR commented 7 months ago

Hi michal-yoles,

You also need to add the AN tag. AC and AN are used to compute the allele frequency. You can easily fill those tags, e.g. using bcftools +fill-AN-AC.

Cheers
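
To make the relationship between the two tags concrete: AC counts the ALT alleles in called genotypes, AN counts all called alleles, and the allele frequency is AC / AN. The sketch below illustrates this for a biallelic site. It is not how bcftools or SHAPEIT5 implement it; `count_ac_an` is a hypothetical helper and the genotypes are made up.

```python
import re


def count_ac_an(genotypes):
    """Return (AC, AN) for a biallelic site from GT strings such as '0|1' or './.'."""
    ac = 0
    an = 0
    for gt in genotypes:
        for allele in re.split(r"[|/]", gt):
            if allele == ".":
                continue  # missing alleles count toward neither AC nor AN
            an += 1
            if allele == "1":
                ac += 1
    return ac, an


# Hypothetical genotypes: one het, one hom-alt, one hom-ref, one missing
gts = ["0/1", "1/1", "0/0", "./."]
ac, an = count_ac_an(gts)
freq = ac / an if an else float("nan")
print(ac, an, freq)
```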

michal-yoles commented 6 months ago

It worked! Now I have a new error :) I ran the command:

phase_common --input NA_AC_filteredChr6All.vcf.gz --pedigree All_CH_FA_MO.fam --output allchr6.phased.vcf --thread 8 --region chr6

and encountered the following error:

ERROR: Fail to index file [allchr6.phased.vcf]

What does it mean? Although an output VCF is created, on inspection I see cases of Mendelian inconsistency. For example, my fam file looks like this: image. The VCF contains a SNP that looks like this:

chr6 398775 chr6_398775_G_A G A . . AC=32;AN=272 GT 0|1 0|1 0|1 0|1 0|0 1|0 0|0 0|1

where the two parents have haplotypes 0|0 and 1|0 (the sixth and the seventh) while the child has a haplotype of 0|1, which is not possible according to Mendelian rules. Is this related to the error I encountered?
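
One way to narrow this down is to test the trio at the genotype level first, ignoring allele order. The sketch below does only that, assuming fully called diploid genotypes; `is_mendelian_consistent` is a hypothetical helper. It says nothing about whether the order of alleles in the phased output carries parent-of-origin meaning, which is a separate question.

```python
import re
from itertools import product


def alleles(gt):
    """Split a GT string such as '0|1' or '0/1' into its two alleles."""
    return tuple(re.split(r"[|/]", gt))


def is_mendelian_consistent(father_gt, mother_gt, child_gt):
    """True if the child can be formed from one allele of each parent.

    Allele order (phase) is deliberately ignored; genotypes are assumed
    to be diploid with no missing alleles.
    """
    child = sorted(alleles(child_gt))
    return any(
        sorted(pair) == child
        for pair in product(alleles(father_gt), alleles(mother_gt))
    )


# Genotypes from the record quoted above: parents 1|0 and 0|0, child 0|1
print(is_mendelian_consistent("1|0", "0|0", "0|1"))
# A trio that does fail: neither parent carries the ALT allele
print(is_mendelian_consistent("0|0", "0|0", "0|1"))
```

If a trio passes this unordered check, any remaining concern is about which haplotype came from which parent rather than about the genotypes themselves.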