grenaud / glactools

command-line tools for the management of genotype likelihoods and allele counts
http://grenaud.github.io/glactools/
GNU General Public License v3.0

plink(bed-bim-fam) to treemix #19

Open josephresearcher opened 3 years ago

josephresearcher commented 3 years ago

Hi, I need to convert PLINK (bed/bim/fam) files to TreeMix input with this command: glactools bplink2acf --fai human_g1k_v37.fasta.fai afghan | glactools acf2treemix - | gzip > treemix.gz

But I get this error:

Cannot write to /dev/stdout Error: GlacParser tried to read 4 bytes but got 0

Data: https://evolbio.ut.ee/afghan/

grenaud commented 3 years ago

Hello! I managed to replicate the error using: glactools bplink2acf --fai human_g1k_v37.fasta.fai afghan > /dev/null

bplink2acf: WARNING: The reference allele between the EPO/FASTA (G) was not found in the the .bim file (found: C,T) at chr:pos 1:5732473 at line 1 rs771071 0 5732473 T C

So I looked at the UCSC browser, and it shows that rs771071 has reference allele G changing to A. The sequence around this SNP is GGC.

samtools faidx seems to agree:

1:5732472-5732474 GGC

Are you sure the .bim file is on v37 and not v38?
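The check described above can be sketched in a few lines of Python. This is only an illustration of the comparison, not glactools' actual code: the helper names `parse_bim_line` and `reference_in_alleles` are hypothetical, and the reference base is taken from the `GGC` string quoted from samtools faidx rather than read from a FASTA file.

```python
def parse_bim_line(line):
    """Split one PLINK .bim record into its six whitespace-separated fields."""
    chrom, snp_id, _cm, pos, a1, a2 = line.split()
    return {
        "chrom": chrom,
        "id": snp_id,
        "pos": int(pos),  # base-pair coordinate, 1-based
        "alleles": (a1.upper(), a2.upper()),
    }


def reference_in_alleles(record, ref_base):
    """Return True if the reference base is one of the two .bim alleles."""
    return ref_base.upper() in record["alleles"]


# The .bim line quoted in the warning above.
record = parse_bim_line("1 rs771071 0 5732473 T C")

# samtools faidx returned GGC for 1:5732472-5732474, so the base at
# position 5732473 is the middle character of that string.
ref_base = "GGC"[1]

print(record["id"], reference_in_alleles(record, ref_base))
```

For this record the check fails, because G is neither T nor C, which is the condition the warning reports.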

josephresearcher commented 3 years ago

Thank you, dear Renaud. I tried several files; unfortunately, the problem persists.

grenaud commented 3 years ago

This is not a bug. It is a safeguard that checks that the reference and the .bim file are consistent. How is it possible that this line is in your .bim file with the wrong reference allele? It should be a G.

josephresearcher commented 3 years ago

I gave you the links to these files. What do you think the solution is? https://evolbio.ut.ee/ https://evolbio.ut.ee/afghan/

grenaud commented 3 years ago

The solution is to try to understand why your .bim file contains a SNP for which the reference allele is absent and the bases do not match UCSC. Go over the steps you used to generate the .bed, .bim and .fam files and try to track down the issue.
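One way to start tracking this down is to tally how many .bim records disagree with the reference, and in what way. The sketch below is a rough aid, not part of glactools: `classify_site`, `tally_bim` and the `ref_lookup` dictionary are hypothetical names, and in practice the reference bases would have to be extracted from the FASTA (for example with samtools faidx). It also tests one possible explanation that the thread does not confirm: that the alleles were reported on the opposite strand, in which case the complement of one allele matches the reference.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def classify_site(ref_base, allele1, allele2):
    """Label a site as 'match', 'strand_flip' or 'mismatch'."""
    ref = ref_base.upper()
    alleles = {allele1.upper(), allele2.upper()}
    if ref in alleles:
        return "match"
    # Complement each allele; unknown symbols (e.g. 0, I, D) map to N.
    flipped = {COMPLEMENT.get(a, "N") for a in alleles}
    if ref in flipped:
        return "strand_flip"
    return "mismatch"


def tally_bim(bim_lines, ref_lookup):
    """Count site categories for .bim lines.

    ref_lookup maps (chromosome, 1-based position) to a reference base.
    """
    counts = Counter()
    for line in bim_lines:
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        chrom, _snp_id, _cm, pos, a1, a2 = fields
        ref = ref_lookup.get((chrom, int(pos)))
        if ref is None:
            counts["no_reference"] += 1
        else:
            counts[classify_site(ref, a1, a2)] += 1
    return counts


# The record from the warning, with the reference base reported in the thread.
example_lines = ["1 rs771071 0 5732473 T C"]
example_ref = {("1", 5732473): "G"}
print(tally_bim(example_lines, example_ref))
```

For the record in the warning, the complement of C is G, so this sketch labels it `strand_flip`. If most failing sites fall in that category, a strand issue in how the PLINK files were generated would be worth investigating; if they are plain mismatches, a different genome build is a more likely cause.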