Closed geneanalyst closed 6 years ago
In eigenstrat format, each line of the geno file must contain exactly as many characters as there are samples (individuals). Here you have a line with 478 characters but 5253 samples.
By the way, I don't recommend eigenstrat format for such large datasets. Use convertf to make packed format. ADMIXTOOLS will then run much faster.
Nick
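The line-length rule described above can be checked before running qpDstat. The following is a minimal sketch, not part of ADMIXTOOLS: the function names are hypothetical, and it assumes an unpacked (text) eigenstrat .geno file and an .ind file with one sample per non-blank line.

```python
def count_samples(ind_path):
    """Count non-blank lines in the .ind file (assumed: one sample per line)."""
    with open(ind_path) as fh:
        return sum(1 for line in fh if line.strip())


def find_bad_geno_lines(geno_path, n_samples, limit=10):
    """Return up to `limit` (line_number, length) pairs for lines in a text
    eigenstrat .geno file whose character count differs from n_samples."""
    bad = []
    with open(geno_path) as fh:
        for lineno, line in enumerate(fh, start=1):
            # Strip only the line terminator; every other character is a genotype.
            length = len(line.rstrip("\r\n"))
            if length != n_samples:
                bad.append((lineno, length))
                if len(bad) >= limit:
                    break
    return bad
```

Called with the files from this thread, for example find_bad_geno_lines("MASTER_HiCov27.geno", count_samples("MASTER_HiCov27.ind")), it would list the first lines whose length does not match the sample count. It only makes sense for text eigenstrat files, not packed ones.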
On Thu, Jun 7, 2018 at 12:24 PM, geneanalyst wrote:
I have previously run qpdstat with no problem. Now with a new dataset I got the following error:
fatalx: (ineigenstrat) mismatch line length 478 5253
Aborted (core dumped)
And the output file contains this error:
/home/d/ADMIXTOOLS1/qpDstat: parameter file: pardstat1
THE INPUT PARAMETERS
PARAMETER NAME: VALUE
indivname: MASTER_HiCov27.ind
snpname: MASTER_HiCov27.snp
genotypename: MASTER_HiCov27.geno
popfilename: list_dstat
qpDstat version: 711
(ineigenstrat) bad line 1019420 ::1121212121110211
Any ideas as to what the problem is?
Thanks
Any idea what may be causing this? I converted from PLINK map and ped files.
Can I go from PLINK map, ped, and pedind files (the pedind is basically a fam file with column 6 mirroring column 1) to packed format using the following par file?
genotypename: SASIA.ped
snpname: SASIA.map
indivname: SASIA.pedind
outputformat: PACKEDPED
genotypeoutname: SASIA.geno
snpoutname: SASIA.snp
indivoutname: SASIA.ind
familynames: YES
In my usage outputformat is lower case, so you should be able to leave out the outputformat line. But why is your output then in eigenstrat format?
Possibilities:
1) system trouble (file too big and truncated?)
2) misformatted PLINK files?
By the way, you can run convertf with input files SASIA.geno etc. and no output files. In effect, this tests whether your files are well-formed.
Nick
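Both suggestions above (leaving out the outputformat line, and running convertf with inputs only as a well-formedness test) come down to which lines appear in the par file. As a rough sketch, a small helper can generate either variant. The helper name and its arguments are hypothetical; the parameter names are the ones used in this thread.

```python
def convertf_par(geno, snp, ind, out_prefix=None, extra=None):
    """Build the text of a convertf par file.

    With out_prefix=None no output names are written, which corresponds to
    the input-only check suggested above. No outputformat line is ever
    emitted, so convertf falls back to its default output format.
    """
    lines = [
        f"genotypename: {geno}",
        f"snpname: {snp}",
        f"indivname: {ind}",
    ]
    if out_prefix is not None:
        lines += [
            f"genotypeoutname: {out_prefix}.geno",
            f"snpoutname: {out_prefix}.snp",
            f"indivoutname: {out_prefix}.ind",
        ]
    # Optional additional parameters, e.g. {"familynames": "YES"}.
    for key, value in (extra or {}).items():
        lines.append(f"{key}: {value}")
    return "\n".join(lines) + "\n"
```

For example, convertf_par("SASIA.ped", "SASIA.map", "SASIA.pedind", out_prefix="SASIA", extra={"familynames": "YES"}) reproduces the par file from this thread minus the outputformat line. Writing the result to a file and passing that file to convertf with -p is assumed to match the usual ADMIXTOOLS invocation.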
I converted using the above par file. Then ran qpdstat. Now I have this error:
/home/d/ADMIXTOOLS1/qpDstat -p pardstat1 > dstat_SIS.txt
fatalx: bad chrom: rs3094315
Aborted (core dumped)
The .snp file looks fine to me:
1 rs3094315 0.020130 752566 A G
1 rs12124819 0.020242 776546 A G
1 rs28765502 0.022137 832918 T C
1 rs7419119 0.022518 842013 T G
Oh! I think I've realized your trouble. outputformat packedped is packed PLINK. The default, and what I recommend, is packed ancestrymap. Your .snp file is in PLINK format, NOT Reich lab format (see examples). Rerun with no outputformat set and observe that the .snp file now begins rs3094315 1 ... that is, with the first two columns flipped. I've thought about making my software look at the file contents to determine the format, but for now a .snp file must be in Reich lab format.
Nick
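The column flip described above is easy to spot programmatically. The following heuristic is only a sketch with hypothetical function names: it assumes the first two whitespace-separated fields are a chromosome and a SNP ID in one order or the other, and it is not how ADMIXTOOLS itself reads the file.

```python
def looks_like_chromosome(token):
    """Heuristic: a number, or a common sex/mitochondrial label, optionally
    prefixed with 'chr'."""
    t = token.lower()
    if t.startswith("chr"):
        t = t[3:]
    return t.isdigit() or t in {"x", "y", "xy", "mt", "m"}


def snp_column_order(line):
    """Classify one .snp line as 'plink' (chromosome first), 'reichlab'
    (SNP ID first, chromosome second) or 'unknown'."""
    fields = line.split()
    if len(fields) < 2:
        return "unknown"
    first, second = fields[0], fields[1]
    if looks_like_chromosome(first) and not looks_like_chromosome(second):
        return "plink"
    if looks_like_chromosome(second) and not looks_like_chromosome(first):
        return "reichlab"
    return "unknown"
```

Applied to the first line of the .snp file quoted in this thread, the function reports the PLINK ordering, which is consistent with the "bad chrom: rs3094315" error: the SNP ID is sitting where the chromosome is expected.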
Converted without outputformat set. Problem solved. Thanks.