SoniaAndrade closed this issue 6 years ago
Hello
Thank you for attaching your plink dataset -- it was very helpful in figuring out what the issue is:
The bim file contains unrecognized chromosome codes, e.g. "scaffold_3" instead of "1". Plink can process this dataset if we specify the --allow-extra-chr option. However, it seems that libplinkio cannot handle these non-standard chromosome codes.
A simple way to deal with this is to modify a copy of the bim file. For example, on the command line we can "assign" all SNPs to chromosome 1 for the purpose of computing genetic dissimilarities with bed2diffs:
cp datapath.bed newdatapath.bed
cp datapath.fam newdatapath.fam
awk '{print 1,$2,$3,$4,$5,$6}' datapath.bim > newdatapath.bim
Hopefully,
./bed2diffs_v1 --bfile newdatapath
will execute without errors.
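To confirm that the rewritten bim file now carries only a standard chromosome code, a quick check along the following lines may help. This is only a sketch: the function name chr_codes is hypothetical, and it assumes the chromosome code is the first whitespace-separated column of the bim file, as in the awk command above.

```shell
# Hypothetical helper: count how often each chromosome code (column 1)
# occurs in a .bim file and print "code count" pairs, sorted by code.
chr_codes() {
    awk '{ count[$1]++ } END { for (c in count) print c, count[c] }' "$1" | sort
}

# Example usage (file names follow the commands above):
#   chr_codes datapath.bim       # would list codes such as scaffold_3
#   chr_codes newdatapath.bim    # should list only the code 1
```

If the second call reports anything other than the single code 1, the copy of the bim file was not rewritten as intended.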
Thanks for raising this issue. I've added a comment to the bed2diffs error message to check that the plink dataset has a standard format if there is a libplinkio error.
Thanks, it worked perfectly! I had to use the scaffold_ codes because plink cannot process more than ~90 chromosomes and I am working with GBS data from a non-model organism. Thanks again!
Dear Dr Petkova,
Using bed2diffs on my Plink files (.bed, .fam and .bim, all in the same directory), I got the following message:
./bed2diffs_v1 --bfile palma2 --nthreads 4
Compute the average genetic differences according to:
Dij = (1/|Mij|) sum_{m in Mij} (z_{im} - z_{jm})^2
where Mij is the set of SNPs where both i and j are called
[Data::getsize] Error opening plink files palma2.[bed/bim/fam]
Please see the input files attached (test.zip). I am wondering what might be wrong with the dataset, which is extensive (it is derived from GBS data) but seems fine. Could you please help me with that? Thanks.