ewmorr closed this issue 3 years ago
Do you really have a bed file here? You don't seem to have used --make-bed.
Thanks for your quick response. Sorry I left that out. The full command was
plink --vcf test_dat.vcf --recode12 --mind 0.1 --geno 0.1 --out test_dat --allow-extra-chr --make-bed --missing-genotype 9
I don't think you need --recode12 and --missing-genotype 9 when making bed files.
Are you sure you're reading the bed file with pcadapt, and not some text file instead?
With regard to --missing-genotype 9, the pcadapt manual Getting Started page says "missing values should be encoded by a single character (e.g. 9) different from 0, 1 or 2."
Here's how I'm reading the file:
require(pcadapt)
path_to_file <- "test_dat.bed"
filename <- read.pcadapt(path_to_file, type = "bed")
Calling filename in R then gives
[1] "test_dat.bed"
attr(,"n")
[1] 16
attr(,"p")
[1] 70
attr(,"class")
[1] "pcadapt_bed"
So it looks like pcadapt recognizes a bed file? I would note that I can also read a PED (i.e. a text file); I just get a warning that the PED-to-BED conversion within pcadapt is deprecated...
Nevertheless, I'm probably doing something silly, so if the answer isn't obvious I can continue to do some digging.
That sentence from the manual refers to when you're using a text file as input.
Yes, it seems that you're reading a bed file indeed. Is it normal that it is so small though?
You can use pcadapt::bed2matrix() to convert it to a matrix and probably see from there what the problem is.
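A minimal sketch of that inspection step, in base R. The helper summarise_genotypes is hypothetical (it is not part of pcadapt), and the toy matrix only stands in for the output of pcadapt::bed2matrix(); it assumes genotypes come back as 0/1/2 allele counts with missing calls as NA.

```r
# Summarise a genotype matrix coded as allele counts (0/1/2),
# with NA assumed to mark missing calls.
summarise_genotypes <- function(G) {
  list(
    dim = dim(G),                                   # matrix dimensions
    counts = table(as.vector(G), useNA = "ifany"),  # how often each code occurs
    max_code = max(G, na.rm = TRUE)                 # largest allele count seen
  )
}

# Toy data standing in for a real genotype matrix (illustrative only).
G <- matrix(c(0L, 2L, 1L, NA, 2L, 0L), nrow = 2)
s <- summarise_genotypes(G)
print(s$counts)

# With the real file (needs pcadapt plus test_dat.bed/.bim/.fam on disk):
# G <- pcadapt::bed2matrix("test_dat.bed")
# summarise_genotypes(G)
```

A code of 2 appearing in the counts would indicate diploid-style genotype calls, and unexpectedly small dimensions would suggest that samples or variants were dropped during filtering.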
Hi again,
Thanks for that function, that set me on the right track. I had set ploidy = 1 in pcadapt, but I have diploid variant calls from the pipeline I'm using (despite it being a haploid variant caller). Rookie mistake. pcadapt now runs with no SVD error.
Thanks! Eric
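The ploidy mismatch described above can be caught before running pcadapt with a small check like the following sketch. The helper codes_fit_ploidy is hypothetical (not a pcadapt function), the toy matrix is illustrative, and K = 2 in the commented call is an arbitrary assumption, not a value from this thread.

```r
# Return TRUE when every non-missing genotype code lies in 0..ploidy,
# i.e. the allele counts are consistent with the stated ploidy.
codes_fit_ploidy <- function(G, ploidy) {
  all(G >= 0 & G <= ploidy, na.rm = TRUE)
}

# Toy diploid-coded data (illustrative only); NA marks a missing call.
G <- matrix(c(0L, 2L, 1L, NA, 2L, 0L), nrow = 2)

fits_haploid <- codes_fit_ploidy(G, ploidy = 1)  # FALSE: a count of 2 is present
fits_diploid <- codes_fit_ploidy(G, ploidy = 2)  # TRUE
print(c(haploid = fits_haploid, diploid = fits_diploid))

# With the real data (needs pcadapt and the bed/bim/fam files on disk):
# filename <- pcadapt::read.pcadapt("test_dat.bed", type = "bed")
# res <- pcadapt::pcadapt(filename, K = 2, ploidy = 2)
```

If the check fails for ploidy = 1 but passes for ploidy = 2, the calls are diploid-coded and pcadapt should be run with ploidy = 2, even when the organism itself is haploid.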
Hello,
I'm attempting to run a test dataset of 81 variants and 71 genotypes through pcadapt. These are VCF v4.2 data that have been filtered using plink v1.9 with --mind 0.1 --geno 0.1 --recode12 --missing-genotype 9 to produce BED files for loading into pcadapt v4.3.3 in R v3.6.2. The data are from a de novo assembled fungus, and so the first six columns of each line of the associated PED files read as sampleName sampleName 0 0 0 -9 (with sampleName assigned to unique codes for each sample). I get no errors on reading the data, but on attempting to run pcadapt I get the following:
I'm new to population genomics analysis, but it seems clear to me that this is not a missing data issue? Is there a file formatting issue here? (I do have .bim and .fam files contained in the same directory as the .bed).
Thanks for any pointers.
Best, Eric