I seem to have a problem with loading data from vcf.
I have a sample set with SNP calls from a fungal organism. Each individual is haploid and sequenced individually. The loading doesn't produce an error, but also doesn't seem to generate a correct data object, because pcadapt throws an error.
As you can see in the attached small toy VCF, there are no individuals or SNPs with missing values. The attached file has 20 samples, and AC is always 1 or more (in a few cases it is 20), so there are definitely no samples without SNPs.
Is this related to the data being haploid, or am I overlooking something else here?
x <- pcadapt(input = data, K = 2, ploidy = 1, pca.only = TRUE)
Error: Can't compute SVD.
Are there SNPs or individuals with missing values only?
You should use PLINK for proper data quality control.
As stated in the documentation, we do not maintain the VCF format anymore.
Please use PLINK for conversion (to bed/bim/fam) and also for quality control.
Thanks for your help,
Remco
CombinedSamples.np.filt.part.vcf.gz
> data <- read.pcadapt("CombinedSamples.np.filt.part.vcf", type = "vcf")
No variant got discarded.
Summary:
input file: CombinedSamples.np.filt.part.vcf
output file: /tmp/RtmpjLRmlq/file2851d6b60ab.pcadapt
number of individuals detected: 20
number of loci detected: 43
43 lines detected.
20 columns detected.
> x <- pcadapt(input = data, K = 2, ploidy = 1, pca.only = TRUE)
Error: Can't compute SVD. Are there SNPs or individuals with missing values only? You should use PLINK for proper data quality control.
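
The error message asks whether any SNP or individual consists of missing values only. One way to check that directly against the VCF, independently of pcadapt, is to scan the genotype columns. The following is a minimal base-R sketch, not part of pcadapt: `check_vcf_genotypes` is a hypothetical helper name, and it assumes an uncompressed, tab-delimited VCF in which GT is the first FORMAT key and missing calls are written as `.` (or `./.` and `.|.` for diploid data). The toy file at the end is made up for illustration and is not the attached data.

```r
# Hypothetical helper (not part of pcadapt): report SNPs and samples whose
# genotype calls are all missing in an uncompressed VCF.
check_vcf_genotypes <- function(path) {
  lines <- readLines(path)
  header <- lines[startsWith(lines, "#CHROM")]
  body <- lines[!startsWith(lines, "#") & nzchar(lines)]
  stopifnot(length(header) == 1, length(body) > 0)

  # Sample names follow the nine fixed VCF columns.
  samples <- strsplit(header, "\t", fixed = TRUE)[[1]][-(1:9)]
  fields <- strsplit(body, "\t", fixed = TRUE)

  # One row per SNP, one column per sample; keep only the GT part
  # (assumption: GT is the first colon-separated FORMAT key).
  gt <- do.call(rbind, lapply(fields, function(f) sub(":.*$", "", f[-(1:9)])))

  # Assumption: missing calls are encoded as ".", "./." or ".|.".
  missing <- array(gt %in% c(".", "./.", ".|."), dim = dim(gt))

  list(
    n_snps = nrow(gt),
    n_samples = ncol(gt),
    snps_all_missing = which(rowSums(missing) == ncol(gt)),
    samples_all_missing = samples[colSums(missing) == nrow(gt)]
  )
}

# Made-up toy input: three haploid samples, three SNPs, second SNP all missing.
vcf_row <- function(...) paste(c(...), collapse = "\t")
toy <- c(
  "##fileformat=VCFv4.2",
  vcf_row("#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO",
          "FORMAT", "s1", "s2", "s3"),
  vcf_row("chr1", "10", ".", "A", "G", ".", "PASS", ".", "GT", "0", "1", "1"),
  vcf_row("chr1", "20", ".", "C", "T", ".", "PASS", ".", "GT", ".", ".", "."),
  vcf_row("chr1", "30", ".", "G", "A", ".", "PASS", ".", "GT:DP", "1:8", "0:5", ".:0")
)
toy_path <- tempfile(fileext = ".vcf")
writeLines(toy, toy_path)

res <- check_vcf_genotypes(toy_path)
print(res)
```

If both `snps_all_missing` and `samples_all_missing` come back empty for the real file, the VCF itself has no missing-only rows or columns, which would point toward the conversion step rather than the data.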