Hi there,
I initially posted this as a comment on closed issue #77, but I'm re-adding it here as a new issue.
The code producing the error is as follows:
path_to_file <- ("/Users/JMAC/Library/CloudStorage/Dropbox/Research/Humboldt/CCGA_full_sequencing/WG_outlier_analysis/WG_pcadapt/downsampled_10X/10x_TOA_only_filtered_SNPs_all_2.bed")
filename <- read.pcadapt(path_to_file, type = "bed")
x <- pcadapt(input=filename, K=20)
This fails with the following error:
Error: Can't compute SVD.
Are there SNPs or individuals with missing values only?
You should use PLINK for proper data quality control.
I wonder if the issue might be the sample number, as mentioned in issue #66. The dataset in question has an n of 14.
That said, I have 3 datasets that ran successfully with pcadapt that have an n of 41, 39, and 24.
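As a quick sanity check on the sample-size question, the number of individuals can be read from the .fam file that plink2 writes alongside the .bed and compared against the K passed to pcadapt. The sketch below is only illustrative: the helper name `check_k` is hypothetical, it assumes the .fam file shares the .bed prefix, and the idea that K needs to be smaller than n for the SVD to succeed is an assumption rather than something confirmed in this thread.

```shell
# Hypothetical helper: compare K against the number of individuals in a PLINK .fam file.
# Usage: check_k <fam_file> <K>
check_k() {
    fam_file=$1
    k=$2
    if [ ! -f "$fam_file" ]; then
        echo "file not found: $fam_file" >&2
        return 2
    fi
    # Each non-empty line of a .fam file describes one individual.
    n=$(grep -c . "$fam_file")
    if [ "$k" -ge "$n" ]; then
        echo "K=$k is not smaller than n=$n"
        return 1
    fi
    echo "K=$k is smaller than n=$n"
}

# Example with the file prefix used in this report (K = 20 as in the pcadapt call):
# check_k 10x_TOA_only_filtered_SNPs_all_2.fam 20
```

If the check reports that K is not smaller than n for the n = 14 dataset but passes for the other three, that would be one concrete difference between them worth ruling out.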
I produced the input .bed file using plink2, as I did for the other 3 datasets.
At first I used the following code:
plink2 --vcf 10x_TOA_only_filtered_SNPs_all.vcf --make-bed --allow-extra-chr --out 10x_TOA_only_filtered_SNPs_all
Then, based on the feedback in issue #66, I added the --mind and --geno filters:
plink2 --vcf 10x_TOA_only_filtered_SNPs_all.vcf --make-bed --allow-extra-chr --mind 0.5 --geno 0.5 --out 10x_TOA_only_filtered_SNPs_all_2
Both resulting .bed files produced the same error in pcadapt.
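Since the error message asks about SNPs or individuals with only missing values, one way to check the filtered file directly is plink2's `--missing` report, which writes per-sample (.smiss) and per-variant (.vmiss) missingness tables. The sketch below is a minimal example under stated assumptions: the helper name `all_missing` and the output prefix `miss_check` are hypothetical, and it assumes the report has a header row containing an `F_MISS` column.

```shell
# Hypothetical helper: print rows of a plink2 .smiss/.vmiss report whose F_MISS is 1,
# i.e. samples or variants with no observed genotypes at all.
# Usage: all_missing <report_file>
all_missing() {
    awk '
        NR == 1 {
            # Find the F_MISS column by name rather than assuming its position.
            for (i = 1; i <= NF; i++) if ($i == "F_MISS") col = i
            next
        }
        col && $col + 0 == 1 { print }
    ' "$1"
}

# Example (assumes plink2 is on PATH and the .bed/.bim/.fam prefix from this report):
# plink2 --bfile 10x_TOA_only_filtered_SNPs_all_2 --allow-extra-chr --missing --out miss_check
# all_missing miss_check.smiss   # samples with only missing genotypes
# all_missing miss_check.vmiss   # SNPs with only missing genotypes
```

Empty output from both calls would suggest that fully missing samples or SNPs are not what triggers the error in this dataset.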
To help you reproduce the error, I'm providing the following files:
Dataset n = 14: .vcf file (used as input to plink2), and .bed file (produced in plink2, used as input to pcadapt)
R script for n = 14 dataset
R session info:
In case it helps, the other 3 datasets that ran successfully in pcadapt are also available for comparison via the following links. The difference between the dataset in question (n = 14) and these (n = 41, 39, and 24, respectively) is that the dataset in question was downsampled so that all samples have the same coverage, in this case 10x. The other 3 datasets were either not downsampled (n = 41), downsampled to 2x (n = 39), or downsampled to 5x (n = 24). The number of samples varies between datasets because coverage among samples ranged from 1x - 27x, so not all samples had high enough coverage to be downsampled to a given coverage level.
Dataset n = 41 (not downsampled, coverage ranges from 1x - 27x): .vcf file, .bed file, R script for pcadapt
Dataset n = 39 (downsampled to 2x coverage): .vcf file, .bed file, R script for pcadapt
Dataset n = 24 (downsampled to 5x coverage): .vcf file, .bed file, R script for pcadapt
I'm happy to provide further information as needed.
Thanks in advance for your feedback.
Best, Jilda
Originally posted by @jcaccavo in https://github.com/bcm-uga/pcadapt/issues/77#issuecomment-1697058504