Closed cyadrogarcia closed 2 years ago
You are saying that they do not get this error, and get a proper result, on the same data? Do you have the same package version as they do?
Any update on this?
Hi there,
I also received this error when trying to run pcadapt on a particular dataset. When running pcadapt using the same R script with 3 other datasets that derived from the same samples, I had no issue. The code producing the error is as follows:
path_to_file <- ("/Users/JMAC/Library/CloudStorage/Dropbox/Research/Humboldt/CCGA_full_sequencing/WG_outlier_analysis/WG_pcadapt/downsampled_10X/10x_TOA_only_filtered_SNPs_all_2.bed")
filename <- read.pcadapt(path_to_file, type = "bed")
x <- pcadapt(input=filename, K=20)
Error: Can't compute SVD.
Are there SNPs or individuals with missing values only?
You should use PLINK for proper data quality control.
I wonder if the issue might be the sample number, as mentioned in issue #66. The dataset in question has an n of 14.
That said, the other 3 datasets that ran successfully with pcadapt have an n of 41, 39 and 24.
I produced the input .bed file using plink2, as I did for the other 3 datasets.
At first I used the following code:
plink2 --vcf 10x_TOA_only_filtered_SNPs_all.vcf --make-bed --allow-extra-chr --out 10x_TOA_only_filtered_SNPs_all
Then, based on the feedback in issue #66, I included the --mind 0.5 and --geno 0.5 parameters:
plink2 --vcf 10x_TOA_only_filtered_SNPs_all.vcf --make-bed --allow-extra-chr --mind 0.5 --geno 0.5 --out 10x_TOA_only_filtered_SNPs_all_2
Both resulting .bed files produced the same error in pcadapt.
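One way to rule out the explanation that the error message suggests (SNPs or individuals with missing values only) is to check missingness directly on a genotype matrix in R. The sketch below is only an illustration on a toy matrix: the object name `geno` and its layout (individuals in rows, SNPs in columns, NA for a missing call) are assumptions, and how the matrix is loaded from the .bed file is left open.

```r
# Toy genotype matrix (assumed layout): rows are individuals, columns are SNPs,
# and NA marks a missing genotype call.
geno <- matrix(c(0,  1,  NA,
                 2,  NA, NA,
                 NA, NA, NA),
               nrow = 3, byrow = TRUE)

# Fraction of missing calls per individual (row) and per SNP (column).
miss_per_individual <- rowMeans(is.na(geno))
miss_per_snp <- colMeans(is.na(geno))

# Indices of individuals and SNPs that have no genotype calls at all.
all_missing_individuals <- which(miss_per_individual == 1)
all_missing_snps <- which(miss_per_snp == 1)

print(all_missing_individuals)
print(all_missing_snps)
```

If both index vectors come back empty on the real data, fully missing rows or columns are unlikely to be the cause, and other explanations are worth checking.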
To see if you could potentially reproduce the error, I'm providing the following files:
Dataset n = 14: .vcf file (used as input to plink2) and .bed file (produced in plink2, used as input to pcadapt)
R script for n = 14 dataset
R session info:
─ Session info ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
setting value
version R version 4.2.2 (2022-10-31)
os macOS Monterey 12.6
system x86_64, darwin17.0
ui RStudio
language (EN)
collate en_US.UTF-8
ctype en_US.UTF-8
tz Europe/Paris
date 2023-08-29
rstudio 2023.03.1+446 Cherry Blossom (desktop)
pandoc NA
─ Packages ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
package * version date (UTC) lib source
ade4 * 1.7-22 2023-02-06 [1] CRAN (R 4.2.0)
adegenet * 2.1.10 2023-01-26 [1] CRAN (R 4.2.2)
ape 5.7-1 2023-03-13 [1] CRAN (R 4.2.0)
cachem 1.0.7 2023-02-24 [1] CRAN (R 4.2.0)
callr 3.7.3 2022-11-02 [1] CRAN (R 4.2.0)
cli 3.6.1 2023-03-23 [1] CRAN (R 4.2.0)
cluster 2.1.4 2022-08-22 [1] CRAN (R 4.2.2)
colorspace 2.1-0 2023-01-23 [1] CRAN (R 4.2.0)
crayon 1.5.2 2022-09-29 [1] CRAN (R 4.2.0)
data.table 1.14.8 2023-02-17 [1] CRAN (R 4.2.0)
devtools 2.4.5 2022-10-11 [1] CRAN (R 4.2.0)
digest 0.6.31 2022-12-11 [1] CRAN (R 4.2.0)
dplyr 1.1.1 2023-03-22 [1] CRAN (R 4.2.0)
ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.2.0)
fansi 1.0.4 2023-01-22 [1] CRAN (R 4.2.0)
fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.2.0)
fs 1.6.1 2023-02-06 [1] CRAN (R 4.2.0)
generics 0.1.3 2022-07-05 [1] CRAN (R 4.2.0)
ggplot2 3.4.2 2023-04-03 [1] CRAN (R 4.2.0)
glue 1.6.2 2022-02-24 [1] CRAN (R 4.2.0)
gtable 0.3.3 2023-03-21 [1] CRAN (R 4.2.0)
hms 1.1.3 2023-03-21 [1] CRAN (R 4.2.0)
htmltools 0.5.5 2023-03-23 [1] CRAN (R 4.2.0)
htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.2.0)
httpuv 1.6.9 2023-02-14 [1] CRAN (R 4.2.0)
igraph 1.4.2 2023-04-07 [1] CRAN (R 4.2.0)
later 1.3.0 2021-08-18 [1] CRAN (R 4.2.0)
lattice 0.21-8 2023-04-05 [1] CRAN (R 4.2.0)
lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.2.0)
magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.2.0)
MASS 7.3-58.3 2023-03-07 [1] CRAN (R 4.2.0)
Matrix 1.5-4 2023-04-04 [1] CRAN (R 4.2.0)
memoise 2.0.1 2021-11-26 [1] CRAN (R 4.2.0)
memuse 4.2-3 2023-01-24 [1] CRAN (R 4.2.2)
mgcv 1.8-42 2023-03-02 [1] CRAN (R 4.2.0)
mime 0.12 2021-09-28 [1] CRAN (R 4.2.0)
miniUI 0.1.1.1 2018-05-18 [1] CRAN (R 4.2.0)
munsell 0.5.0 2018-06-12 [1] CRAN (R 4.2.0)
nlme 3.1-162 2023-01-31 [1] CRAN (R 4.2.0)
OutFLANK * 0.2 2023-01-20 [1] Github (whitlock/OutFLANK@e502e82)
pcadapt * 4.3.3 2020-05-05 [1] CRAN (R 4.2.0)
permute 0.9-7 2022-01-27 [1] CRAN (R 4.2.0)
pillar 1.9.0 2023-03-22 [1] CRAN (R 4.2.0)
pinfsc50 1.2.0 2020-06-03 [1] CRAN (R 4.2.0)
pkgbuild 1.4.0 2022-11-27 [1] CRAN (R 4.2.0)
pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.2.0)
pkgload 1.3.2 2022-11-16 [1] CRAN (R 4.2.0)
plyr 1.8.8 2022-11-11 [1] CRAN (R 4.2.0)
prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.2.0)
processx 3.8.1 2023-04-18 [1] CRAN (R 4.2.2)
profvis 0.3.7 2020-11-02 [1] CRAN (R 4.2.0)
promises 1.2.0.1 2021-02-11 [1] CRAN (R 4.2.0)
ps 1.7.5 2023-04-18 [1] CRAN (R 4.2.2)
purrr 1.0.1 2023-01-10 [1] CRAN (R 4.2.0)
qvalue * 2.30.0 2022-11-01 [1] Bioconductor
R6 2.5.1 2021-08-19 [1] CRAN (R 4.2.0)
Rcpp 1.0.10 2023-01-22 [1] CRAN (R 4.2.0)
readr 2.1.4 2023-02-10 [1] CRAN (R 4.2.0)
remotes 2.4.2 2021-11-30 [1] CRAN (R 4.2.0)
reshape2 1.4.4 2020-04-09 [1] CRAN (R 4.2.0)
rlang 1.1.0 2023-03-14 [1] CRAN (R 4.2.0)
RSpectra 0.16-1 2022-04-24 [1] CRAN (R 4.2.0)
rstudioapi 0.14 2022-08-22 [1] CRAN (R 4.2.0)
scales 1.2.1 2022-08-20 [1] CRAN (R 4.2.0)
seqinr 4.2-30 2023-04-05 [1] CRAN (R 4.2.0)
sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.2.0)
shiny 1.7.4 2022-12-15 [1] CRAN (R 4.2.0)
stringi 1.7.12 2023-01-11 [1] CRAN (R 4.2.0)
stringr 1.5.0 2022-12-02 [1] CRAN (R 4.2.0)
tibble 3.2.1 2023-03-20 [1] CRAN (R 4.2.0)
tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.2.0)
tzdb 0.3.0 2022-03-28 [1] CRAN (R 4.2.0)
urlchecker 1.0.1 2021-11-30 [1] CRAN (R 4.2.0)
usethis 2.1.6 2022-05-25 [1] CRAN (R 4.2.0)
utf8 1.2.3 2023-01-31 [1] CRAN (R 4.2.2)
vcfR * 1.14.0 2023-02-10 [1] CRAN (R 4.2.0)
vctrs 0.6.1 2023-03-22 [1] CRAN (R 4.2.0)
vegan 2.6-4 2022-10-11 [1] CRAN (R 4.2.0)
viridisLite 0.4.1 2022-08-22 [1] CRAN (R 4.2.0)
xtable 1.8-4 2019-04-21 [1] CRAN (R 4.2.0)
[1] /Library/Frameworks/R.framework/Versions/4.2/Resources/library
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
If it might help, for comparison, the other 3 datasets that ran successfully in pcadapt are also available via the following links. The difference between the dataset in question (n = 14) and these (n = 41, 39, and 24 respectively) is that the dataset in question was downsampled so that all samples have the same coverage, in this case, 10x. The other 3 datasets were either not downsampled (n = 41), downsampled to 2x (n = 39), or downsampled to 5x (n = 24). The variation in the number of samples per dataset is because the coverage among samples ranged from 1x - 27x, so not all samples had a high enough coverage to be downsampled to the appropriate coverage level.
Dataset n = 41 (not downsampled, coverage ranges from 1x - 27x): .vcf file, .bed file, R script for pcadapt
Dataset n = 39 (downsampled to 2x coverage): .vcf file, .bed file, R script for pcadapt
Dataset n = 24 (downsampled to 5x coverage): .vcf file, .bed file, R script for pcadapt
I'm happy to provide further information as needed.
Thanks in advance for your feedback.
Best, Jilda
@jcaccavo Thanks for the detailed description of the issue, and providing data to reproduce it.
I'll look into it.
The issue is that you have K > N (K = 20 with only 14 individuals). It is not possible to get more PCs than the number of individuals.
I've pushed a new version of the package that should provide a more helpful error message in that case.
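For anyone hitting the same error, a minimal sketch of a guard that checks K against the sample size before calling pcadapt is shown below. The helper names `count_individuals` and `cap_k` are hypothetical and not part of pcadapt. The sketch assumes a PLINK .fam file with one line per individual sits next to the .bed file, and capping at n - 1 is a conservative assumption; the thread itself only states that K cannot exceed N.

```r
# Hypothetical helper: count individuals from the PLINK .fam file that is
# assumed to sit next to the .bed file (one line per individual).
count_individuals <- function(bed_path) {
  fam_path <- sub("\\.bed$", ".fam", bed_path)
  length(readLines(fam_path))
}

# Hypothetical helper: cap the requested K so that it stays below the number
# of individuals. The n - 1 limit is a conservative assumption.
cap_k <- function(requested_k, n_individuals) {
  max_k <- as.integer(n_individuals) - 1L
  if (requested_k > max_k) {
    warning(sprintf("K = %d is too large for n = %d; using K = %d instead.",
                    as.integer(requested_k), as.integer(n_individuals), max_k))
  }
  min(requested_k, max_k)
}

# Usage with the objects from the original post (requires pcadapt, so it is
# left commented out here):
# k <- cap_k(20, count_individuals(path_to_file))
# x <- pcadapt(input = filename, K = k)
```

With the n = 14 dataset, a requested K of 20 would be reduced to 13 with a warning, while the larger datasets (n = 41, 39 and 24) would keep K = 20.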
Ugh duh! I'm sorry to have bothered you with this. I was focused on the fact that, after running the diagnostic plots, I was only testing values of K from 1 to 11, which make sense for my dataset. But indeed, for the diagnostic scree and score plots, I was using a standard value of K = 20 to see the impact of the K value on the datasets generally; of course, with my dataset with an n of 14, this was causing issues. Thanks for helping me see that!
When trying to run the pcadapt function with the tutorial, I receive the following error: "Error: Can't compute SVD. Are there SNPs or individuals with missing values only? You should use PLINK for proper data quality control."
Apparently it is a problem related to my RStudio installation, since it works on my colleagues' computers. Do you know what might be causing the problem?
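When the same script works on one machine and fails on another, comparing installed versions is a reasonable first step. The sketch below is one possible way to collect them; the helper name `report_versions` is hypothetical, and the choice of packages to compare is an assumption (RSpectra is included only because it appears in the session info posted in this thread).

```r
# Hypothetical helper: report the installed version of each package as a
# string, or NA if the package is not installed on this machine.
report_versions <- function(pkgs) {
  vapply(pkgs, function(p) {
    if (requireNamespace(p, quietly = TRUE)) {
      as.character(utils::packageVersion(p))
    } else {
      NA_character_
    }
  }, character(1))
}

# Packages chosen for illustration; adjust the list as needed.
print(report_versions(c("pcadapt", "RSpectra")))

# The R version itself is also worth comparing between machines.
print(R.version.string)
```

Running this on both the failing and the working computers and comparing the output side by side shows whether the installations actually differ.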