bcm-uga / pcadapt

Performing highly efficient genome scans for local adaptation with R package pcadapt v4
https://bcm-uga.github.io/pcadapt

Problem loading haploid data? #37

Closed remco-stam closed 5 years ago

remco-stam commented 5 years ago

I seem to have a problem with loading data from vcf.

I have a sample set with SNP calls from a fungal organism. Each individual is haploid and sequenced individually. The loading doesn't produce an error, but also doesn't seem to generate a correct data object, because pcadapt throws an error.

As you can see in the attached small toy VCF, there are no individuals or SNPs with missing values. The attached file has 20 samples, and AC is always 1 or more (in a few cases it is 20), so there are definitely no samples without SNPs.

Is this related to the data being haploid, or am I overlooking something else here?

Thanks for your help,

Remco

CombinedSamples.np.filt.part.vcf.gz

> data <- read.pcadapt("CombinedSamples.np.filt.part.vcf", type = "vcf")
No variant got discarded.
Summary:

43 lines detected.
20 columns detected.

> x <- pcadapt(input = data, K = 2, ploidy = 1, pca.only = TRUE)
Error: Can't compute SVD. Are there SNPs or individuals with missing values only?
You should use PLINK for proper data quality control.
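The error message asks whether any SNP or individual has only missing values. A minimal base R sketch for checking that directly in the VCF follows. The function name `vcf_genotype_summary` is hypothetical, and the sketch assumes a tab-delimited VCF in which GT is the first FORMAT subfield. It also flags sites where every called sample has the same genotype, since AC = 20 across 20 haploid samples means no variation at that site. Whether such sites contribute to the SVD error is not established in this thread.

```r
# Sketch: summarise missing and invariant genotypes in a VCF (hypothetical helper).
vcf_genotype_summary <- function(path) {
  lines  <- readLines(path)
  header <- lines[startsWith(lines, "#CHROM")]
  body   <- lines[nzchar(lines) & !startsWith(lines, "#")]

  # Sample names are the columns after the nine fixed VCF columns.
  samples <- strsplit(header, "\t", fixed = TRUE)[[1]][-(1:9)]

  # One row per SNP, one column per sample; keep only the GT subfield.
  gt <- do.call(rbind, lapply(strsplit(body, "\t", fixed = TRUE), function(f) {
    sub(":.*$", "", f[-(1:9)])
  }))
  colnames(gt) <- samples

  # A genotype containing "." is treated as missing (".", "./.", ".|.").
  missing <- matrix(grepl(".", gt, fixed = TRUE),
                    nrow = nrow(gt), dimnames = dimnames(gt))

  list(
    missing_per_snp    = rowSums(missing),
    missing_per_sample = colSums(missing),
    # TRUE when all non-missing genotype strings at a site are identical.
    invariant = vapply(seq_len(nrow(gt)), function(i) {
      length(unique(gt[i, !missing[i, ]])) <= 1
    }, logical(1))
  )
}

# Usage with the file name from the report, if it is present locally.
if (file.exists("CombinedSamples.np.filt.part.vcf")) {
  str(vcf_genotype_summary("CombinedSamples.np.filt.part.vcf"))
}
```

A SNP whose `missing_per_snp` equals the number of samples, or a sample whose `missing_per_sample` equals the number of SNPs, would match the condition named in the error message.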

privefl commented 5 years ago

As stated in the documentation, we do not maintain the VCF format anymore. Please use PLINK for conversion (to bed/bim/fam) and also for quality control.
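A rough sketch of that workflow in R, assuming PLINK 1.9 is installed and available on the PATH as `plink`. The helper names `plink_args` and `convert_and_load` and the output prefix `CombinedSamples` are hypothetical. The `--allow-extra-chr` flag is included on the assumption that a fungal assembly uses contig names PLINK does not recognise by default. How PLINK encodes haploid calls, and whether `ploidy = 1` is still the appropriate setting after conversion, is not confirmed in this thread and should be checked against the pcadapt documentation.

```r
# Arguments for a PLINK 1.9 conversion from VCF to bed/bim/fam.
plink_args <- function(vcf, prefix) {
  c("--vcf", vcf,
    "--allow-extra-chr",  # accept contig names outside PLINK's default chromosome set
    "--make-bed",         # write <prefix>.bed, <prefix>.bim and <prefix>.fam
    "--out", prefix)
}

# Run the conversion, then load the resulting bed file with pcadapt.
convert_and_load <- function(vcf, prefix) {
  status <- system2("plink", plink_args(vcf, prefix))
  if (status != 0) stop("PLINK conversion failed")
  pcadapt::read.pcadapt(paste0(prefix, ".bed"), type = "bed")
}

vcf <- "CombinedSamples.np.filt.part.vcf"  # file name from the report above

# Only attempt the conversion when PLINK, pcadapt and the input file are available.
if (nzchar(Sys.which("plink")) &&
    requireNamespace("pcadapt", quietly = TRUE) &&
    file.exists(vcf)) {
  data <- convert_and_load(vcf, "CombinedSamples")
  x <- pcadapt::pcadapt(input = data, K = 2, ploidy = 1, pca.only = TRUE)
}
```

PLINK's own filters can be added to the argument vector for quality control before the data reach pcadapt.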