bcm-uga / pcadapt

Performing highly efficient genome scans for local adaptation with R package pcadapt v4
https://bcm-uga.github.io/pcadapt

as.matrix(read.table) still not working, even with other data types #25

Closed Ella-Bowles closed 5 years ago

Ella-Bowles commented 6 years ago

Hi Michael et al,

I'm wondering if my issues with PCAdapt have to do with updates to R or RStudio. I've just been running PCAdapt again with a different dataset, this time with a vcf file that I know works, because I ran everything successfully before I ran a bunch of updates. However, I am now getting the same issues as I was with the files generated using radiator, which I wrote issue #23 about. I'll go through my code and the problematic steps here.

path_to_file <- "D:/Documents/PhDEcologyEvolution/Data/GBS_2013_run/stacks_run_08/poplns8c10_NoOdds-mafFiltOff-USE-jobid5809653/stickl8c10NoOdds_includingOutlrs/stickl8c10NoOdds.vcf"

stickl8c10NoOdds <- read.pcadapt(path_to_file,type="vcf")

No variant got discarded. Summary:

- input file:               D:/Documents/PhDEcologyEvolution/Data/GBS_2013_run/stacks_run_08/poplns8c10_NoOdds-mafFiltOff-USE-jobid5809653/stickl8c10NoOdds_includingOutlrs/stickl8c10NoOdds.vcf
- output file:              C:\Users\Admin\AppData\Local\Temp\RtmpyKk4pz\file50848f316ca.pcadapt

- number of individuals detected:   180
- number of loci detected:      4038

4038 lines detected. 180 columns detected.

Things first started going wrong when PCAdapt started saving the output PCAdapt file to temp directories (as shown above) on my C drive. It used to save them to my working directory, and they were named using the same conventions as my vcf files, not with a strange temp name.

Then this next part works fine:

x <- pcadapt(stickl8c10NoOdds,K=20)

plot(x,option="screeplot")

And I run the following without complaint:

data <- as.matrix(read.table("file50848f316ca.pcadapt"))

However, when I then run the following, I get an error:

x<-pcadapt(data,K=3)

Error in UseMethod("pcadapt") : no applicable method for 'pcadapt' applied to an object of class "c('matrix', 'integer', 'numeric')"

I can still find outliers if I use the "stickl8c10NoOdds" object, but I need to be able to get the matrix because I want to look at the structure of the populations both with and without outliers.

I have uninstalled PCAdapt and re-installed it, and also uninstalled R v3.5.1 and tried installations of 3.4.3 and 3.4.4, but none of these things seem to be working. I'm at a loss as to why things stopped working after I updated and then were not resolved when I downgraded.

Relevant data files attached. stickl8c10NoOdds.zip

Any help would be much appreciated.

Ella

privefl commented 6 years ago

There is no issue here.

Software pcadapt has long been consistent in the sense that you need to use read.pcadapt() and then use pcadapt() for any type of data. When you call read.pcadapt() on a matrix, it doesn't do much, but you still have to use it before using pcadapt().
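The rule above can be illustrated with a minimal sketch. It assumes pcadapt 4.x is installed and uses a small simulated genotype matrix (SNPs in rows, individuals in columns) rather than the data from this issue. The names raw, geno and res and the simulation parameters are hypothetical and chosen only for illustration.

```r
library(pcadapt)

# Simulated genotypes coded 0/1/2: 200 SNPs (rows) x 30 individuals (columns).
# Assumption: purely random data, used only to show the call sequence.
set.seed(1)
n_snps <- 200
n_ind <- 30
raw <- matrix(rbinom(n_snps * n_ind, size = 2, prob = 0.3),
              nrow = n_snps, ncol = n_ind)

# Passing the bare matrix reproduces the "no applicable method" error,
# so the call is wrapped in try() to keep the script running.
direct <- try(pcadapt(raw, K = 2), silent = TRUE)

# Going through read.pcadapt() first gives pcadapt() an object it can dispatch on.
geno <- read.pcadapt(raw, type = "pcadapt")
res <- pcadapt(geno, K = 2)
```

With real data, raw would be the matrix obtained from as.matrix(read.table(...)), and type = "lfmm" would be the choice if individuals were in rows instead.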

Regarding the first comment about the ".pcadapt" temp file, this is also expected. Starting with pcadapt 4.0, the preferred format is ".bed", which is more compact and faster to read than text files, and is also a standard format (because of PLINK). So every ".pcadapt" file is converted to a ".bed" file; basically, you get input_vcf -> temp_pcadapt -> output_bed. But, as stated in the documentation, you should prefer using PLINK to convert your vcf file to bed (and to do quality control in the meantime).
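As a sketch of the bed-based route, the following reads a ".bed" file directly, which skips the temporary ".pcadapt" step. It assumes pcadapt 4.x and uses the example file geno3pops.bed that ships with the package. For your own data, bed_path would instead point to a file produced by PLINK, and that path is hypothetical here.

```r
library(pcadapt)

# Example bed file bundled with pcadapt (used here instead of the issue's data).
# For a VCF, a conversion outside R with PLINK 1.9 would look like:
#   plink --vcf input.vcf --make-bed --out output_prefix
bed_path <- system.file("extdata", "geno3pops.bed", package = "pcadapt")

# Read the bed file, then run the PCA-based scan as usual.
geno <- read.pcadapt(bed_path, type = "bed")
x <- pcadapt(geno, K = 20)

# Scree plot to help choose the number of principal components to keep.
plot(x, option = "screeplot")
```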

mblumuga commented 6 years ago

To complete @privefl's answer, you should use:

data <- as.matrix(read.table("file50848f316ca.pcadapt"))

toto <- read.pcadapt(data,type="pcadapt")

and then pass toto, not data, to pcadapt().