bioinformatics-ptp / detectRUNS

detectRUNS: an R Package for Runs of Homozygosity and Runs of Heterozygosity

Function readPOPCpp() doesn't work with tab-separated input ped file #15

Open filippob opened 4 years ago

filippob commented 4 years ago

This problem is a tab-separated-file issue similar to #10. The function readPOPCpp() reads the genotype file (PLINK ped file) to extract the sample IDs and the group/population each sample belongs to. This non-exported function is used within snpInsideRunsCpp() to prepare data for plots (plot_SnpsInRuns and plot_manhattanRuns). It works with space-separated ped files but not with tab-separated ones: in the latter case the proportion of SNPs in runs is not computed correctly and goes to infinity. Perhaps this can be fixed for the next release of the package?
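
To illustrate the kind of mismatch involved, here is a minimal sketch in base R. It is not the package's C++ code: the two ped records are made up, and the tab-separated layout (tabs between fields, a space between the two alleles of a genotype) is an assumption about how such a file might look.

```r
# Hypothetical ped records with two SNPs:
# family ID, sample ID, sire, dam, sex, phenotype, then genotypes
space_line <- "pop1 ind1 0 0 1 -9 A A G G"
tab_line <- "pop1\tind1\t0\t0\t1\t-9\tA A\tG G"

# Splitting on a single space, as a stand-in for a space-only parser
space_fields <- strsplit(space_line, " ")[[1]]
tab_fields <- strsplit(tab_line, " ")[[1]]

# The space-separated line yields 6 leading columns plus 4 alleles,
# while the tab-separated line is cut only between alleles
print(length(space_fields))
print(length(tab_fields))
print(tab_fields[1])
```

With the tab-separated record, the first "field" still contains the whole block of leading columns, so sample and group IDs cannot be picked out by position.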

bunop commented 4 years ago

The same applies for genotype <- (strsplit(oneLine, " ")) in slidingRUNS.run and for genotype <- (strsplit(oneLine, " ")) in consecutiveRUNS.run: the supported format for .map and .ped files is space-separated values. Did we declare this somewhere in the documentation?
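
One possible way to make such a split accept both delimiters is to split on runs of spaces or tabs. This is only a sketch: split_ped_line is a hypothetical helper, not a function of the package, and the example records are made up.

```r
# Hypothetical helper: split a ped line on any run of spaces or tabs
split_ped_line <- function(one_line) {
  # trimws() drops leading/trailing whitespace so no empty field is produced
  strsplit(trimws(one_line), "[ \t]+")[[1]]
}

space_line <- "pop1 ind1 0 0 1 -9 A A G G"
tab_line <- "pop1\tind1\t0\t0\t1\t-9\tA A\tG G"

# Both layouts now give the same vector of fields
print(identical(split_ped_line(space_line), split_ped_line(tab_line)))
print(split_ped_line(tab_line))
```

A regular-expression split is slower than a fixed single-character split, which may matter for large ped files, so this trades some speed for tolerance of mixed delimiters.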

filippob commented 4 years ago

We did not specify this in the documentation. Therefore we should either modify these functions to also accept tab-separated ped files, or make the space-separated requirement clear in the manual and vignette.
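
If the space-only format is kept, a small guard could at least fail with a clear message instead of producing wrong results. The sketch below is an assumption about how this might be done: check_space_delimited is a hypothetical name, and it only inspects the first line of the file.

```r
# Hypothetical guard: stop early if the first line of a file contains a tab
check_space_delimited <- function(path) {
  first_line <- readLines(path, n = 1)
  if (length(first_line) > 0 && grepl("\t", first_line, fixed = TRUE)) {
    stop("File '", path, "' contains tabs; space-separated values are expected.")
  }
  invisible(TRUE)
}

# Usage with two temporary files holding made-up ped records
space_file <- tempfile(fileext = ".ped")
tab_file <- tempfile(fileext = ".ped")
writeLines("pop1 ind1 0 0 1 -9 A A G G", space_file)
writeLines("pop1\tind1\t0\t0\t1\t-9\tA A\tG G", tab_file)

check_space_delimited(space_file)  # passes silently
tab_result <- tryCatch(check_space_delimited(tab_file),
                       error = function(e) conditionMessage(e))
print(tab_result)
```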

bunop commented 4 years ago

So the problem is dealing with tab-separated files. We already discussed this in #10; please refer to #10 for any tab-related issue.