BEDMatrix

BEDMatrix is an R package that provides a matrix-like wrapper around .bed files, one of the genotype/phenotype file formats of PLINK, the whole-genome association analysis toolset. A BEDMatrix object is created in R by providing the path to a .bed file. Once created, it behaves much like a regular matrix, except that genotypes are retrieved on demand rather than by loading the entire file into memory. This makes it possible to work with very large files while using little memory.

This package is deliberately kept simple. For computational methods that build on BEDMatrix, see the BGData package.
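
To make the on-demand retrieval more concrete, here is a rough base R sketch of how a single byte of a PLINK 1 .bed file can be decoded into genotypes. The helper decode_bed_bytes is hypothetical and is not part of BEDMatrix, whose actual implementation may differ. The sketch assumes the variant-major layout described in the PLINK documentation (four samples per byte, lowest two bits first) and assumes that genotypes are reported as counts of the first allele.

```r
# Hypothetical helper, not part of BEDMatrix: decode the bytes that hold
# one variant's genotypes for n samples.
decode_bed_bytes <- function(bytes, n) {
  b <- as.integer(bytes)
  # Each byte packs four 2-bit codes; the lowest two bits belong to the
  # first sample in the byte.
  codes <- as.vector(rbind(
    b %% 4L,
    (b %/% 4L) %% 4L,
    (b %/% 16L) %% 4L,
    (b %/% 64L) %% 4L
  ))[seq_len(n)]
  # Assumed mapping from 2-bit code to first-allele count:
  # 0 -> 2 copies, 1 -> missing, 2 -> 1 copy, 3 -> 0 copies
  c(2L, NA_integer_, 1L, 0L)[codes + 1L]
}

decode_bed_bytes(as.raw(0xdc), n = 4)
#> [1]  2  0 NA  0
```

Because every variant occupies a fixed number of bytes, a reader can seek directly to the requested variants and samples instead of scanning the whole file.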

Example

This example uses a dummy .bed file that is bundled with this R package. It was generated using plink --dummy 50 1000 0.02 acgt --seed 4711 --out example with PLINK 1.90 beta 3.452.

To get the path to the example .bed file (system.file finds the full file names of files in packages and is only used to find the example data):

path <- system.file("extdata", "example.bed", package = "BEDMatrix")

To wrap the example .bed file in a BEDMatrix object:

m <- BEDMatrix(path)
#> Extracting number of samples and rownames from example.fam...
#> Extracting number of variants and colnames from example.bim...
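
The messages above show that the dimensions and names are read from the accompanying .fam and .bim files. As a sketch, and assuming the n and p arguments of BEDMatrix() behave as described in the package documentation, the number of samples and variants can also be passed directly, in which case those files should not need to be read and no row or column names are set:

```r
library(BEDMatrix)

path <- system.file("extdata", "example.bed", package = "BEDMatrix")

# n = number of samples, p = number of variants; the values match the
# bundled example file (50 samples, 1000 variants).
m2 <- BEDMatrix(path, n = 50, p = 1000)

dim(m2)
dimnames(m2)  # expected to contain no names, since .fam/.bim are skipped
```

This can be useful when the .fam and .bim files are unavailable, but the dimensions must then match the .bed file exactly.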

To get the dimensions of the BEDMatrix object:

dim(m)
#> [1] 50 1000

To extract a subset of the BEDMatrix object:

m[1:3, 1:5]
#>           snp0_A snp1_C snp2_G snp3_G snp4_G
#> per0_per0      0      1      1      1      0
#> per1_per1      1      1      1      1     NA
#> per2_per2      1      0      0      2      0
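
Because a subset is returned as a regular matrix, large files can be processed a block of variants at a time. The following is a minimal sketch that estimates the frequency of the counted allele for every variant in blocks of columns. The block size of 250 is an arbitrary choice for illustration, and the sketch assumes genotypes are coded as 0, 1, 2, or NA, as in the output above.

```r
library(BEDMatrix)

path <- system.file("extdata", "example.bed", package = "BEDMatrix")
m <- BEDMatrix(path)

chunk_size <- 250  # arbitrary block size chosen for illustration
starts <- seq(1, ncol(m), by = chunk_size)

freqs <- unlist(lapply(starts, function(start) {
  cols <- start:min(start + chunk_size - 1, ncol(m))
  # Only the requested columns are read from disk here.
  block <- m[, cols]
  # Mean allele count divided by 2 gives the allele frequency;
  # missing genotypes are ignored.
  colMeans(block, na.rm = TRUE) / 2
}))

head(freqs)
```

Only one block is held in memory at a time, so the same pattern should carry over to files that are too large to load at once.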

Installation

Install the stable version from CRAN:

install.packages("BEDMatrix")

Alternatively, install the development version from GitHub:

# install.packages("remotes")
remotes::install_github("QuantGen/BEDMatrix")

Documentation

Further documentation can be found on RDocumentation.

Contributing