sirselim / bootNet

bootNet is a wrapper for the fantastic glmnet function - it brings bootstrapping and parallel processing to the elastic-net framework.

bootNet ('strapping the elastic-net)

bootNet is a wrapper for the fantastic glmnet R package - it brings bootstrapping and parallel processing to the elastic-net framework.

This script was originally designed to analyse methylation data in the form of beta matrices. The beta matrix must have CpG sites as rows and samples as columns for bootNet to work.
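
The orientation requirement above is easy to get wrong, so here is a minimal sketch of a quick check in base R. The object names (`beta`, `age`), the CpG and sample labels, and the values are all made up for illustration; they are not part of bootNet.

```r
# Toy beta matrix: 5 hypothetical CpG sites (rows) x 4 hypothetical samples (columns).
set.seed(1)
beta <- matrix(runif(20), nrow = 5, ncol = 4,
               dimnames = list(paste0("cg", 1:5), paste0("sample", 1:4)))

# Hypothetical phenotype vector, one value per sample (i.e. per column).
age <- c(34, 51, 47, 62)

# Orientation check: the number of columns must match the number of samples.
stopifnot(ncol(beta) == length(age))

# If a matrix arrives as samples x CpG sites instead, transpose it first.
beta_wrong <- t(beta)        # 4 rows (samples) x 5 columns (CpG sites)
beta_fixed <- t(beta_wrong)  # back to CpG sites x samples
stopifnot(identical(beta_fixed, beta))
```

Running a check like this before a long bootstrap run is cheap and avoids discovering a transposed matrix only after the job fails.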

With a recent update, bootNet is now data agnostic and has been tested with SNP and expression data (as well as methylation).

Updates

2020-07-30

Added a directory (here) that includes code and documentation contributed by Sean Burnard, used in a recent manuscript exploring genetic associations in multiple sclerosis. The DOI and citation will be added shortly.

Version: 0.1.2.0

Version: 0.1.1.1

WARNING: be aware of the amount of available system RAM when using bootNet.parallel(). If the data set is large, even running across 4-8 cores will quickly utilise many GB of RAM - you have been warned! Some real-world usage metrics across different Linux systems are provided below.
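
To get a feel for the scale before launching a parallel run, a rough back-of-the-envelope estimate can help. The sketch below computes the size of a dense numeric matrix (8 bytes per value) using the dimensions of Experiment 1 from this README. It assumes each worker holds its own full copy of the data, which is an assumption about how the parallel run behaves, not something documented here. Real usage will be higher, since model fitting creates additional intermediate objects.

```r
# Rough in-memory size of a dense numeric (double) matrix: rows x cols x 8 bytes.
matrix_bytes <- function(n_rows, n_cols) n_rows * n_cols * 8

# Dimensions taken from Experiment 1 in this README (446280 rows, 24 columns).
one_copy <- matrix_bytes(446280, 24)

# ASSUMPTION: every worker keeps its own full copy of the data.
cores <- 4
data_only_gib <- one_copy * cores / 1024^3

cat(sprintf("One copy: %.1f MiB; %d copies: %.2f GiB (data only)\n",
            one_copy / 1024^2, cores, data_only_gib))
```

Treat the result as a lower bound on the data footprint only, and compare it against free RAM before picking the number of cores.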

Example usage:

What do you need to provide the bootNet script?

bootNet()

bootNet(data = x, outcome = y, Alpha = 0.1, iter = 1000, sub_sample = 0.666, sampleID = sampleID, method = method)
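
One plausible reading of `iter = 1000` and `sub_sample = 0.666` is that each iteration works on a random two-thirds of the samples. The base R sketch below illustrates that general idea only. It is not bootNet's implementation, and whether bootNet draws with or without replacement is not stated here; the sketch assumes sampling without replacement, and all variable names other than `iter` and `sub_sample` are hypothetical.

```r
set.seed(42)

n_samples  <- 24     # hypothetical: number of columns (samples) in the data
sub_sample <- 0.666  # fraction of samples used per iteration
iter       <- 1000   # number of resampling iterations

# Number of samples drawn in each iteration.
n_draw <- floor(n_samples * sub_sample)

# ASSUMPTION: draw column indices without replacement in every iteration.
draws <- replicate(iter, sample(n_samples, n_draw), simplify = FALSE)

# Fraction of iterations in which each sample was included.
inclusion <- tabulate(unlist(draws), nbins = n_samples) / iter
summary(inclusion)
```

Each element of `draws` is a vector of column indices that could be used to subset the data matrix and the outcome vector before fitting a model on that iteration.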

bootNet.parallel()

bootNet.parallel(data = x, outcome = y, Alpha = 0.1, iter = 1000, sub_sample = 0.666, cores = 4, sampleID = sampleID)
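
When picking a value for `cores`, it can help to check what the machine actually has. The sketch below uses `detectCores()` from the `parallel` package that ships with R; the cap of 4 and the choice to leave one core free are illustrative assumptions, not bootNet recommendations.

```r
library(parallel)

# Number of cores reported by the system; can be NA on some platforms.
available <- detectCores()
if (is.na(available)) available <- 1L

# ASSUMPTION: leave one core free and never exceed a user-chosen cap of 4.
max_cores <- 4L
cores <- max(1L, min(max_cores, available - 1L))

cat("Using", cores, "of", available, "available cores\n")
```

The resulting `cores` value can then be passed as the `cores` argument in the call above, keeping the RAM warning in mind when raising the cap.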

To do list

Performance expectations

A few test examples showing systems and performance metrics.

Experiment 1: 24 samples run on Illumina 450K methylation array (24 columns, 446280 rows). Phenotype was quantitative (age).

System used (Dell laptop: Precision M4800):

Experiment 2: 75 samples run on Illumina 450K methylation array (75 columns, 445998 rows). Phenotype was quantitative (glucose).

System used (Dell workstation):

Experiment 3: 105 samples run on Illumina 450K methylation array (105 columns, 380777 rows). Phenotype was qualitative (BMI separated into two categories).

System used (Z170-D3H):

Dependencies

There are 3 R packages required by bootNet: