clbustos / distribution

Statistical Distributions multi library wrapper. Uses Ruby by default and C (statistics2/GSL) or Java extensions where available.

Distribution

Distribution is a gem with several probabilistic distributions. Pure Ruby is used by default; C (GSL) or Java extensions are used if available. Some facts:

The following table lists the available distributions and the methods available for each one. If a field is marked with an x, that distribution doesn't have that method implemented.

Distribution PDF CDF Quantile RNG Mean Mode Variance Skewness Kurtosis Entropy
Uniform x x x x x x x x x x
Normal x x x x x x x x x x
Lognormal x x x x x x x x
Bivariate Normal x x x x x x x x
Exponential x x x x x x x x
Logistic x x x x x x x x
t-Student x x x x x x x x
Chi Square x x x x x x x x
Fisher-Snedecor x x x x x x x x
Beta x x x x x x x x
Gamma x x x x x x x x
Weibull x x x x x x x x
Binomial x x x x x x x x
Poisson x x x x x x x x
Hypergeometric x x x x x x x x

Installation

$ gem install distribution

You can install GSL for better performance:

After successfully installing the library:

$ gem install rb-gsl
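
Falling back to pure Ruby when an optional native library is missing is a common Ruby pattern. The sketch below shows one way to do it; `pick_engine` and the `:native`/`:ruby` symbols are hypothetical names used for illustration, and this is not the gem's actual loading code.

```ruby
# Illustrative only: pick_engine is a hypothetical helper, not the gem's loader.
# It tries to load an optional library and reports which engine would be used.
def pick_engine(optional_lib = "gsl")
  require optional_lib
  :native            # the optional library loaded, so a native engine is usable
rescue LoadError
  :ruby              # the library is not installed, so stay with pure Ruby
end

engine = pick_engine
puts "Using the #{engine} engine"
```

Rescuing only `LoadError` keeps a missing library from being fatal while still letting other errors surface.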

Examples

You can find automatically generated documentation on RubyDoc.

# Returns Gaussian PDF for x.
pdf = Distribution::Normal.pdf(x)

# Returns Gaussian CDF for x.
cdf = Distribution::Normal.cdf(x)

# Returns the inverse CDF (quantile, called p_value in this gem) for a cumulative probability x.
pv = Distribution::Normal.p_value(x)

# API.

# You would normally use the following
p = Distribution::T.cdf(x)

# to get the cumulative probability of `x`. However, you can also:

include Distribution::Shorthand
tdist_cdf(x)
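
To make the relationship between `pdf`, `cdf` and `p_value` concrete, here is a minimal hand-rolled sketch for the standard normal. `NormalSketch` is a hypothetical module written for this example. It is not how the gem computes these values; it only shows that `p_value` undoes `cdf`.

```ruby
# Illustrative only: a hand-rolled standard normal, not the gem's implementation.
module NormalSketch
  SQRT2   = Math.sqrt(2.0)
  SQRT2PI = Math.sqrt(2.0 * Math::PI)

  module_function

  # Density of the standard normal at x.
  def pdf(x)
    Math.exp(-0.5 * x * x) / SQRT2PI
  end

  # Cumulative probability P(X <= x), via the error function.
  def cdf(x)
    0.5 * (1.0 + Math.erf(x / SQRT2))
  end

  # Inverse CDF by bisection: finds x such that cdf(x) is close to prob.
  def p_value(prob, lo = -10.0, hi = 10.0, tol = 1e-10)
    raise ArgumentError, "prob must be in (0, 1)" unless prob > 0.0 && prob < 1.0

    while hi - lo > tol
      mid = (lo + hi) / 2.0
      if cdf(mid) < prob
        lo = mid
      else
        hi = mid
      end
    end
    (lo + hi) / 2.0
  end
end

x    = 1.96
prob = NormalSketch.cdf(x)         # about 0.975
back = NormalSketch.p_value(prob)  # about 1.96, recovering x
```

Bisection is used here only because it is short and obviously correct; a production implementation would normally use a dedicated approximation for the normal quantile.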

API Structure

Distribution::<name>.(cdf|pdf|p_value|rng)

For discrete distributions, exact Ruby implementations of pdf, cdf and p_value may also be provided, using

  Distribution::<name>.exact_(cdf|pdf|p_value)
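
As an illustration of what "exact" can mean for a discrete distribution, the sketch below computes binomial probabilities with Integer and Rational arithmetic, so no floating-point rounding occurs. `ExactBinomialSketch` and its `(k, n, pr)` argument order are assumptions made for this example, not the gem's code or confirmed signatures.

```ruby
# Illustrative only: exact binomial probabilities with Integer/Rational arithmetic.
# ExactBinomialSketch is a hypothetical name, not part of the gem.
module ExactBinomialSketch
  module_function

  # Binomial coefficient C(n, k) using exact integer arithmetic.
  def choose(n, k)
    return 0 if k < 0 || k > n

    k = [k, n - k].min
    # After step i the accumulator equals C(n - k + i, i), so the division is exact.
    (1..k).reduce(1) { |acc, i| acc * (n - k + i) / i }
  end

  # P(X = k) for X ~ Binomial(n, pr), returned as a Rational.
  def exact_pdf(k, n, pr)
    pr = pr.to_r
    choose(n, k) * pr**k * (1 - pr)**(n - k)
  end

  # P(X <= k), summing the exact point probabilities.
  def exact_cdf(k, n, pr)
    (0..k).sum { |i| exact_pdf(i, n, pr) }
  end
end

ExactBinomialSketch.exact_pdf(2, 4, Rational(1, 2))  # => (3/8)
ExactBinomialSketch.exact_cdf(2, 4, Rational(1, 2))  # => (11/16)
```

Passing the probability as a `Rational` keeps every intermediate value exact; a Float such as `0.5` is converted with `to_r` and so carries whatever binary rounding it already had.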

The module Distribution::Shorthand provides (you guessed it) shorthand methods for calling all of these:

  <Distribution shortname>_(cdf|pdf|p|r)

For discrete distributions, the shorthands for the exact cdf, pdf and p_value are

  <Distribution shortname>_(ecdf|epdf|ep)
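
The naming scheme above can be produced mechanically. The following sketch generates `<shortname>_(cdf|pdf|p|r)` methods with `define_method` for a toy uniform distribution. `ToyDist`, `Unif`, `ShorthandSketch` and the `unif` shortname are hypothetical stand-ins, and the gem's real Shorthand module may be built differently.

```ruby
# Illustrative only: ToyDist and ShorthandSketch are hypothetical stand-ins.
module ToyDist
  # A toy "distribution": the standard uniform on [0, 1].
  module Unif
    module_function

    def pdf(x)
      x >= 0 && x <= 1 ? 1.0 : 0.0
    end

    def cdf(x)
      [[x, 0.0].max, 1.0].min
    end

    # Inverse CDF: for the standard uniform it is the identity.
    def p_value(prob)
      prob.to_f
    end

    # Toy random draw; returns a Float in [0, 1).
    def rng
      rand
    end
  end

  module ShorthandSketch
    # shortname => module implementing the distribution
    SHORTNAMES = { "unif" => ToyDist::Unif }.freeze
    # shorthand suffix => full method name
    SUFFIXES = { "cdf" => :cdf, "pdf" => :pdf, "p" => :p_value, "r" => :rng }.freeze

    SHORTNAMES.each do |short, mod|
      SUFFIXES.each do |suffix, meth|
        next unless mod.respond_to?(meth)

        # Defines e.g. unif_cdf, which forwards to ToyDist::Unif.cdf.
        define_method("#{short}_#{suffix}") { |*args| mod.public_send(meth, *args) }
      end
    end
  end
end

# Extending a single object avoids adding the shorthands to every object.
calc = Object.new.extend(ToyDist::ShorthandSketch)
calc.unif_cdf(0.25)  # => 0.25
calc.unif_p(0.25)    # => 0.25
```

The README's own example uses `include Distribution::Shorthand` at the top level instead, which makes the shorthands callable without a receiver.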

Shortnames for distributions:

Roadmap

This gem wasn't updated for a long time before I started working on it, so there is a lot of work to do. The first priority is cleaning the interface and removing cruft whenever possible. After that, I want to implement more distributions and make sure that each one has an RNG.

Short-term

Medium-term

Long-term

Issues

For current issues see the issue tracker pages.

OMG! I want to help!

Everyone is welcome to help! Please test these distributions with your own use cases and give a shout on the issue tracker if you find a problem or if something is strange or hard to use. Documentation pull requests are totally welcome. More generally, any ideas or suggestions are welcome, even by private e-mail.

If you want to provide a new distribution, run lib/distribution:

$ distribution --new your_distribution

This should create the main distribution file, the directory with the Ruby and GSL engines, and the specs in the spec/ directory.