Closed: translunar closed this issue 9 years ago
I went through the remaining distributions available in GSL that are not yet implemented in the distribution gem and counted their arxiv.org mentions, including synonymous names where possible. The table is ranked by number of mentions. Some distributions are listed more than once because including a synonymous name yielded many more results, particularly when the synonym is a generic term (e.g., 'uniform distribution').
arXiv Mentions | Name | Other Names | Notes |
---|---|---|---|
590 | Flat (Uniform) Distribution | | Including 'Uniform Distribution' |
370 | Lognormal Distribution | Galton Distribution | |
174 | Laplace Distribution | Gumbel Distribution, Double Exponential Distribution | |
125 | Levy alpha-Stable Distributions | Stable Distribution | Including 'Stable Distribution' |
85 | Levy skew alpha-Stable Distribution | Levy Distribution, Van der Waals profile | Including 'Levy Distribution' |
74 | Geometric Distribution | ||
66 | Pareto Distribution | Bradford Distribution | |
64 | Negative Binomial Distribution | ||
63 | Flat (Uniform) Distribution | | Not Including 'Uniform Distribution' |
61 | Weibull Distribution | ||
47 | Dirichlet Distribution | ||
39 | Cauchy Distribution | Lorentz Distribution, Breit–Wigner Distribution | |
31 | Bernoulli Distribution | ||
31 | Multinomial Distribution | ||
26 | Rayleigh Distribution | ||
18 | Logarithmic Distribution | logarithmic series distribution, log-series distribution | |
7 | Exponential Power Distribution | Generalized Gaussian Distribution, Generalized Normal Distribution | |
6 | Landau Distribution | ||
4 | Logistic Distribution | ||
2 | Gaussian Tail Distribution | ||
2 | General Discrete Distributions | ||
2 | Pascal Distribution | ||
2 | Levy alpha-Stable Distributions | Stable Distribution | Not Including 'Stable Distribution' |
1 | Levy skew alpha-Stable Distribution | Levy Distribution, Van der Waals profile | Not Including 'Levy Distribution' |
0 | Rayleigh Tail Distribution | ||
0 | Spherical Vector Distributions | ||
0 | Type-1 Gumbel Distribution | ||
0 | Type-2 Gumbel Distribution | ||
Is there a need for anything more than the density function, distribution function, characteristic function, some parameters for every distribution, and maybe a plot? If not, it shouldn't be too hard to implement everything from scratch.
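For a distribution whose functions all have closed forms, a from-scratch version can indeed be short. Here is a minimal pure-Ruby sketch for the Laplace distribution, written from the textbook formulas. The module name `LaplaceSketch` and its method names are hypothetical and are not the distribution gem's API.

```ruby
# Hypothetical module for illustration only; not part of the distribution gem.
module LaplaceSketch
  module_function

  # Density: f(x) = exp(-|x - mu| / b) / (2b), with scale b > 0.
  def pdf(x, mu = 0.0, b = 1.0)
    z = (x - mu).abs / b
    Math.exp(-z) / (2.0 * b)
  end

  # Distribution function, in closed form on each side of mu.
  def cdf(x, mu = 0.0, b = 1.0)
    z = (x - mu) / b
    z < 0 ? 0.5 * Math.exp(z) : 1.0 - 0.5 * Math.exp(-z)
  end

  # Quantile (inverse of cdf) for 0 < p < 1.
  def quantile(p, mu = 0.0, b = 1.0)
    raise ArgumentError, "p must be in (0, 1)" unless p > 0.0 && p < 1.0
    p < 0.5 ? mu + b * Math.log(2.0 * p) : mu - b * Math.log(2.0 * (1.0 - p))
  end

  # Characteristic function: exp(i * mu * t) / (1 + b^2 * t^2).
  def characteristic(t, mu = 0.0, b = 1.0)
    Complex.polar(1.0, mu * t) / (1.0 + (b * t)**2)
  end
end
```

Calling `LaplaceSketch.quantile(LaplaceSketch.cdf(x))` should return `x` up to rounding error, which makes a convenient sanity check for any distribution written this way.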
The Stan project (http://mc-stan.org/) has a BSD-3-clause license and a number of built-in probability distributions that are on the list, with expressions given explicitly in the documentation: http://stan.googlecode.com/files/stan-reference-1.0.2.pdf.
I guess the license-compatibility argument holds only for those distributions with no explicit density function, for which some particular algorithm is needed. Otherwise, it seems hard to believe that GSL's GPL also covers the mathematical expression describing the distribution.
Ahh, thanks. This is helpful.
Unfortunately, licenses cover the approximations used for various functions. You nearly always need to use an approximation.
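To illustrate where approximations come in, here is a hedged sketch for the lognormal distribution. The CDF relies on Ruby's built-in `Math.erf`, and the quantile, which has no elementary closed form, is found by plain bisection on the CDF. Bisection is a generic textbook method chosen here only for illustration; it is slower than the dedicated rational approximations that numerical libraries typically use. The module name `LognormalSketch` is hypothetical.

```ruby
# Hypothetical module for illustration only; not part of the distribution gem.
module LognormalSketch
  module_function

  # CDF of a lognormal whose logarithm has mean mu and standard deviation sigma > 0.
  def cdf(x, mu = 0.0, sigma = 1.0)
    return 0.0 if x <= 0.0
    0.5 * (1.0 + Math.erf((Math.log(x) - mu) / (sigma * Math.sqrt(2.0))))
  end

  # Quantile for 0 < p < 1, found by bisection on the CDF.
  def quantile(p, mu = 0.0, sigma = 1.0, tol = 1e-12)
    raise ArgumentError, "p must be in (0, 1)" unless p > 0.0 && p < 1.0
    lo = 0.0
    hi = 1.0
    # Grow the upper bound until the bracket [lo, hi] contains the quantile.
    hi *= 2.0 while cdf(hi, mu, sigma) < p
    # Halve the bracket until it is narrower than the tolerance.
    while hi - lo > tol * [1.0, hi].max
      mid = 0.5 * (lo + hi)
      if cdf(mid, mu, sigma) < p
        lo = mid
      else
        hi = mid
      end
    end
    0.5 * (lo + hi)
  end
end
```

The tolerance is absolute for quantiles below 1, so very small quantiles lose relative accuracy; a production version would need a more careful stopping rule.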
The categorical distribution is missing from the list. It's not so trivial a problem that it can just be left out.
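As a rough idea of what a categorical distribution involves, here is a minimal sketch with a probability mass function, a distribution function, and inverse-transform sampling over cumulative probabilities. The class name `CategoricalSketch` and its interface are hypothetical, and the linear scan in `sample` is the simplest option rather than the fastest.

```ruby
# Hypothetical class for illustration only; not part of the distribution gem.
class CategoricalSketch
  # probs: non-negative weights, one per category; they are normalized to sum to 1.
  def initialize(probs)
    raise ArgumentError, "weights must be non-negative" if probs.any? { |q| q < 0 }
    total = probs.sum.to_f
    raise ArgumentError, "weights must not all be zero" unless total > 0
    @probs = probs.map { |q| q / total }
    # Running totals of the normalized probabilities.
    acc = 0.0
    @cumulative = @probs.map { |q| acc += q }
  end

  # Probability of category k (0-based); zero outside the valid range.
  def pmf(k)
    k.is_a?(Integer) && k >= 0 && k < @probs.size ? @probs[k] : 0.0
  end

  # Probability of drawing a category with index <= k.
  def cdf(k)
    return 0.0 if k < 0
    return 1.0 if k >= @probs.size - 1
    @cumulative[k.floor]
  end

  # Inverse-transform sampling: the first index whose cumulative mass exceeds u.
  # Falls back to the last index if rounding leaves the final total below u.
  def sample(u = rand)
    @cumulative.index { |c| u < c } || @probs.size - 1
  end
end
```

Passing an explicit `u` to `sample` makes the draw deterministic, which is convenient for tests; with no argument it uses Ruby's `rand`.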
Related to this theme, I started to write something with JRuby and Commons; you can find it here: https://github.com/vpereira/distribution/tree/jruby_support (warning: it probably isn't working, but you can read the code and see what I'm trying to do :)). I support GSL as well, but I'm willing to remove the MRI support entirely and just work on top of JRuby. There are some really good Java libraries with the licenses we need: JScience (BSD) and Commons (Apache). Besides, the GSL Ruby support isn't complete, and GPL isn't the way to go.
Claudio Bustos' distribution gem supplies the probability distributions for SciRuby.
Some of these have already been implemented (e.g., normal, chisquare, hypergeometric, logistic, F, exponential, binomial, bivariate normal, Poisson, Student's t, beta, gamma).
Others have not. For example, multivariate normal and lognormal are both needed.
See a list of already-implemented distributions. Make sure to look at existing distributions for a template. The goal is to eventually implement each in Java, pure Ruby, and C (i.e., GSL or statistics2).
One difficulty is license compatibility. If code is GPLed, it cannot go directly into SciRuby. Claudio's distribution gem is currently under the GPL, mainly because some of the distributions are derived from GSL code (which is itself GPL). It would be best to rewrite those distributions (eventually) based on academic papers or other material that isn't subject to the GPL, because we want to be moving toward BSD/MIT compatibility. This goes for new distributions as well. If you can only get code from GSL, see if you can reach out to the original author of the code in question. Find out how he or she would feel about us incorporating it into SciRuby under a more liberal license. Please document any conversations you may have, particularly if you're able to reach an author and he or she gives permission.
One idea for finding a list of common probability distributions: search arxiv.org for usage of the names of distributions, like so:
https://www.google.com/search?q=site%3Aarxiv.org+%22gamma+distribution%22
Gamma distribution is found 2500 times, but Poisson distribution is found over 10,000 times. You could use this to get an idea of which distributions are most utilized.
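If you want to run the same search for many names, a small helper can build the query URLs. This is only a sketch: the function name `arxiv_search_url` is hypothetical, and it only constructs the URL. The hit counts still have to be read from the results page.

```ruby
require "uri"

# Build a Google URL for an exact-phrase search restricted to arxiv.org.
# The name is lowercased and " distribution" is appended inside the quotes.
def arxiv_search_url(distribution_name)
  query = %(site:arxiv.org "#{distribution_name.downcase} distribution")
  "https://www.google.com/search?q=" + URI.encode_www_form_component(query)
end

# Print one search URL per distribution name.
%w[gamma Poisson lognormal].each { |name| puts arxiv_search_url(name) }
```

For `"gamma"` this produces the same URL as the example search given above.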