UBC-Stat-ML / bayonet

Probabilistic inference utils
BSD 2-Clause "Simplified" License

Implementing univariate distributions (and the tests) in the exponential family #7

Closed junseonghwan closed 9 years ago

junseonghwan commented 10 years ago

Hi all,

I will start working on implementing some of the common univariate distributions missing in the bayonet.distributions package and implementing tests for them. I plan to focus on implementing the distributions in the exponential family (for this issue anyways).

I will also implement tests for the existing ones (e.g. Gamma).

Please let me know if any of you are interested in sharing the duties.

sohrabsa commented 10 years ago

I'm sorry, I closed it by mistake.

jewellsean commented 10 years ago

I will help. Please feel free to delegate a part of the list to me.

sohrabsa commented 10 years ago

Following our discussion, I will focus on continuous distributions.

jewellsean commented 10 years ago

I will do the gamma test and negative binomial distributions + tests since I need them for a point processes application within the next day or so.

junseonghwan commented 10 years ago

I initially planned to start with the discrete distributions (Poisson and NB), but since Sean has already claimed them, I will start by implementing the Dirichlet distribution.

I was thinking about how best to do the testing. I think it's probably a good idea for the developer of each distribution to test their own code and then have a second person test it independently to make sure that it's correct. So for example, I would test my own code for the Dirichlet distribution and one of you would also test it once I'm done. Similarly, Sean will implement NB and either Sohrab or I will test his NB code.

As for coordination, I think Sohrab once mentioned to me that there is free (web-based?) software for managing projects. Perhaps we can use something like that?

jewellsean commented 10 years ago

Good ideas, Seong. Another tool that I think we should consider for sampling from distributions not already implemented in Apache Commons Math is:

http://www.iro.umontreal.ca/~simardr/ssj/doc/html/umontreal/iro/lecuyer/randvar/package-summary.html

This package was implemented under the supervision of Pierre L'Ecuyer. I plan to use it for the NegativeBinomial distribution with a real-valued number of successes (vs. the classical integer-valued definition, which is the PascalDistribution).
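To make the distinction concrete, here is a minimal sketch of the negative binomial log-pmf with a real-valued number of successes r, where the binomial coefficient of the Pascal form is replaced by a ratio of gamma functions. The class and method names are hypothetical and are not part of bayonet, Apache Commons Math, or SSJ; the log-gamma helper is a plain truncated Stirling approximation written out only so that the example is self-contained.

```java
// Sketch only: names are hypothetical, not bayonet / Commons Math / SSJ API.
final class NegativeBinomialSketch {

    // log Gamma(x) for x > 0: shift x upward using Gamma(x + 1) = x * Gamma(x),
    // then apply a truncated Stirling series (accurate to roughly 1e-10 here).
    static double logGamma(double x) {
        if (!(x > 0.0)) throw new IllegalArgumentException("x must be positive");
        double shift = 0.0;
        while (x < 10.0) {
            shift -= Math.log(x);
            x += 1.0;
        }
        double inv = 1.0 / x;
        double inv2 = inv * inv;
        double series = inv * (1.0 / 12.0 - inv2 * (1.0 / 360.0 - inv2 / 1260.0));
        return shift + (x - 0.5) * Math.log(x) - x + 0.5 * Math.log(2.0 * Math.PI) + series;
    }

    // log P(K = k): k failures observed before the r-th success, success probability p.
    // r may be any positive real; for integer r this matches the Pascal distribution.
    static double logPmf(int k, double r, double p) {
        if (!(r > 0.0) || !(p > 0.0 && p < 1.0)) {
            throw new IllegalArgumentException("need r > 0 and 0 < p < 1");
        }
        if (k < 0) return Double.NEGATIVE_INFINITY;
        return logGamma(k + r) - logGamma(k + 1.0) - logGamma(r)
                + r * Math.log(p) + k * Math.log(1.0 - p);
    }

    public static void main(String[] args) {
        // Integer r: should agree with the Pascal form C(k + r - 1, k) p^r (1 - p)^k.
        System.out.println(Math.exp(logPmf(2, 3.0, 0.5)));
        // Non-integer r is handled by the same formula.
        System.out.println(Math.exp(logPmf(2, 2.5, 0.4)));
    }
}
```

For integer r the gamma ratio reduces to the usual binomial coefficient, so a test can compare this against the Pascal form and then check that the pmf still sums to one for non-integer r.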

junseonghwan commented 10 years ago

Sean, a quick question. Have you already implemented Dirichlet distribution? I realized that you may have already implemented Dirichlet since you may need it for the project that you are working on.

jewellsean commented 10 years ago

No, I have not yet implemented it. It is near the top of my list, but below NB

alexandrebouchard commented 10 years ago

Another good source: package distribution in beast:

https://code.google.com/p/beast2/source/browse/#svn%2Ftrunk%2Fsrc%2Fbeast%2Fmath%2Fdistributions

alexandrebouchard commented 10 years ago

See also:

http://darrenjw.wordpress.com/2011/06/04/java-math-libraries-and-monte-carlo-simulation-codes/

(but things have probably changed since then)

jewellsean commented 10 years ago

Seong -- I am not sure how far along you are in implementing the Dirichlet, but in my own implementation of the NB I found that I needed to create various interfaces (IntegerVariable, IntegerUnivariateDistribution, etc.) which will be useful for you.

I am writing the test code now, and I wanted to poll opinions on adding a sampler for integer valued variables. Are there any common/recommended ones which we should implement first? I can think of many naive MH steps, but wanted to see if there was anything off the shelf first.
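As a point of reference, one naive MH step for an integer-valued variable is a symmetric random walk that proposes the current value plus or minus one. The sketch below is only an illustration of that idea: the class and method names are hypothetical and not bayonet API, and the Poisson target is used purely as a convenient example with a known mean.

```java
import java.util.Random;
import java.util.function.IntToDoubleFunction;

// Sketch only: names are hypothetical, not bayonet API.
final class IntegerRandomWalkSketch {

    // One Metropolis-Hastings step with the symmetric proposal current +/- 1.
    // Because the proposal is symmetric, the acceptance ratio is just the
    // ratio of target densities. Assumes the current state has positive density.
    static int step(int current, IntToDoubleFunction logDensity, Random rng) {
        int proposed = current + (rng.nextBoolean() ? 1 : -1);
        double logRatio = logDensity.applyAsDouble(proposed) - logDensity.applyAsDouble(current);
        return Math.log(rng.nextDouble()) < logRatio ? proposed : current;
    }

    // Unnormalized Poisson log density: k log(lambda) - log(k!), zero mass for k < 0.
    static double poissonLogDensity(int k, double lambda) {
        if (k < 0) return Double.NEGATIVE_INFINITY;
        double logFactorial = 0.0;
        for (int i = 2; i <= k; i++) logFactorial += Math.log(i);
        return k * Math.log(lambda) - logFactorial;
    }

    // Runs the chain from state 0 and returns the average of the visited states.
    static double sampleMean(double lambda, int iterations, long seed) {
        Random rng = new Random(seed);
        int state = 0;
        long sum = 0;
        for (int t = 0; t < iterations; t++) {
            state = step(state, k -> poissonLogDensity(k, lambda), rng);
            sum += state;
        }
        return (double) sum / iterations;
    }

    public static void main(String[] args) {
        // The chain average should be close to lambda for a long enough run.
        System.out.println(sampleMean(4.0, 200_000, 1L));
    }
}
```

Proposals that fall outside the support get a log density of negative infinity and are always rejected, so no special boundary handling is needed.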

junseonghwan commented 10 years ago

Sean,

I also had to write a bunch of other things along the way. For one, Dirichlet is a multivariate distribution, not a univariate one :)

I am actually using realizations of the Dirichlet as the parameters of a Multinomial, and I am currently writing a sampler for the multinomial parameters, which lie in [0, 1].

I have not come across any integer-valued variables yet, so I haven't given it any thought... But I will probably need something for them when I start working on Multinomial. I'd say for the time being, you should do whatever you feel is easiest to get things working for the Negative Binomial.

jewellsean commented 10 years ago

Ok, sounds good, thanks. I will implement something simple for integer moves. Yesterday, when I was thinking about implementing Dirichlet for another problem, the constrained proposals I was thinking about were (i) truncated stick breaking or (ii) Dirichlet itself. (ii) is probably easier to implement (esp. if you use symmetry properties)

And, yes, of course Dirichlet is multivariate... I think I was thinking about Poisson :)

junseonghwan commented 10 years ago

I have committed implementation of Multinomial and Dirichlet. I created new data types, IntegerValuedVector, RealVector, and ProbabilitySimplex and corresponding MH samplers for these data types (I kept the samplers as simple as possible).
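For readers wondering what a simple MH move on the probability simplex can look like, here is one possible sketch. It is not the committed sampler: the class name and the move itself are assumptions made for illustration. It picks two coordinates, redistributes their combined mass uniformly at random, and accepts or rejects against an unnormalized Dirichlet target, so the vector stays on the simplex by construction.

```java
import java.util.Random;

// Sketch only: names and the specific move are hypothetical, not the bayonet sampler.
final class SimplexMoveSketch {

    // Unnormalized Dirichlet log density: sum_i (alpha_i - 1) log p_i.
    static double dirichletLogDensity(double[] p, double[] alpha) {
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) sum += (alpha[i] - 1.0) * Math.log(p[i]);
        return sum;
    }

    // One MH step, modifying p in place. Two distinct coordinates i and j are
    // chosen and their combined mass is re-split uniformly at random. The split
    // does not depend on the current split, and the combined mass is unchanged,
    // so the proposal density is the same in both directions and cancels.
    static void step(double[] p, double[] alpha, Random rng) {
        int n = p.length;
        int i = rng.nextInt(n);
        int j = rng.nextInt(n - 1);
        if (j >= i) j++; // guarantees j != i
        double oldI = p[i];
        double oldJ = p[j];
        double total = oldI + oldJ;
        double before = dirichletLogDensity(p, alpha);
        p[i] = rng.nextDouble() * total;
        p[j] = total - p[i];
        double after = dirichletLogDensity(p, alpha);
        boolean accept = Math.log(rng.nextDouble()) < after - before;
        if (!accept) { // restore the previous state on rejection
            p[i] = oldI;
            p[j] = oldJ;
        }
    }

    public static void main(String[] args) {
        double[] alpha = {2.0, 3.0, 5.0};
        double[] p = {1.0 / 3, 1.0 / 3, 1.0 / 3};
        Random rng = new Random(7L);
        double sumFirst = 0.0;
        int iterations = 100_000;
        for (int t = 0; t < iterations; t++) {
            step(p, alpha, rng);
            sumFirst += p[0];
        }
        // The Dirichlet mean of coordinate 0 is alpha_0 / sum(alpha) = 0.2.
        System.out.println(sumFirst / iterations);
    }
}
```

Because every move only shifts mass between two coordinates, the sum-to-one and non-negativity constraints never need to be re-imposed after a step.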

But I have not been able to complete the testing of the code yet because I have not yet implemented the processors for these. I have yet to check if the existing tests can be applied to the multivariate case; if not, we will have to write new tests. I am departing for Toronto tomorrow but I will work on it as I find time while I'm there. In the meantime, if anyone wants to take a stab at it, please feel free.

alexandrebouchard commented 10 years ago

It seems that the test TestDirichletMultinomial is still failing. Not a big deal, but in the future it might be better to commit experimental code to a branch other than master. As an immediate fix, I suggest commenting out the problematic @Test so that we can still run gradle without having to use the -x test switch.

junseonghwan commented 10 years ago

I have now completed the implementation of Dirichlet and Multinomial. The tests are in TestDirichletMultinomial and they pass. I will complete some of the other univariate discrete distributions (Binomial, Poisson, etc.) for completeness; this should go faster now that I am starting to get a handle on this.

jewellsean commented 9 years ago

Negative binomial added in commit 660c86094400337a5a721df90ce6ebb256f64e02. Poisson to come shortly.

jewellsean commented 9 years ago

Poisson added in commit c70dfade79ebb6fcd002688c97d57e827e57a59d