bayesmix-dev / bayesmix

Flexible Bayesian nonparametric mixture models in C++
https://bayesmix.rtfd.io
BSD 3-Clause "New" or "Revised" License
22 stars 18 forks source link

Clustering and credible balls feature #58

Closed ricardaxel closed 2 years ago

ricardaxel commented 3 years ago

Added clustering point estimate with loss function and greedy algorithm + computation of credible balls. All sources are in "clustering" directory.

mberaha commented 3 years ago

Thanks a lot for the effort guys! This looks really promising :)

The PR is huge and it's going to take some time to review, to make things easier on our side, please include in this PR only the files that are rightfully part of the C++ library, i.e. I would remove:

If you want, you could have a couple of simulations inside a single jupyter notebook file as we currently do in python/comparison.ipynb

Then I see that you have some 'tests' like src/clustering/lossfunction/test_loss.cpp It would be nice if you could transfer these tests into unit tests as we do in the test/ directory

Also, I would ask you to perform at least a further test: can you check that the Greedy algorithm produces a better clustering (lower loss) than our simple routine, when it is initialized with the output of our simple routine?

I will get back to you with some more review on the actual code once this is done!

brunoguindani commented 3 years ago

Thank you for your contribution. I second everything @mberaha said, and will post my own review once everything is ready. I would also recommend some small style changes: for instance in file names and folders, removing spaces and changing all uppercase letters to lowercase, as they may break compatibility with different platforms, or just create annoying maintentance headaches. Also, please rename .cpp files to .cc and .hpp to .h. There are other things, but they may be a bit of work (see CONTRIBUTORS.md), and the above are the most important ones as of now. I can always deal with them myself later, so don't worry about the rest too much.