Hoosier-Clusters / clusim

An extended package for clustering similarity
MIT License

Fixing inconsistencies in sim.py #33

Closed jg-you closed 4 years ago

jg-you commented 4 years ago

Follow-up on issue #28.

This PR does not change any functional part of the code. Instead, it:

  1. Updates the available_similarity_measures list in sim.py.
  2. Fixes many typos in the docstrings.
  3. Reorders some functions in the information theory section so that they are grouped more naturally.
  4. Adds docstrings to functions that have none.
  5. Adds missing :params: statements for some functions.

While doing this I found two "bugs", but I left them unchanged so that the PR has no side effects.

  1. nmi is calculated in base e when called with normalization set to none (i.e., when it returns the plain MI), and there is no way to change the base from within the function. This could be fixed by adding a logbase argument to nmi that is only used when no normalization is applied.
  2. Likewise, vi is in base e by default, with no option to change this behaviour from within the function.
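
To illustrate what a base option could look like, here is a minimal standalone sketch. It is not the code in sim.py: the functions below work on plain label lists rather than clusim objects, their names are made up for this example, and logbase is only the argument name suggested above. The idea is to compute everything in nats and divide by log(logbase) at the end.

```python
import math
from collections import Counter


def entropy(labels, logbase=math.e):
    """Shannon entropy of a flat labeling, in units set by logbase."""
    n = len(labels)
    h_nats = -sum((c / n) * math.log(c / n) for c in Counter(labels).values())
    return h_nats / math.log(logbase)


def mutual_information(labels_a, labels_b, logbase=math.e):
    """Unnormalized MI between two flat labelings of the same elements.

    logbase is a hypothetical argument; the default (e) gives nats.
    """
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))
    mi_nats = 0.0
    for (a, b), n_ab in count_ab.items():
        # p(a, b) * log( p(a, b) / (p(a) * p(b)) )
        mi_nats += (n_ab / n) * math.log(n_ab * n / (count_a[a] * count_b[b]))
    # Changing the base only rescales the result by 1 / log(logbase).
    return mi_nats / math.log(logbase)


def variation_of_information(labels_a, labels_b, logbase=math.e):
    """VI = H(A) + H(B) - 2 I(A; B), in units set by logbase."""
    return (
        entropy(labels_a, logbase)
        + entropy(labels_b, logbase)
        - 2.0 * mutual_information(labels_a, labels_b, logbase)
    )


same = [0, 0, 1, 1]
crossed = [0, 1, 0, 1]
mi_nats = mutual_information(same, same)             # natural log
mi_bits = mutual_information(same, same, logbase=2)  # base 2
vi_bits = variation_of_information(same, crossed, logbase=2)
```

Because the base only rescales the result, the same conversion can also be applied by the caller to the value returned today, as long as the function is called without normalization.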

yy commented 4 years ago

Thank you so much! Looks great!

jg-you commented 4 years ago

:wave: Checking on the PR