david-cortes / outliertree

(Python, R, C++) Explainable outlier/anomaly detection through decision tree conditioning
http://outliertree.readthedocs.io
GNU General Public License v3.0

Another Ruby Library #1

Closed by ankane 4 years ago

ankane commented 4 years ago

Hi David,

I created a Ruby library for OutlierTree as well. The output currently matches the Python library on the Titanic dataset :tada:

One small suggestion: it'd be nice if releases were tagged on GitHub to make it easier to reference specific versions and receive GitHub notifications when new releases are out.

Also, if you decide to support a C API for cmfrec in the future, I may try to use it for Disco (it currently uses LIBMF), since it'd be great to support side information.

david-cortes commented 4 years ago

Great, thanks! I'll put it in the readme too.

I've used release tags on GitHub in the past, but I found them too much of an annoyance and wouldn't want to use them anymore. I think libraries.io has a feature for receiving email notifications on new versions of packages, but I'm not 100% sure about that.

I do plan to document the C API for cmfrec and put the external-facing functions in a separate header some time within the next 2 months, but let me warn you that wrapping it is a huge piece of work, as it has a really large number of functions (see wrapper_untyped.pxi for an idea), and those functions do double duty by swapping their inputs (e.g. swapping users and items).
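To illustrate what "swapping users and items" can mean in a factorization model, here is a minimal pure-Python sketch. It is not the cmfrec API: the helper `top_n` and the toy factor values are hypothetical, made up for illustration. The point is only that a routine written for one direction (items for a user) serves the other direction (users for an item) when its inputs are exchanged.

```python
import heapq

def top_n(query_factors, candidate_factors, n):
    """Return the indices of the n candidates with the highest dot-product score.

    Hypothetical helper for illustration; not part of cmfrec.
    """
    scores = [
        sum(q * c for q, c in zip(query_factors, cand))
        for cand in candidate_factors
    ]
    # Rank candidate indices by their score, highest first.
    return heapq.nlargest(n, range(len(scores)), key=scores.__getitem__)

# Toy factor matrices (made-up numbers): 3 users x 2 factors, 4 items x 2 factors.
user_factors = [[1.0, 0.0], [0.0, 1.0], [1.0, 2.0]]
item_factors = [[2.0, 0.0], [0.0, 3.0], [1.0, 1.0], [-1.0, -1.0]]

# Usual direction: the best items for user 0.
items_for_user0 = top_n(user_factors[0], item_factors, 2)

# Swapped direction: the same routine finds the best users for item 1.
users_for_item1 = top_n(item_factors[1], user_factors, 2)

print(items_for_user0)  # [0, 2]
print(users_for_item1)  # [2, 1]
```

A wrapper therefore tends to expose each routine twice, once per direction, which is one reason the function count grows quickly.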

In the meantime, if you don't need all the functionality, you can just use the (undocumented) functions named fit_* for producing user/item/attribute factors. The rest of the external-facing functions are named <model>_factors_* plus predict_multiple and topN.

ankane commented 4 years ago

Thanks! Re releases: libraries.io works for the notification aspect. Re cmfrec: There really are a lot of functions. Will dig into it more when I have the chance.

david-cortes commented 3 years ago

@ankane If you’re still interested, the cmfrec package now has a full C API with documentation. There are lots of functions there, most of which might not be relevant for Disco, so if you do decide to include it, I’d suggest looking specifically at the following functions:

fit_collective_explicit_als
fit_collective_implicit_als
factors_collective_explicit_single
factors_collective_implicit_single
topN_old_collective_explicit
topN_old_collective_implicit
topN_new_collective_explicit
topN_new_collective_implicit
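As a rough picture of how the fit / factors / topN steps relate, here is a conceptual sketch in pure Python. It does not call cmfrec and does not reproduce its signatures. It assumes "old" means a user whose factors were learned during fitting and "new" means a user whose factors are computed on the fly from fresh ratings, and it uses a plain ridge (regularized least squares) solve against fixed item factors, ignoring side information and biases. All function names and numbers here are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    k = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix [a | b]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(k):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][k] / m[i][i] for i in range(k)]

def new_user_factors(ratings, item_factors, reg=1.0):
    """Ridge solution of (B'B + reg*I) u = B'r, using only the items the user rated."""
    k = len(item_factors[0])
    lhs = [[reg if i == j else 0.0 for j in range(k)] for i in range(k)]
    rhs = [0.0] * k
    for item, rating in ratings.items():
        b = item_factors[item]
        for i in range(k):
            rhs[i] += b[i] * rating
            for j in range(k):
                lhs[i][j] += b[i] * b[j]
    return solve(lhs, rhs)

def top_n(user_vec, item_factors, n, exclude=()):
    """Rank items by dot product with the user vector, skipping excluded items."""
    scores = {
        i: sum(u * b for u, b in zip(user_vec, item))
        for i, item in enumerate(item_factors)
        if i not in exclude
    }
    return sorted(scores, key=scores.get, reverse=True)[:n]

# Pretend these came out of a previous model fit (made-up numbers).
user_factors = [[0.0, 2.0]]                                   # one "old" user
item_factors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]

# "Old" user: factors are looked up, then items are ranked.
old_recs = top_n(user_factors[0], item_factors, 2)

# "New" user: factors are first computed from ratings {item index: rating}.
ratings = {0: 5.0, 1: 1.0}
u_new = new_user_factors(ratings, item_factors)
new_recs = top_n(u_new, item_factors, 2, exclude=ratings)

print(old_recs)  # [1, 2]
print(u_new)     # [2.5, 0.5]
print(new_recs)  # [2, 3]
```

The real functions take many more arguments than this sketch suggests, so the cmfrec C API documentation remains the reference for actual signatures.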

Perhaps also these if you want a different recommendation approach:

fit_content_based_lbfgs
factors_content_based_single
topN_new_content_based

ankane commented 3 years ago

That's great news, thanks @david-cortes!