rasbt / mlxtend

A library of extension and helper modules for Python's data analysis and machine learning libraries.
https://rasbt.github.io/mlxtend/

Leverage returns values greater than 1 #883

Open adgrig opened 2 years ago

adgrig commented 2 years ago

While running the apriori algo for a project, Leverage returned (some) values > 1; I'm not doing anything exotic with the library and am working from a rather standard sparse matrix. Shouldn't Leverage take values within [-1,1]?

Happy to provide more context if needed.

rasbt commented 2 years ago

Oh yeah, a leverage larger than 1 would be weird, I think, because it is computed as

support(A->C) - support(A)*support(C)

and support(A->C) can't be greater than 1, while the product support(A)*support(C) can't be negative.

(code: https://github.com/rasbt/mlxtend/blob/5a14e3781ea0b0318a303040a03b0c66c438fbbe/mlxtend/frequent_patterns/association_rules.py#L110)
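To make the bound concrete, here is a minimal sketch in plain Python that does not call mlxtend. The helper names `leverage` and `supports_are_valid` are made up for illustration, and the check assumes that supports are proportions in [0, 1] and that the joint support cannot exceed either marginal support.

```python
from itertools import product


def leverage(support_ac, support_a, support_c):
    """Leverage as written above: support(A->C) - support(A)*support(C)."""
    return support_ac - support_a * support_c


def supports_are_valid(support_ac, support_a, support_c):
    """Hypothetical sanity check: every support lies in [0, 1] and the
    joint support is no larger than either marginal support."""
    in_range = all(0.0 <= s <= 1.0 for s in (support_ac, support_a, support_c))
    return in_range and support_ac <= min(support_a, support_c)


# Brute-force the formula over a coarse grid of valid support triples.
grid = [i / 20 for i in range(21)]
max_leverage = max(
    leverage(ac, a, c)
    for ac, a, c in product(grid, repeat=3)
    if supports_are_valid(ac, a, c)
)
print(max_leverage)

# A leverage above 1 needs at least one support outside [0, 1].
print(leverage(1.5, 0.5, 0.5), supports_are_valid(1.5, 0.5, 0.5))
```

On this grid the maximum stays well below 1, so a value above 1 suggests that one of the support columns is not a valid proportion, which is why the support values for the offending row are the useful thing to look at.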

If you have an example that can reproduce this, or maybe the support columns of the dataset in that row, that'd be useful for looking into this further.
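For pulling out just those columns, something like the sketch below might work. It assumes the rules are in a pandas DataFrame with the column names that `association_rules` produces ("antecedent support", "consequent support", "support", "leverage"). The `rules` frame here is a synthetic stand-in with made-up numbers, not data from this issue.

```python
import pandas as pd

# Synthetic stand-in for the DataFrame returned by association_rules();
# the numbers are invented purely to exercise the filter below.
rules = pd.DataFrame(
    {
        "antecedent support": [0.40, 0.50],
        "consequent support": [0.30, 0.50],
        "support": [0.20, 1.50],  # second row is deliberately invalid
    }
)
rules["leverage"] = (
    rules["support"] - rules["antecedent support"] * rules["consequent support"]
)

# Keep only the rows with an out-of-range leverage and the columns
# needed to recompute it by hand.
cols = ["antecedent support", "consequent support", "support", "leverage"]
suspicious = rules.loc[rules["leverage"] > 1, cols]
print(suspicious)
```

Sharing only these four columns for the affected rows would not expose item labels or the underlying transactions.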

adgrig commented 2 years ago

Haha, thought as much. What struck me as weird was that even if my dataset were corrupted somehow, this still shouldn't be able to return values greater than 1 considering its formula.

I'll ask if I can share the data with you (or a part of it, or the columns with labels obscured, etc.) and get back.

-- Edit: I can share some sample data privately with you. Let me know how to proceed here :).

zuari1993 commented 1 year ago

@adgrig can you share data on this? I can work to fix this. If the data can't be publicly shared, please private message me.

adgrig commented 1 year ago

Hi, I unfortunately no longer have access to that dataset.