UdayLab / PAMI

PAMI is a Python library containing 100+ algorithms to discover useful patterns in various databases across multiple computing platforms. (Active)
https://udaylab.github.io/PAMI/
GNU General Public License v3.0

Association Rules as method instead of subclass #386

Closed CharlieAprog closed 5 months ago

CharlieAprog commented 5 months ago

Hey there, nice stuff so far. I am a bit confused as to why there is no convenient or clear option to obtain the association rules and rank them by lift after running a basic frequent pattern mining algorithm. Instead, it's hidden inside a separate class that specifically creates association rules, rather than being an extension to any algorithm run. Could you consider adapting this to become a general method of the basic pattern miners?

udayRage commented 5 months ago

Dear Charlie,

Good day!

You have asked an interesting question. The answer is known only to people who have been working and pursuing research since the early days of pattern mining. Please read this email.

1) In AI, generating first-order logic (or association) rules from data has been challenging.

2) Agrawal effectively addressed the association rule generation problem through a two-step process:
 Step 1: Finding frequent patterns in the data, which is computationally expensive even though the Apriori property exists.
 Step 2: Generating association rules from the frequent patterns, which is a simple process.

3) Later, these two steps were separated and studied independently (2000 to 2010):

 a) For step 1, researchers worked on developing faster algorithms, such as ECLAT and FP-growth, to discover frequent patterns efficiently.

 b) For step 2, researchers investigated alternative interestingness measures, such as Kulczynski (Kulc) and lift, to discover interesting association rules.

4) The motivation behind this two-step separation is as follows:

Researchers (including Agrawal) realized that frequent patterns were more interesting than association rules in many applications. Thus, pattern mining has emerged with several models, such as correlated pattern mining, sequential pattern mining, and fault-tolerant pattern mining. Meanwhile, research on association rule mining ended due to its simplicity.

That is why our code implements rule mining and pattern mining separately. Other pattern mining libraries, such as SPMF and WEKA (which are Java-based), are implemented similarly.

I hope I have answered your question.
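To make step 2 of the two-step process concrete, here is a minimal sketch of generating association rules from already-mined frequent patterns and ranking them by lift. It is not PAMI's implementation: the function name `rules_from_patterns`, the `min_conf` parameter, and the toy supports are all hypothetical, and it assumes supports are relative (fractions of all transactions).

```python
from itertools import combinations


def rules_from_patterns(support, min_conf=0.5):
    """Generate (antecedent, consequent, confidence, lift) tuples.

    `support` maps frozenset(items) -> relative support in [0, 1].
    This is an illustrative sketch, not the PAMI API.
    """
    rules = []
    for itemset, sup_xy in support.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty antecedent and consequent
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                x = frozenset(antecedent)
                y = itemset - x
                # Skip if a sub-pattern is missing from the input.
                if x not in support or y not in support:
                    continue
                conf = sup_xy / support[x]   # P(Y | X)
                lift = conf / support[y]     # P(X, Y) / (P(X) * P(Y))
                if conf >= min_conf:
                    rules.append((x, y, conf, lift))
    # Highest lift first.
    rules.sort(key=lambda rule: rule[3], reverse=True)
    return rules


# Toy frequent patterns (made-up numbers for illustration only).
support = {
    frozenset({"bread"}): 0.6,
    frozenset({"butter"}): 0.5,
    frozenset({"bread", "butter"}): 0.4,
}
rules = rules_from_patterns(support, min_conf=0.5)
for x, y, conf, lift in rules:
    print(sorted(x), "->", sorted(y), f"conf={conf:.2f} lift={lift:.2f}")
```

The sketch shows why step 2 is cheap compared with step 1: it only does dictionary lookups over patterns that were already found, and never rescans the transactions.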

CharlieAprog commented 5 months ago

Hey there thanks for the great information and quick response!

I agree with you that the two should be kept separate, of course. The reason I asked is that moving from the currently formatted results in PAMI to association rule mining is very awkward: it requires a path to a dataset, rather than accepting the results directly from a frequent pattern algorithm. Even if research is more interested in frequent patterns, a more accessible route to lift values gives a simple and quick way not only to obtain frequent patterns, but also to evaluate their usefulness and validity to some extent. My suggestion to add a method was not to include it in each mine() operation, but to allow easy access to rules once mining has been done, if wanted.

Thanks again for the email,
Charlie

Edit: removed email RE: and typos

udayRage commented 5 months ago

Q) It required a path to a dataset, rather than accepting the results directly from a frequent pattern algorithm.

Ans) No need. First, generate the frequent patterns and store them in a data frame using obj.getPatternsAsDataFrame(). Second, pass that frequent pattern data frame into a rule mining algorithm to generate rules.

You don't need to save the patterns to a file and read them back again, which is a time-consuming and computationally expensive operation.
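As a rough illustration of the in-memory hand-off described above, the sketch below turns pattern records into a support lookup and computes lift from it. Several things here are assumptions rather than confirmed PAMI behavior: the column names `Patterns` and `Support`, the tab separator between items, absolute support counts, and the helper name `support_map_from_records`. With pandas, records of this shape could be obtained from a data frame via `df.to_dict("records")`.

```python
def support_map_from_records(records, n_transactions, sep="\t"):
    """Build frozenset(items) -> relative support from pattern records.

    Each record is assumed to look like {"Patterns": "a\tb", "Support": 4},
    where Support is an absolute count. Lift needs relative supports, so
    the total number of transactions must be supplied by the caller.
    """
    support = {}
    for row in records:
        items = frozenset(row["Patterns"].split(sep))
        support[items] = row["Support"] / n_transactions
    return support


# Hypothetical records standing in for a frequent pattern data frame.
records = [
    {"Patterns": "bread", "Support": 6},
    {"Patterns": "butter", "Support": 5},
    {"Patterns": "bread\tbutter", "Support": 4},
]
support = support_map_from_records(records, n_transactions=10)

# Lift of bread -> butter, computed entirely in memory.
sup_xy = support[frozenset({"bread", "butter"})]
lift = sup_xy / (support[frozenset({"bread"})] * support[frozenset({"butter"})])
print(f"lift = {lift:.3f}")
```

Note that lift cannot be derived from absolute support counts alone; the number of transactions in the original database has to travel with the patterns.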

I will ask my students tomorrow to check. We will check the code tomorrow and update you in the comments.

udayRage commented 5 months ago

Dear Charlie,

Good day!

Thank you very much for your valuable feedback.

We have incorporated your suggestion and modified the association rule mining code to accept a data frame as input and output association rules.

Please check the Jupyter notebook provided below.

Google Colab

udayRage commented 5 months ago

We will now close this issue.

If you find any further issues, please feel free to inform us, and we will be glad to address them.