Closed BukovnikMiha closed 4 months ago
@zStupan, please review this submission. From the initial prescreening, I believe several additions are not applicable.
@zStupan, what about datasets and licenses?
@firefly-cpp oh right. The football players one is taken from Wikipedia, so it's likely under a Creative Commons license, which means we have to include a link to the original Wikipedia article.
The weather data is synthetically generated and is under the CC0 public domain license, so there aren't any requirements there.
The dev salaries one is under Apache 2.0, which I think means we have to include a copy of the Apache 2.0 license with the dataset.
Thank you for your review.
I have made the following updates based on your feedback:
Introduces new visualization techniques for association rule mining. This addition will help identify patterns, relationships and distributions in datasets more effectively.
It adds two new functions to the niaarm/visualize.py file:
scatter_plot()
grouped_matrix_plot()
Both functions take the parameters rules (the mined rules to visualize), metrics (the metrics to display, such as lift, support and confidence) and interactive (a boolean indicating whether the visualization should have interactive features such as zooming and hover data). grouped_matrix_plot() also takes a parameter k, the number of clusters to display.
It also includes 3 datasets for testing the new visualizations: weather_data.csv, football_players.csv and data_developer_salary.csv. These can be found in the datasets folder. They are accompanied by dataset preparation in the examples/visualization_examples/prepare_datasets.py file, which applies preprocessing to these datasets, such as removing duplicate rows and missing values, discretizing data and selecting relevant columns. The examples/visualization_examples folder also contains two separate folders, one per visualization technique, with examples that display these datasets.
Example usage of the new visualization functions:
Also adds new dependencies:
plotly >= 5.22.0
scikit-learn >= 1.5.0
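The preprocessing steps described for prepare_datasets.py (selecting relevant columns, dropping rows with missing values, removing duplicate rows and discretizing numeric data) could look roughly like the sketch below. It is not the code from this PR: it uses only the Python standard library, and the sample rows, column names and bin edges are hypothetical stand-ins, since the real dataset schemas are not shown in this thread.

```python
import csv
import io

# Hypothetical sample; the real weather_data.csv columns may differ.
RAW_CSV = """city,temperature,humidity,station_id
Ljubljana,21.5,60,A1
Ljubljana,21.5,60,A1
Maribor,,55,B2
Koper,29.0,70,C3
Celje,8.0,80,D4
"""

# Assumed set of relevant columns (station_id is dropped).
KEEP_COLUMNS = ["city", "temperature", "humidity"]


def discretize_temperature(value):
    """Map a numeric temperature to a coarse label (bin edges are arbitrary)."""
    if value < 10:
        return "cold"
    if value < 25:
        return "mild"
    return "hot"


def prepare(text):
    """Return cleaned rows as a list of dicts."""
    rows = []
    seen = set()
    for record in csv.DictReader(io.StringIO(text)):
        # Select the relevant columns only.
        row = {col: record[col] for col in KEEP_COLUMNS}
        # Drop rows that have a missing value in any kept column.
        if any(value.strip() == "" for value in row.values()):
            continue
        # Drop exact duplicates (compared after column selection).
        key = tuple(row.values())
        if key in seen:
            continue
        seen.add(key)
        # Discretize the numeric temperature column into labels.
        row["temperature"] = discretize_temperature(float(row["temperature"]))
        rows.append(row)
    return rows


cleaned = prepare(RAW_CSV)
for row in cleaned:
    print(row)
```

Here the duplicate Ljubljana row and the Maribor row with a missing temperature are dropped, leaving three rows with categorical temperature labels.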