firefly-cpp / NiaARM

A minimalistic framework for Numerical Association Rule Mining
MIT License

Bug Report: RuntimeWarning: invalid value encountered in scalar divide #112

Closed (erkankarabulut closed 3 months ago)

erkankarabulut commented 4 months ago

In rule.py, lines 224-226 and 244-246, the following statement triggers a RuntimeWarning when the divisor is 0:

acc += (attribute.max_val - attribute.min_val) / (
    feature_max - feature_min
)

For instance, if all the values in a column are the same, this statement produces the following warning message: RuntimeWarning: invalid value encountered in scalar divide acc += (attribute.max_val - attribute.min_val) / (...
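The arithmetic behind the warning can be reproduced in isolation. The snippet below is only a sketch: it assumes the operands are NumPy scalars, and the variable names and the constant value 10 are illustrative rather than taken from rule.py. The exact wording of the warning message can differ between NumPy versions.

```python
import math
import warnings

import numpy as np

# Hypothetical bounds for a column in which every entry is 10.
feature_min = np.float64(10.0)
feature_max = np.float64(10.0)
attribute_min = np.float64(10.0)
attribute_max = np.float64(10.0)

# Record warnings instead of printing them to stderr.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Both numerator and divisor are 0, so NumPy returns nan and warns.
    ratio = (attribute_max - attribute_min) / (feature_max - feature_min)

print(math.isnan(ratio))
for w in caught:
    print(w.category.__name__, w.message)
```

With plain Python floats the same expression would raise ZeroDivisionError instead of emitting a warning.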

firefly-cpp commented 3 months ago

@erkankarabulut, thank you for reporting this bug. Please provide a minimal example that resulted in an exception. We would like to have more information before changing the code.

erkankarabulut commented 3 months ago

Hello @firefly-cpp! Here is a minimal example that will result in the warning message I mentioned:

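import pandas as pd
from niaarm import get_rules, Dataset
from niapy.algorithms.basic import HarrisHawksOptimization

# population_size and max_iters were not defined in the original snippet;
# the values below are arbitrary placeholders.
population_size = 50
max_iters = 30

# Every column holds a single constant value, so feature_max == feature_min.
data = [["feature1", "feature2", "feature3"], [10, 20, 30], [10, 20, 30], [10, 20, 30]]
algo = HarrisHawksOptimization(population_size=population_size)
metrics = ['confidence']

frame = pd.DataFrame(data[1:], columns=data[0])
dataset = Dataset(frame)
rules, run_time = get_rules(dataset, algo, metrics, max_iters=max_iters, logging=False)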

In this case, feature_max and feature_min will be equal and the divisor part on the lines I mentioned will be equal to 0.

I see that acc is used in the amplitude calculation. I would fix it myself and open a pull request, but I am not sure how the amplitude should be calculated in this case.

firefly-cpp commented 3 months ago

Thanks @erkankarabulut. Please patch it and submit a pull request.

erkankarabulut commented 3 months ago

@firefly-cpp I wasn't sure how to fix this, but increasing acc by 1 during the amplitude calculation, instead of performing the division, seemed correct to me.
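A minimal sketch of the guard described above, assuming the increment of 1 applies only when the divisor is 0. The helper name normalized_width and its parameters are hypothetical and are not part of NiaARM; whether the merged patch is structured this way is not confirmed in this thread.

```python
def normalized_width(attr_min, attr_max, feature_min, feature_max):
    """Return the attribute interval width relative to the feature range.

    Hypothetical helper: when the feature range is 0 (constant column),
    return 1 instead of dividing, as proposed in the comment above.
    """
    span = feature_max - feature_min
    if span == 0:
        return 1.0
    return (attr_max - attr_min) / span


acc = 0.0
# Constant column: the guard contributes 1 instead of computing 0 / 0.
acc += normalized_width(10, 10, 10, 10)
# Regular column: the interval [2, 4] covers a quarter of the range [0, 8].
acc += normalized_width(2.0, 4.0, 0.0, 8.0)
print(acc)
```

Treating a constant column as contributing a full 1 means the rule's interval is counted as covering the whole feature range, which is the only interval such a column admits.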

Could you verify this, please? I opened a PR as well: https://github.com/firefly-cpp/NiaARM/pull/113. I am not sure how to link PRs to issues.

firefly-cpp commented 3 months ago

The new version is now on PyPI. @erkankarabulut, please check it out and test it. If you find any other bugs, feel free to open a ticket and submit a pull request.

By the way, there is also a Julia version: https://github.com/firefly-cpp/NiaARM.jl. There is an R version as well, but it is very immature: https://github.com/firefly-cpp/niarules