mljar / mljar-supervised

Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation
https://mljar.com
MIT License

Explain mode with metric_type=accuracy: results seem abnormal #550

Open Tonywhitemin opened 2 years ago

Tonywhitemin commented 2 years ago

Hi @pplonski, I am using mljar AutoML on a medical dataset (task: binary_classification), with mode=Explain and metric_type=accuracy. The results seem abnormal, as shown in the figure below. [image] All the metric_value entries are identical and do not match the real values listed in each algorithm folder (example below). [image]

I also found that the metric_value of 0.825506 seems to be taken from the file "learner_fold_0_training.log". [image] Could you help? Thanks!
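A quick way to confirm the symptom described above is to load the leaderboard file and check whether every model carries the same metric_value. The sketch below is only an illustration: the column names (`name`, `model_type`, `metric_type`, `metric_value`) and the model names are assumptions about the leaderboard layout, and the rows are made up except for the value 0.825506 quoted in this report.

```python
import csv
import io


def summarize_leaderboard(csv_text):
    """Return ({model name: metric_value}, True if all values are identical)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    scores = {row["name"]: float(row["metric_value"]) for row in rows}
    # Identical scores across several different models are the suspicious pattern.
    all_identical = len(scores) > 1 and len(set(scores.values())) == 1
    return scores, all_identical


# Hypothetical leaderboard contents; only 0.825506 comes from the report.
example = """name,model_type,metric_type,metric_value
1_Baseline,Baseline,accuracy,0.825506
2_DecisionTree,Decision Tree,accuracy,0.825506
3_Default_Xgboost,Xgboost,accuracy,0.825506
"""

scores, all_identical = summarize_leaderboard(example)
print(scores)
print("all models share one metric_value:", all_identical)
```

To run it on a real result directory, pass the text of the leaderboard CSV file instead of `example`, then compare each entry with the accuracy listed in that model's own folder.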

pplonski commented 2 years ago

@Tonywhitemin thanks for reporting. Could you provide data and code to reproduce? Is your dataset small?

Tonywhitemin commented 2 years ago

Hi @pplonski, the code and dataset are in the zip file below for your reference. Thanks for your help! BTW, the dataset is quite small. double_check.zip
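The contents of the attached archive are not reproduced in this thread. As a rough sketch of the setup described in the report (Explain mode, accuracy as the metric, binary classification), something like the following could be used; the output directory name and the `run_automl` helper are hypothetical, and `X`, `y` stand for whatever features and labels the dataset provides.

```python
# Settings taken from the report: Explain mode, accuracy metric, binary classification.
AUTOML_KWARGS = {
    "mode": "Explain",
    "eval_metric": "accuracy",
    "ml_task": "binary_classification",
    "results_path": "AutoML_issue_550",  # hypothetical output directory
}


def run_automl(X, y):
    """Fit AutoML with the reported settings and return the fitted object."""
    # Imported lazily so the settings can be inspected without the package installed.
    from supervised.automl import AutoML

    automl = AutoML(**AUTOML_KWARGS)
    automl.fit(X, y)
    return automl
```

After fitting, the leaderboard and the per-model folders are written under `results_path`, which is where the values being compared in this issue come from.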

Tonywhitemin commented 2 years ago

Hi @pplonski, were you able to reproduce the result? Please let me know if you find anything, thank you!

pplonski commented 2 years ago

@Tonywhitemin sorry, I don't have time ... Maybe you can try to debug the problem?

Tonywhitemin commented 2 years ago

Hi @pplonski, it's OK, just follow your own plan! I don't think my skills are good enough to solve this problem...

JeremyKeusters commented 1 year ago

Just wanted to bump this issue as I'm having exactly the same problem. I tried both version 0.11.3 and version 0.10.6. I have to say that the differences are not as big as in the example above, but there is a difference, and it only appears when using eval_metric='accuracy'.

E.g. the leaderboard CSV file:

Screenshot 2022-11-18 at 17 56 35

Neural Network README:

## Metric details
|           |    score |     threshold |
|:----------|---------:|--------------:|
| logloss   | 0.393685 | nan           |
| auc       | 0.885321 | nan           |
| f1        | 0.797203 |   0.346923    |
| accuracy  | 0.853933 |   0.573101    |
| precision | 1        |   0.992364    |
| recall    | 1        |   0.000517517 |
| mcc       | 0.690734 |   0.573101    |
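Note that the table reports accuracy at a tuned threshold (0.573101) rather than at 0.5. Whether that accounts for the mismatch with the leaderboard is not established in this thread, but it is one thing worth checking when debugging. The sketch below uses made-up labels and probabilities purely to show how much accuracy can move with the threshold; only the value 0.573101 is taken from the table.

```python
def accuracy_at(y_true, y_prob, threshold):
    """Accuracy when probabilities >= threshold are labelled as class 1."""
    preds = [1 if p >= threshold else 0 for p in y_prob]
    correct = sum(int(p == t) for p, t in zip(preds, y_true))
    return correct / len(y_true)


# Hypothetical labels and predicted probabilities, for illustration only.
y_true = [0, 0, 0, 1, 1, 1, 0, 1]
y_prob = [0.10, 0.40, 0.55, 0.60, 0.80, 0.58, 0.30, 0.90]

acc_default = accuracy_at(y_true, y_prob, 0.5)
acc_tuned = accuracy_at(y_true, y_prob, 0.573101)  # threshold from the table
print("accuracy at 0.5:     ", acc_default)
print("accuracy at 0.573101:", acc_tuned)
```

Recomputing accuracy from a model's saved predictions at both thresholds would show whether either number matches the leaderboard value.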

I have created a very minimal notebook demonstrating the issue: mljar-supervised-bug-550.zip

pplonski commented 1 year ago

@JeremyKeusters thanks for reporting and providing the code+data to reproduce the issue!

Would you like to look into the code and fix the bug?