parrt / dtreeviz

A python library for decision tree visualization and model interpretation.
MIT License
2.96k stars, 333 forks

change bins in classifier nodes #326

Open masterfelu opened 2 months ago

masterfelu commented 2 months ago

Hello.

Thanks a lot for making this public! It is a wonderful repo. I was wondering if it is possible to change the number of bins for a classifier node when using viz_model.view(). It would look nice with histtype='step' for large datasets.

Thanks a lot again!

tlapusan commented 2 months ago

Hi @masterfelu, thanks for the nice words :)

Could you give us some screenshots of your situation? I'm on vacation right now, but from what I remember, the number of bins is generated dynamically. I'll have to look in the source code.

masterfelu commented 2 months ago

Sure! Thanks for taking a look at it during vacation.

image

We can see that some nodes might look better with more bins, for example the last node in the last row. Something in the range of 50 to 100 bins would look fine in my case, since there are more than 100k data points. It is just a matter of convenience, since I might be inspecting multiple trees by eyeballing them.

I took a look at your source, and the bin count seems to be controlled by the hard-coded NUM_BINS. It looks dynamic, but the limit is 20.

What I imagined a node would look like is the one below (ignore the text in the plot).

image

It uses histtype='step' with 50 bins.
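
For reference, a plot like that can be sketched with plain matplotlib. This is not dtreeviz code and does not touch viz_model.view(); the helper name plot_step_hist, the class labels, and the random data are all made up for illustration. It only shows the histtype='step' plus 50 bins combination on a dataset of roughly the size mentioned above.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def plot_step_hist(values_by_class, bins=50):
    """Hypothetical helper: overlay one step-outline histogram per class."""
    all_values = np.concatenate(list(values_by_class.values()))
    # Shared bin edges so the outlines of the classes are directly comparable.
    edges = np.histogram_bin_edges(all_values, bins=bins)
    fig, ax = plt.subplots(figsize=(4, 2))
    counts = {}
    for label, values in values_by_class.items():
        # histtype="step" draws an unfilled outline, which stays readable
        # when several classes overlap and the sample is large.
        n, _, _ = ax.hist(values, bins=edges, histtype="step", label=label)
        counts[label] = n
    ax.legend()
    return fig, ax, counts


# Synthetic stand-in for one feature at one node (assumed data, >100k points).
rng = np.random.default_rng(0)
data = {
    "class 0": rng.normal(40, 10, 60_000),
    "class 1": rng.normal(60, 12, 60_000),
}
fig, ax, counts = plot_step_hist(data, bins=50)
```

Calling fig.savefig(...) on the returned figure writes it to disk. The bin edges are computed once over all classes, so every outline shares the same x positions.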

This is mostly my fault since I am not limiting my feature ranges properly. It is nice to know it can be improved by just using your tool!

Thanks for your time again!