fbdesignpro / sweetviz

Visualize and compare datasets, target values and associations, with one line of code.
MIT License

sweetviz shows wrong target rate for numerical variable #127

Closed: shreeprasadbhat closed 11 months ago

shreeprasadbhat commented 1 year ago

I am trying to plot the distribution of a variable together with the target rate for each of its values, but sweetviz shows the wrong target rate. Reproducible code is below.

import pandas as pd
import sweetviz as sv

# Four values of var1, 10 rows each; the target rates per value are 0.8, 0.6, 0.2 and 0.0
var1 = [0.]*10 + [1.]*10 + [2.]*10 + [3.]*10
target = [0]*2 + [1]*8 + [0]*4 + [1]*6 + [0]*8 + [1]*2 + [0]*10
df = pd.DataFrame({'var1': var1, 'target': target})

# Force var1 to be treated as numerical
fc = sv.FeatureConfig(force_num=['var1'])
report = sv.analyze([df, 'Train'], target_feat='target', feat_cfg=fc, pairwise_analysis='off')
report.show_html('report.html')
report.show_notebook(filepath='report.html')
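For reference, the target rate per value that the report would be expected to show can be checked directly with pandas. This is only a minimal sketch using the same data as the reproduction; the name `expected_rate` is introduced here for illustration, and it says nothing about how sweetviz computes its own figures.

```python
import pandas as pd

# Same data as in the reproduction: four values of var1, 10 rows each.
var1 = [0.0] * 10 + [1.0] * 10 + [2.0] * 10 + [3.0] * 10
target = [0] * 2 + [1] * 8 + [0] * 4 + [1] * 6 + [0] * 8 + [1] * 2 + [0] * 10
df = pd.DataFrame({"var1": var1, "target": target})

# Mean of a 0/1 target within each var1 value is the target rate for that value.
expected_rate = df.groupby("var1")["target"].mean()
print(expected_rate)
# var1 = 0.0 -> 0.8, 1.0 -> 0.6, 2.0 -> 0.2, 3.0 -> 0.0
```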

I know that if var1 is forced to categorical, the output is correct. But that is not useful for me, because sweetviz sorts the charts of categorical variables by category size, not by axis label.
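The difference between the two orderings can be illustrated outside sweetviz with plain pandas. The data below is hypothetical (it is not from the reproduction, where every category has the same size), and the names `summary`, `by_size` and `by_label` are made up for this sketch.

```python
import pandas as pd

# Hypothetical data with unequal category sizes (not from the issue).
df = pd.DataFrame({
    "var1":   [3, 3, 3, 3, 1, 1, 1, 0, 0, 2],
    "target": [0, 0, 0, 1, 1, 1, 0, 1, 1, 0],
})

# Row count and target rate for each value of var1.
summary = df.groupby("var1")["target"].agg(count="size", target_rate="mean")

# Ordering by category size (largest first) versus ordering by axis label.
by_size = summary.sort_values("count", ascending=False)
by_label = summary.sort_index()

print(list(by_size.index))   # [3, 1, 0, 2]
print(list(by_label.index))  # [0, 1, 2, 3]
```

With the size ordering, neighbouring bars no longer follow the numeric order of var1, which is why a trend in the target rate is harder to read from it.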


How can I make this work while keeping the variable numerical?

fbdesignpro commented 11 months ago

Fixed by 2ec0848b2ea29d1de179ce6206e99308feb46fa9!