ShichenXie / scorecard

Scorecard Development in R (评分卡, "scorecard")
http://shichen.name/scorecard

Monotonic-WOE-Binning-Algorithm #7

Open Leo-Lee15 opened 6 years ago

Leo-Lee15 commented 6 years ago

Hello,

I just discovered a GitHub repo, jstephenj14/Monotonic-WOE-Binning-Algorithm, which provides a Python implementation of a variable binning algorithm that optimizes information value (IV) monotonicity and representativeness.

I think it would be great to include this algorithm in your fantastic package, scorecard. Since the author provides a Python version, I wonder whether it could be incorporated into your scorecard R package.

Thanks!

ShichenXie commented 6 years ago

Thank you for your suggestion. I will read the repo and the referenced article. If the approach is reasonable, I will add it to the package. This might take some time.

In my experience, some variables will not be monotonic after WOE binning. For example, the default rate by hour of day always peaks at midnight and in the afternoon.
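
To make this concrete, here is a minimal sketch in plain Python (it does not use the scorecard or scorecardpy API). The counts are made up for illustration, the helper names woe_by_bin and is_monotonic are hypothetical, and WOE is taken here as ln(share of bads / share of goods); some sources use the opposite sign.

```python
import math


def woe_by_bin(bins):
    """Return the WOE of each bin, given (good_count, bad_count) pairs in bin order.

    Assumes every bin has at least one good and one bad observation.
    """
    total_good = sum(good for good, _ in bins)
    total_bad = sum(bad for _, bad in bins)
    return [math.log((bad / total_bad) / (good / total_good)) for good, bad in bins]


def is_monotonic(values):
    """True if the sequence never decreases or never increases."""
    pairs = list(zip(values, values[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)


# Made-up (good, bad) counts per bin, for illustration only.
by_hour = [(60, 40), (90, 10), (70, 30), (95, 5)]    # night, morning, afternoon, evening
by_amount = [(95, 5), (90, 10), (80, 20), (60, 40)]  # increasing loan amount

print(is_monotonic(woe_by_bin(by_hour)))    # False: two separate peaks
print(is_monotonic(woe_by_bin(by_amount)))  # True: risk rises with amount
```

A variable like the hour of day has a genuinely non-monotonic risk pattern, so forcing monotonic bins on it would hide the two peaks rather than reveal structure.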

Leo-Lee15 commented 6 years ago

Yes, for some variables it is too difficult to get a monotonic result. But at least this algorithm provides a less troublesome way to achieve the desired result.

Anyway, thanks for your effort on this nice package!
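
As a rough idea of what forcing monotonicity can look like, the sketch below merges adjacent bins whenever their bad rates break the requested order (a pool-adjacent-violators style pass). This is an assumption-laden illustration and not the algorithm from the jstephenj14 repo; the function names bad_rate and merge_to_monotonic and the counts are made up.

```python
def bad_rate(counts):
    """Share of bad observations in a (good_count, bad_count) bin."""
    good, bad = counts
    return bad / (good + bad)


def merge_to_monotonic(bins, increasing=True):
    """Merge adjacent (good, bad) bins until the bad rate is monotonic.

    Bins are processed in order; whenever the newest bin breaks the
    requested direction, it is pooled with its left neighbour.
    """
    merged = []
    for counts in bins:
        merged.append(counts)
        while len(merged) > 1:
            prev_rate, last_rate = bad_rate(merged[-2]), bad_rate(merged[-1])
            in_order = prev_rate <= last_rate if increasing else prev_rate >= last_rate
            if in_order:
                break
            # Pool the last two bins into one and re-check against the bin before.
            (g1, b1), (g2, b2) = merged[-2], merged[-1]
            merged[-2:] = [(g1 + g2, b1 + b2)]
    return merged


# Made-up counts: bad rates are 0.10, 0.20, 0.15, 0.40, so the third bin breaks the order.
bins = [(90, 10), (80, 20), (85, 15), (60, 40)]
print(merge_to_monotonic(bins))  # [(90, 10), (165, 35), (60, 40)]
```

The price of monotonicity is coarser bins: here the second and third bins are pooled, so any real difference between them is lost.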

monicamn commented 5 years ago

I used 'woebin' to bin the variables in my data and got the error 'You are trying to merge on object and float64 columns. If you wish to proceed you should use pd.concat'. I compared my variable types with yours, and your data also contains 'object' columns. Why does your data run without errors while mine fails? Do you have any suggestions?

ShichenXie commented 5 years ago

Are you using the Python version of the package? If so, please open an issue in the scorecardpy repo and provide a reproducible example.

monicamn commented 5 years ago

Thank you for your answer. I use Python 3.7 to run the code. The error is as follows:

Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2961, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 4, in
    positive="bad|1", no_cores=None, print_step=1, method="tree")
  File "C:\ProgramData\Anaconda3\lib\site-packages\scorecardpy-0.1.7.1-py3.7.egg\scorecardpy\woebin.py", line 877, in woebin
    bins = dict(zip(xs, pool.starmap(woebin2, args)))
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\pool.py", line 276, in starmap
    return self._map_async(func, iterable, starmapstar, chunksize).get()
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\pool.py", line 657, in get
    raise self._value
ValueError: You are trying to merge on object and float64 columns. If you wish to proceed you should use pd.concat

ShichenXie commented 5 years ago

Please open a new issue in the scorecardpy project and provide a reproducible example; otherwise I have no way of knowing what problem you ran into.

6yuan789 commented 5 years ago

Hi ShichenXie, in woebin.py the variable binning_tree is not initialized, which sometimes causes an error. Adding "binning_tree = None" solves the problem.
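
In Python, this kind of failure would typically surface as an UnboundLocalError: a local variable is assigned only inside some branches and then read afterwards. The sketch below is a hypothetical illustration of that pattern and of the suggested fix. It is not the actual code in woebin.py; the function names and return values are invented.

```python
def pick_binning(method):
    """Hypothetical example: 'binning' is assigned only in some branches."""
    if method == "tree":
        binning = "tree result"
    elif method == "chimerge":
        binning = "chimerge result"
    # If neither branch ran, 'binning' was never assigned and this line raises.
    return binning


def pick_binning_fixed(method):
    """Same logic, with the variable initialized up front."""
    binning = None
    if method == "tree":
        binning = "tree result"
    elif method == "chimerge":
        binning = "chimerge result"
    return binning


try:
    pick_binning("width")
except UnboundLocalError as err:
    print("unfixed version failed:", err)

print(pick_binning_fixed("width"))  # None
```

Initializing to None removes the crash, but the caller then has to handle a None result, so it is worth checking whether the untaken branch points to a deeper logic problem.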


ShichenXie commented 5 years ago

I'll take a look at this problem.

wgx711 commented 5 years ago

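Dear ShichenXie, when I run the woebin function I get an error saying that there is no function "data.table". I still get the same message after running library(data.table), and I don't know why. See the screenshot below: [image]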

ShichenXie commented 5 years ago

Restart R and try again. If you are on Windows, check whether Rtools is installed.

wgx711 commented 5 years ago

May I ask what Rtools is? I am indeed on 64-bit Windows 10, with 64-bit R and 64-bit RStudio installed; R is version 3.5.2. Following the examples in the woebin help file, bins2_tree = woebin(germancredit, y="creditability", x=c("credit.amount","housing"), method="tree") runs correctly, but bins_germ = woebin(germancredit, y = "creditability") reports that there is no function "data.table"...

ShichenXie commented 5 years ago

Look at the "Download R for Windows" page on the CRAN website; Rtools is the fourth item there.

wgx711 commented 5 years ago

Thanks, I'll look into it. But the current situation is that everything from bins2_tree = woebin(germancredit, y="creditability", x=c("credit.amount","housing"), method="tree") through bins_width = woebin(germancredit, y="creditability", x=numeric_cols, method="width") runs fine; only bins_germ = woebin(germancredit, y = "creditability") raises that error...

Do you suggest that I first uninstall the scorecard package and then install the latest version from GitHub?

wgx711 commented 5 years ago

On another computer, where R is version 3.4.2, all the functions run fine... I don't know what the cause is.

ShichenXie commented 5 years ago

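  1. If you have never installed Rtools, that is the cause; installing it will fix the problem.
  2. If R was upgraded from a version before 3.5 to the current 3.5.2, you need to reinstall all packages.

If that still doesn't solve it, I'm out of ideas.

ShichenXie commented 5 years ago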

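To everyone arriving later: please stop asking questions in this issue. If you have a problem, open a new issue. This issue is still open only because the original request has not been resolved yet.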

ddzr commented 5 years ago

This GitHub repo by Wensui Liu also has some MonotonicBinning implementations in R.

longhua8800w commented 4 years ago

I also hope the scorecard package will add monotonic binning as a binning option.

shlid007 commented 1 year ago

If WOEBIN doesn't return monotonic bins, does that compromise the interpretability of the WOE/IV values? Is it up to the user to rebin?

Blanket58 commented 7 months ago

When will monotonic binning be added?

ShichenXie commented 7 months ago

> If WOEBIN doesn't return monotonic bins, does that compromise the interpretability of the WOE/IV values? Is it up to the user to rebin?

The woebin_adj function provides an interface to adjust the binning results manually.

ShichenXie commented 7 months ago

> When will monotonic binning be added?

I'll look into it again when I get a chance.