In your example, you apply the IV filter and WOE binning to the whole dataset before splitting it into train and test sets. If I understand correctly, this causes data leakage, because the training set created after the split will contain WOE values computed from the whole dataset.
You are right. With a real dataset, you can split at the very beginning of model training, before any filtering or binning. In this example, the binning result might be unstable if it were based only on such a small training set.
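For anyone who wants to see the split-first order spelled out, here is a minimal sketch in plain Python. It does not use the library's API: `fit_woe` and `apply_woe` are hypothetical helpers, the toy data is random, and the smoothing constant and the ln(bad share / good share) sign convention are assumptions made only for illustration.

```python
import math
import random
from collections import Counter


def fit_woe(values, labels, smoothing=0.5):
    """Learn one WOE value per category from the rows passed in.

    Hypothetical helper. WOE here is ln(bad share / good share);
    the smoothing term avoids log(0) for sparse categories.
    """
    good = Counter(v for v, y in zip(values, labels) if y == 0)
    bad = Counter(v for v, y in zip(values, labels) if y == 1)
    cats = set(values)
    total_good = sum(good.values()) + smoothing * len(cats)
    total_bad = sum(bad.values()) + smoothing * len(cats)
    return {
        c: math.log(((bad[c] + smoothing) / total_bad)
                    / ((good[c] + smoothing) / total_good))
        for c in cats
    }


def apply_woe(values, woe_map, default=0.0):
    """Map categories to WOE; categories unseen during fitting get `default`."""
    return [woe_map.get(v, default) for v in values]


# Toy data: (category, label) pairs, label 1 = bad, 0 = good.
random.seed(0)
rows = [(random.choice("ABC"), random.randint(0, 1)) for _ in range(200)]

# 1. Split first, before any statistic is computed.
random.shuffle(rows)
cut = int(0.7 * len(rows))
train, test = rows[:cut], rows[cut:]

# 2. Fit the WOE mapping on the training rows only.
woe_map = fit_woe([v for v, _ in train], [y for _, y in train])

# 3. Apply the same mapping to both sets; test labels are never used.
train_woe = apply_woe([v for v, _ in train], woe_map)
test_woe = apply_woe([v for v, _ in test], woe_map)
```

The point is only the order of operations: the test rows never influence `woe_map`, so nothing learned from them can leak into training. The same reasoning applies to an IV-based variable filter, which would also be computed from the training rows alone.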