Stratified sampling never works for `honesty=True`

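Describe the bug When fitting an `UpliftTreeClassifier` or `UpliftRandomForestClassifier` with `honesty=True`, stratified sampling always fails and the model falls back to random sampling. The cause appears to be an invalid call to `train_test_split` (at least on scikit-learn 1.3.2): `stratify` receives a list of two arrays, which scikit-learn treats as 2 samples rather than one label per row.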

To Reproduce

from causalml.inference.tree import UpliftTreeClassifier
import numpy as np

num_points = 1_000
X = np.random.randn(num_points, 10)
t = (np.random.rand(num_points) < .5).astype(int)
beta1 = np.random.randn(10)
beta2 = np.random.randn(10)
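# Potential outcomes under control (y1) and treatment (y2)
y1 = X @ beta1
y2 = X @ beta2
# Observed binary outcome depends on the assigned treatment
y = np.where(t == 0, y1, y2) > 0
model = UpliftTreeClassifier("0", evaluationFunction='CTS', honesty=True)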
model.fit(X, t.astype(str), y)

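---> Stratified sampling failed. Falling back to random sampling.

The same failure can be reproduced by calling `train_test_split` directly with a list of two arrays as `stratify`: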

from sklearn.model_selection import train_test_split
train_test_split(X, t.astype(int), y, stratify=[t.astype(int), y], shuffle=True)

This results in:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/ian.delbridge/.pyenv/versions/causalml-developer-py38/lib/python3.8/site-packages/sklearn/utils/_param_validation.py", line 214, in wrapper
    return func(*args, **kwargs)
  File "/Users/ian.delbridge/.pyenv/versions/causalml-developer-py38/lib/python3.8/site-packages/sklearn/model_selection/_split.py", line 2670, in train_test_split
    train, test = next(cv.split(X=arrays[0], y=stratify))
  File "/Users/ian.delbridge/.pyenv/versions/causalml-developer-py38/lib/python3.8/site-packages/sklearn/model_selection/_split.py", line 1745, in split
    X, y, groups = indexable(X, y, groups)
  File "/Users/ian.delbridge/.pyenv/versions/causalml-developer-py38/lib/python3.8/site-packages/sklearn/utils/validation.py", line 453, in indexable
    check_consistent_length(*result)
  File "/Users/ian.delbridge/.pyenv/versions/causalml-developer-py38/lib/python3.8/site-packages/sklearn/utils/validation.py", line 407, in check_consistent_length
    raise ValueError(
ValueError: Found input variables with inconsistent numbers of samples: [1000, 2]
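
The `[1000, 2]` in the message hints at the cause: a Python list holding two arrays has length 2, so scikit-learn seems to read `stratify` as 2 samples instead of 1000. The sketch below illustrates this on synthetic data; the variable names are illustrative and are not taken from the causalml source.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1_000
X = rng.standard_normal((n, 10))
t = rng.integers(0, 2, size=n)  # synthetic treatment flag
y = rng.integers(0, 2, size=n)  # synthetic binary outcome

# A list of two arrays has length 2, not n
stratify_list = [t, y]
n_stratify = len(stratify_list)
stacked_shape = np.asarray(stratify_list).shape  # rows are arrays, not samples

# Capture the error instead of letting it propagate
try:
    train_test_split(X, t, y, stratify=stratify_list, shuffle=True)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(n_stratify, stacked_shape)
print(error_message)
```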

Expected behavior Successful stratified sampling by treatment and outcome.
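
One possible way to get there, sketched under the assumption that treatment and outcome are both discrete: combine them into a single label per sample and pass that 1-D array as `stratify`. This is not the causalml implementation; `strata` and the other names are hypothetical.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1_000
X = rng.standard_normal((n, 10))
t = rng.integers(0, 2, size=n)  # synthetic treatment flag
y = rng.integers(0, 2, size=n)  # synthetic binary outcome

# One composite label per sample, e.g. "0_1" for control with positive outcome
strata = np.array([f"{ti}_{yi}" for ti, yi in zip(t, y)])

X_tr, X_te, t_tr, t_te, y_tr, y_te, s_tr, s_te = train_test_split(
    X, t, y, strata, stratify=strata, shuffle=True, random_state=0
)

# Compare each stratum's share in the training split with its overall share
for label in np.unique(strata):
    overall = np.mean(strata == label)
    train = np.mean(s_tr == label)
    print(label, round(overall, 3), round(train, 3))
```

Each treatment/outcome combination should then appear in both splits in roughly the same proportion as in the full data.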

Environment (please complete the following information):

OS: macOS
Python Version: 3.8.17
scikit-learn Version: 1.3.2 (from pip install ".[test]")

uber / causalml

Stratified sampling never works for `honesty=True` #755