Simple DT does not match scikit-learn

After seeing a large difference between YDF CART and the scikit-learn decision tree, I wrote a simple test to reproduce it.

Note that on a very simple synthetic dataset with a single informative feature, the two trees match perfectly. But as the number of informative features increases, differences appear.

Code

import ydf
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, plot_tree
import matplotlib.pyplot as plt

n_features = 10

# Synthetic binary classification problem where every feature is informative.
X, y = make_classification(n_samples=2000, n_features=n_features, n_redundant=0,
                           n_informative=10, n_clusters_per_class=1, random_state=26)
plt.scatter(X[:, 0], X[:, 1], marker="o", c=y, s=25, edgecolor="k")

# YDF trains from a DataFrame, so name the columns X0..X9 and append the label.
columns = [f'X{index}' for index in range(n_features)]

df_train = pd.DataFrame(X, columns=columns)
df_train['label'] = y

# scikit-learn tree
sk_model = DecisionTreeClassifier(criterion='entropy', max_depth=2)
sk_model.fit(X, y)
plt.figure(figsize=(15, 7))
plot_tree(sk_model, label='root', impurity=True, rounded=True, filled=True,
          class_names=['Down', 'Up'], proportion=False);

# then observe the plot

# YDF CART tree trained on the same data
ydf_model = ydf.CartLearner(label="label", min_examples=1, max_depth=3, validation_ratio=0.0,
                            task=ydf.Task.CLASSIFICATION).train(df_train)

# then observe the tree structure

To get the same effective depth, max_depth=2 in scikit-learn corresponds to max_depth=3 in YDF. Setting validation_ratio=0.0 in YDF keeps it from holding out part of the data, so both learners train on the same dataset. scikit-learn is set to criterion='entropy' because YDF uses this metric internally.
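The depth convention can be checked on the scikit-learn side without relying on the plot. This is a minimal sketch that uses only scikit-learn: `get_depth()` counts edges from the root to the deepest leaf, so a tree trained with max_depth=2 has three levels of nodes, which is the number the issue passes to YDF. The names `clf`, `edges` and `levels` are introduced here for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Same synthetic dataset as in the reproduction script.
X, y = make_classification(n_samples=2000, n_features=10, n_redundant=0,
                           n_informative=10, n_clusters_per_class=1,
                           random_state=26)
columns = [f"X{i}" for i in range(X.shape[1])]

clf = DecisionTreeClassifier(criterion="entropy", max_depth=2,
                             random_state=0).fit(X, y)

edges = clf.get_depth()   # scikit-learn depth: edges on the longest root-to-leaf path
levels = edges + 1        # levels of nodes, counting the root as level 1
print(f"edges={edges}, levels={levels}, nodes={clf.tree_.node_count}")

# Text dump of the splits, easier to diff against another library than a plot.
print(export_text(clf, feature_names=columns, decimals=4))
```

A text dump like this can be compared line by line with the YDF tree structure. YDF models appear to offer a similar dump (for example `print_tree()`), but that is an assumption about the installed release and is not used in the sketch.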

Here is the Scikit-Learn tree plot

Here is the YDF tree plot (on Linux)

Here is the YDF tree plot (on MacOSX - ARM)

If we reduce the dataset complexity, the trees can become identical, but that is not always the case, except for very small toy datasets.

Why this difference? I did not find any way to make them match. Is there a difference in how scikit-learn and YDF compute entropy?
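One way to narrow the entropy question down is to compute the information gain of the root split by hand and compare it with what each library reports. The sketch below does this for scikit-learn only. It assumes binary labels, entropy in bits (log base 2), and thresholds taken as midpoints between consecutive distinct values. The helpers `entropy_bits` and `best_split` are hypothetical names written for this check and are not part of either library. The features are cast to float32 because scikit-learn converts its input to float32 internally.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier


def entropy_bits(pos, total):
    """Binary Shannon entropy in bits from positive counts and total counts."""
    p = pos / total
    with np.errstate(divide="ignore", invalid="ignore"):
        h = -(p * np.log2(p) + (1 - p) * np.log2(1 - p))
    return np.nan_to_num(h)  # pure nodes give 0 * log(0), treated as 0


def best_split(X, y):
    """Exhaustive search for the (gain, feature, threshold) with maximal gain."""
    n = len(y)
    n_pos = y.sum()
    root = entropy_bits(n_pos, n)
    best = (-1.0, None, None)
    for f in range(X.shape[1]):
        order = np.argsort(X[:, f], kind="stable")
        xs, ys = X[order, f], y[order]
        n_left = np.arange(1, n)           # left-child size for a cut after position i
        n_right = n - n_left
        pos_left = np.cumsum(ys)[:-1]      # positives falling in the left child
        pos_right = n_pos - pos_left
        children = (n_left * entropy_bits(pos_left, n_left)
                    + n_right * entropy_bits(pos_right, n_right)) / n
        gain = root - children
        gain[xs[1:] == xs[:-1]] = -np.inf  # no cut between equal values
        i = int(np.argmax(gain))
        if gain[i] > best[0]:
            threshold = (float(xs[i]) + float(xs[i + 1])) / 2.0
            best = (float(gain[i]), f, threshold)
    return best


X, y = make_classification(n_samples=2000, n_features=10, n_redundant=0,
                           n_informative=10, n_clusters_per_class=1,
                           random_state=26)
X32 = X.astype(np.float32)  # scikit-learn works on float32 features internally

gain, feature, threshold = best_split(X32, y)

# Depth-1 tree: only the root split, so the comparison stays simple.
stump = DecisionTreeClassifier(criterion="entropy", max_depth=1,
                               random_state=0).fit(X32, y)
t = stump.tree_
left, right = t.children_left[0], t.children_right[0]
sk_gain = t.impurity[0] - (
    t.weighted_n_node_samples[left] * t.impurity[left]
    + t.weighted_n_node_samples[right] * t.impurity[right]
) / t.weighted_n_node_samples[0]

print(f"by hand:      feature={feature} threshold={threshold:.6f} gain={gain:.6f}")
print(f"scikit-learn: feature={t.feature[0]} threshold={t.threshold[0]:.6f} gain={sk_gain:.6f}")
```

If the hand-computed split agrees with scikit-learn's root, the same numbers can be compared against the root condition shown in the YDF tree. A disagreement at the root would point at the split search itself (candidate thresholds, tie handling, numeric precision) rather than at the entropy formula.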
