I am using the H2O package in Python to apply deep learning to a simple, balanced dataset with two classes, "0" and "1". It is split into two files: 80% of the data for training and 20% for validation.
According to the confusion matrix, the results are always biased toward class "1", regardless of the tuning parameters.
I also notice that the probability columns P0 and P1 are not biased toward either class if I apply a threshold of 0.5 myself, but the first column of predict uses the max-F1 threshold.
How can I improve the confusion matrix results?
Is there a way to make predict apply a different threshold, and hence change all the related metrics?
Thanks in advance.
My code snippet is as follows:

from h2o.estimators.deeplearning import H2ODeepLearningEstimator

model = H2ODeepLearningEstimator(
    activation="Tanh",
    hidden=[50, 50, 50],
    distribution="multinomial",
    score_interval=10,
    epochs=1000,
    input_dropout_ratio=0.2,
    adaptive_rate=True,
    rho=0.98,
    epsilon=1e-7,
    seed=1,
    reproducible=True,
)
model.train(
    x=x,
    y=y,
    training_frame=train,
    validation_frame=test,
)
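To illustrate what I mean by applying the 0.5 threshold to P0/P1 myself, here is a minimal NumPy sketch; p1 stands in for the class-1 probability column that model.predict(test) returns:

import numpy as np

def labels_from_p1(p1, threshold=0.5):
    """Turn class-1 probabilities into 0/1 labels at a chosen cutoff."""
    return (np.asarray(p1) >= threshold).astype(int)

# toy probabilities standing in for the p1 column of model.predict(test)
p1 = [0.20, 0.55, 0.49, 0.90]
print(labels_from_p1(p1, threshold=0.5))  # -> [0 1 0 1]

With a cutoff of 0.5 the labels come out balanced, whereas the predict column (built from the max-F1 threshold) leans toward class "1".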