h2oai / h2o-tutorials

Tutorials and training material for the H2O Machine Learning Platform
http://h2o.ai

Unable to use MapeMetric from CustomMetricFuncRegression #164

Open patriquintanilla opened 1 year ago

patriquintanilla commented 1 year ago

Hi!

I am trying to use MAPE as the custom_metric_func when training a random forest model, following the code in https://github.com/h2oai/h2o-tutorials/blob/master/tutorials/custom_metric_func/CustomMetricFuncRegression.ipynb. Training fails with the following error:

"OSError: Job with key $03010a8e002232d4ffffffff$_867caadde7f5a2c1a2f64a9a2d24453e failed with an exception: ImportError: No module named mape"

I am working in a Zeppelin notebook on a Dataproc cluster in Google Cloud, and this is exactly the code I am running:

%pyspark
hc = pysparkling.H2OContext.getOrCreate()

# Define MAPE as a class to be used as h2o metric
class MapeMetric:
    def map(self, predicted, actual, weight, offset, model):
        return [weight * abs((actual[0] - predicted[0]) / actual[0]), weight]

    def reduce(self, left, right):
        return [left[0] + right[0], left[1] + right[1]]

    def metric(self, last):
        return last[0] / last[1]

# Upload the new metric to the h2o cluster
mape_func = h2o.upload_custom_metric(MapeMetric, func_name="mape", func_file="mape.py")

# Build and train the model:
rf_mape = H2ORandomForestEstimator(ntrees=100, custom_metric_func=mape_func)

rf_mape.train(
    x=predictors,
    y=target_value,
    training_frame=h2o_train_df,
)
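
For reference, the metric class itself can be checked locally without a cluster. This is only a minimal sketch: it assumes map is called once per row, the per-row results are folded pairwise with reduce, and the final state is passed to metric. The evaluate_locally helper and the sample rows are hypothetical and not part of the H2O API.

```python
from functools import reduce


class MapeMetric:
    # Same class as in the code above.
    def map(self, predicted, actual, weight, offset, model):
        return [weight * abs((actual[0] - predicted[0]) / actual[0]), weight]

    def reduce(self, left, right):
        return [left[0] + right[0], left[1] + right[1]]

    def metric(self, last):
        return last[0] / last[1]


def evaluate_locally(metric, rows):
    """Hypothetical helper: emulate map -> reduce -> metric in plain Python.

    rows is an iterable of (predicted, actual, weight) tuples. Offset and
    model are not used by MapeMetric, so placeholders are passed.
    """
    states = [metric.map([p], [a], w, 0.0, None) for p, a, w in rows]
    return metric.metric(reduce(metric.reduce, states))


# Made-up sample rows: each prediction is off by 10% of the actual value.
rows = [(90.0, 100.0, 1.0), (220.0, 200.0, 1.0), (45.0, 50.0, 2.0)]
mape = evaluate_locally(MapeMetric(), rows)
print(mape)
```

With these rows the weighted MAPE comes out at 0.1, so the class logic looks fine on its own. In plain Python, a row whose actual value is 0 raises ZeroDivisionError in map.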

Thanks in advance for your help!

patriquintanilla commented 1 year ago

Update: setting custom_metric_func="mape" instead of custom_metric_func=mape_func changed the error to:

"Job with key $03010a8e002232d4ffffffff$_baede341b95a2949d86f712d779b4d29 failed with an exception: java.lang.ArrayIndexOutOfBoundsException: 1"