GeoscienceAustralia / uncover-ml

Machine Learning system for Geoscience Australia uncover project
Apache License 2.0

Random forest sample weights #95

Closed brenmous closed 4 years ago

brenmous commented 4 years ago

scikit-learn includes a sample_weight parameter in the fit method of its decision-tree-based models. It would be good to incorporate this into the multi random forest algorithm.

I've attempted implementing this on https://github.com/GeoscienceAustralia/uncover-ml/tree/bren-rf-weighting.

The idea is to provide a 'weight' attribute with the target shapefile. If weighted is set to true in the algorithm args of the learning block in the config, the values of this attribute are passed to fit as the sample_weight parameter (NDVs, i.e. no-data values, are set to 1.0 automatically).
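
As a rough illustration of that idea outside of uncover-ml, here is a minimal sketch using plain scikit-learn. The helper name prepare_sample_weights, the use of NaN to represent no-data values, and the synthetic data are all assumptions made for the example; this is not the code on the branch.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def prepare_sample_weights(raw_weights, fill_value=1.0):
    """Hypothetical helper: replace missing (NaN) weights with fill_value.

    Assumes no-data values have already been converted to NaN.
    """
    weights = np.asarray(raw_weights, dtype=float)
    return np.where(np.isnan(weights), fill_value, weights)


# Synthetic stand-in for covariates and targets.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X[:, 0] + rng.normal(scale=0.1, size=200)

# Stand-in for the 'weight' attribute: the first 50 samples carry a
# weight of 10.0, the rest are missing and default to 1.0.
raw_weights = np.full(200, np.nan)
raw_weights[:50] = 10.0
sample_weight = prepare_sample_weights(raw_weights)

# Fit one forest with weights and one without, using the same seed.
weighted = RandomForestRegressor(n_estimators=20, random_state=0)
weighted.fit(X, y, sample_weight=sample_weight)

unweighted = RandomForestRegressor(n_estimators=20, random_state=0)
unweighted.fit(X, y)

# Compare predictions to see whether the weights had any effect.
max_diff = np.max(np.abs(weighted.predict(X) - unweighted.predict(X)))
print(f"largest prediction difference: {max_diff:.4f}")
```

Comparing against an unweighted model fitted with the same random_state gives a quick way to check whether the weights actually reach fit.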

However, I haven't been able to get the weights to have any impact on the predicted values: they remain the same as those of an unweighted model.