Several of the example machine learning projects mentioned by our contact at GSFC use normalization steps before learning. Here are some of the use cases we should support:
- z-score: take the mean and std dev from the config, or from a function named in the config that returns the right mean/std dev for a given pixel (see the z-score sketch after this list)
- log: natural log (base e)
- log10: log base 10
- anomaly_probs (see the sketch after this list):
  - calculate the time-mean for each space grid cell
  - calculate the deviation from the time-mean for each space grid cell
  - bin / round the deviation into increments of X
  - calculate the log10 probability as log10(np.bincount(i) / total_num_obs), where i is the binned anomaly and total_num_obs is the number of observations for a given space grid cell
- further notes on anomaly_probs:
  - allow anomaly_probs as described above, but within time windows, such as calendar months
  - instead of finding the anomaly relative to the time-mean for each space grid cell, compare the pixel to some other zonal statistic, such as the mean of a polygon in which the pixel falls
- other scaling methods from sklearn, such as StandardScaler, MinMaxScaler, etc. (see the scaler sketch after this list)
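
A minimal sketch of the configurable z-score case, assuming the config supplies either fixed mean/std values (scalars or per-pixel arrays) or a callable that computes them; the `zscore` name and its signature are illustrative, not an existing API.

```python
import numpy as np

def zscore(data, mean=None, std=None, stats_fn=None):
    """Z-score normalize data of shape (time, y, x).

    mean/std may come straight from the config (scalars or per-pixel
    arrays); stats_fn is a config-named callable returning (mean, std)
    for the grid. All names here are illustrative.
    """
    if stats_fn is not None:
        mean, std = stats_fn(data)
    if mean is None:
        mean = np.nanmean(data, axis=0)   # fall back to per-pixel stats
    if std is None:
        std = np.nanstd(data, axis=0)
    return (data - mean) / std
```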
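
A sketch of the anomaly_probs steps, assuming a NumPy array shaped (time, y, x) and a fixed bin increment; the `anomaly_probs` function and its arguments are illustrative. The monthly-window and polygon-based variants noted above would only change how the reference mean in step 1 is computed.

```python
import numpy as np

def anomaly_probs(data, increment):
    """Log10 probability of binned anomalies, per space grid cell.

    data: array of shape (time, y, x); increment: bin width X.
    """
    # 1. time-mean for each space grid cell
    time_mean = np.nanmean(data, axis=0)

    # 2. deviation from the time-mean
    deviation = data - time_mean

    # 3. bin / round the deviation into increments of `increment`
    binned = np.round(deviation / increment).astype(int)

    # 4. per-cell log10 probability: log10(np.bincount(i) / total_num_obs)
    out = np.empty_like(data, dtype=float)
    total_num_obs = data.shape[0]
    for iy in range(data.shape[1]):
        for ix in range(data.shape[2]):
            i = binned[:, iy, ix]
            i = i - i.min()               # bincount needs non-negative bins
            counts = np.bincount(i)
            out[:, iy, ix] = np.log10(counts[i] / total_num_obs)
    return out
```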
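
For the sklearn scalers, the existing classes can be used directly once the cube is reshaped to (samples, features); the reshape convention below is an assumption, not something specified above.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler, MinMaxScaler

# Hypothetical example: flatten a (time, y, x) cube so each pixel is a
# feature, fit the scaler, then restore the original shape.
data = np.random.rand(100, 4, 5)          # (time, y, x) toy data
flat = data.reshape(data.shape[0], -1)    # (samples, features)

scaler = StandardScaler()                 # or MinMaxScaler()
scaled = scaler.fit_transform(flat).reshape(data.shape)
```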