equinor / graphite-maps

Graph informed triangular ensemble-to-posterior maps
GNU General Public License v3.0

Scale data before precision estimation #39

Open Blunde1 opened 8 months ago

Blunde1 commented 8 months ago

There can be a large difference between

Prec_u_sub = fit_precision_cholesky(X, graph_u_sub, verbose_level=5)

and

from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()

# Standardize each column of X to zero mean and unit variance
X_scaled = scaler.fit_transform(X)
Prec_u_sub = fit_precision_cholesky(X_scaled, graph_u_sub, verbose_level=5)

This shows up in practice, e.g. on TOP_VOLANTIS on Drogon, so it is a real issue. The two calls differ in the numerical stability of the optimization and, as a consequence, show large differences in run time.

Remedy: scale the data with StandardScaler before fitting, then rescale either the data or the estimated precision appropriately. Likely:
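To illustrate the "rescale the precision" option, here is a minimal sketch and not necessarily the fix intended in this issue. It relies on the identity that if Z = (X - mean) / scale column-wise and D = diag(scale), then Prec_X = D^{-1} Prec_Z D^{-1}. Everything here is an assumption for illustration: `standardize` and `rescale_precision` are hypothetical helpers, the data is synthetic, and a dense inverse of the empirical covariance stands in for `fit_precision_cholesky`, whose return type (possibly sparse) is not shown here.

```python
import numpy as np


def standardize(X):
    """Center and scale columns; return the scaled data and per-column std."""
    mean = X.mean(axis=0)
    scale = X.std(axis=0)  # population std (ddof=0), as StandardScaler uses
    return (X - mean) / scale, scale


def rescale_precision(prec_scaled, scale):
    """Map a precision matrix for standardized data back to original units.

    With D = diag(scale): Prec_X = D^{-1} Prec_Z D^{-1},
    i.e. entry (i, j) is divided by scale[i] * scale[j].
    """
    return prec_scaled / np.outer(scale, scale)


rng = np.random.default_rng(0)

# Synthetic correlated data with different column scales (hypothetical).
mixing = np.array([[1.0, 0.5, 0.0],
                   [0.0, 1.0, 0.5],
                   [0.0, 0.0, 1.0]])
X = (rng.normal(size=(500, 3)) @ mixing) * np.array([1.0, 10.0, 0.1])

X_scaled, scale = standardize(X)

# Stand-in estimator: dense inverse of the empirical covariance.
prec_scaled = np.linalg.inv(np.cov(X_scaled, rowvar=False))

# Precision in original units, recovered from the scaled fit.
prec_original = rescale_precision(prec_scaled, scale)

# Reference: the same stand-in estimator applied to the unscaled data.
prec_direct = np.linalg.inv(np.cov(X, rowvar=False))
print(np.allclose(prec_original, prec_direct))
```

The rescaling only multiplies each entry by a nonzero factor, so the sparsity pattern given by the graph is unchanged. For a sparse precision the same operation could be written as a product with two sparse diagonal matrices instead of `np.outer`.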