tensorflow / probability

Probabilistic reasoning and statistical analysis in TensorFlow
https://www.tensorflow.org/probability/
Apache License 2.0

Poor fit with Linear Regression in tfp.sts #1023

Open amol447 opened 4 years ago

amol447 commented 4 years ago

Hi, I am trying to fit a time series model with an exogenous variable. The linear regression part doesn't seem to find the weights correctly. Here is a simple example to reproduce the problem:

import numpy as np
import tensorflow_probability as tfp
import tensorflow as tf
from sklearn.preprocessing import StandardScaler

# True weights and a random design matrix; the target is a noiseless
# linear combination of the columns.
weights = np.random.randn(5) + np.arange(1, 6)
design_matrix = np.random.normal(4, 2, (200, 5))
tseries = np.matmul(design_matrix, weights)

# STS model with a single linear-regression component.
model = tfp.sts.Sum([tfp.sts.LinearRegression(design_matrix=design_matrix)],
                    observed_time_series=tseries)
variational_posterior = tfp.sts.build_factored_surrogate_posterior(model)
optimizer = tf.optimizers.Adam(learning_rate=0.1)
losses = tfp.vi.fit_surrogate_posterior(
    target_log_prob_fn=model.joint_log_prob(observed_time_series=tseries),
    surrogate_posterior=variational_posterior,
    optimizer=optimizer,
    num_steps=200)

# Compare the posterior-mean weights against the true weights.
samples = variational_posterior.sample(50)
fitted_weights = np.mean(samples['LinearRegression/_weights'], axis=0)
error_rms = np.sqrt(np.mean(np.power(weights - fitted_weights, 2)))
error_rms  # 3.02
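
As a side note (a quick check of my own, not part of the fit above): since tseries is a noiseless linear combination of the design-matrix columns, an ordinary least-squares solve should recover the weights up to floating-point precision, confirming the data itself is well-posed:

# Baseline check: exact recovery is expected from plain least squares,
# because tseries = design_matrix @ weights with no added noise.
ols_weights, *_ = np.linalg.lstsq(design_matrix, tseries, rcond=None)
np.sqrt(np.mean((weights - ols_weights) ** 2))  # ~0, up to floating-point error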

If I normalize the design_matrix, the numbers look much better:

# Same model, but with a standardized design matrix.
normalizer = StandardScaler()
normalized_matrix = normalizer.fit_transform(design_matrix)
model = tfp.sts.Sum([tfp.sts.LinearRegression(design_matrix=normalized_matrix)],
                    observed_time_series=tseries)
variational_posterior = tfp.sts.build_factored_surrogate_posterior(model)
optimizer = tf.optimizers.Adam(learning_rate=0.1)
losses = tfp.vi.fit_surrogate_posterior(
    target_log_prob_fn=model.joint_log_prob(observed_time_series=tseries),
    surrogate_posterior=variational_posterior,
    optimizer=optimizer,
    num_steps=200)

# Divide by the scaler's per-feature scale to map the fitted weights
# back to the original (unnormalized) feature scale.
samples = variational_posterior.sample(50)
fitted_weights = np.mean(samples['LinearRegression/_weights'], axis=0) / normalizer.scale_
error_rms = np.sqrt(np.mean(np.power(weights - fitted_weights, 2)))
error_rms  # 0.56
davmre commented 4 years ago

Is your variational optimization converging? (You can see this from the loss curve returned by fit_surrogate_posterior.) If not, you may want to increase the number of steps.

Generally I would expect a normalized design matrix to lead to a better-conditioned optimization problem that would be faster to converge, all else being equal.
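
For concreteness, a minimal sketch of inspecting that loss curve: fit_surrogate_posterior returns the per-step losses (the negative ELBO), so plotting them shows whether the optimization has flattened out. Using matplotlib here is just an illustrative choice:

import matplotlib.pyplot as plt

losses = tfp.vi.fit_surrogate_posterior(
    target_log_prob_fn=model.joint_log_prob(observed_time_series=tseries),
    surrogate_posterior=variational_posterior,
    optimizer=optimizer,
    num_steps=200)
plt.plot(losses)  # loss (negative ELBO) at each optimization step
plt.xlabel('step')
plt.ylabel('negative ELBO')
plt.show()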

amol447 commented 4 years ago

Then I get into the issue I have over here (see the last comment). If I normalize, the weights_prior and weights_constraining_bijector I would need to specify would be different for each batch, and there seems to be no easy way to create that kind of batched bijector.
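
(A possible sketch of a workaround, not from this thread, and assuming a single fixed design matrix rather than a batch of them: tfp.sts.LinearRegression accepts a weights_prior, so per-feature scale could be folded into the prior instead of a constraining bijector. The base scale of 10. below is an illustrative choice.)

tfd = tfp.distributions

# Illustrative: widen the weights prior per feature using the column scales
# of the unnormalized design matrix, instead of normalizing the matrix itself.
# A column scaled by sigma implies a weight scaled by 1/sigma.
col_scales = design_matrix.std(axis=0)
weights_prior = tfd.MultivariateNormalDiag(scale_diag=10. / col_scales)
regression = tfp.sts.LinearRegression(design_matrix=design_matrix,
                                      weights_prior=weights_prior)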

I also checked the loss curve, and the optimization seems to have stalled in the unnormalized case, so I don't think increasing num_steps will help. Here are the last 50 loss-curve values for the unnormalized case:

<tf.Tensor: shape=(50,), dtype=float64, numpy= array([763.93081719, 766.10331749, 765.11964954, 764.46272873, 765.89419736, 770.15077838, 768.31471942, 768.91955323, 770.62193118, 765.33542602, 765.71370294, 769.89131876, 767.46364822, 768.86250098, 769.78790337, 767.90190939, 764.57111531, 766.99044293, 765.28328907, 761.63563885, 778.600896 , 775.45398601, 775.37556489, 764.92138231, 764.49788996, 762.41033604, 764.07139634, 769.58919769, 769.97874594, 766.57655588, 769.63247877, 768.3141973 , 765.39523154, 765.3549978 , 764.26272891, 765.89452207, 767.82033705, 764.80364979, 768.06425015, 770.65213456, 767.87662662, 771.56713973, 768.50769508, 764.37388141, 763.52981121, 760.54508523, 778.12263336, 786.72497995, 784.78095079, 763.13842348])>

amol447 commented 4 years ago

Just tried fitting with 500 steps; no improvement in the fit (error_rms = 3.1).