idaholab / raven

RAVEN is a flexible and multi-purpose probabilistic risk analysis, validation and uncertainty quantification, parameter optimization, model reduction and data knowledge-discovering framework.
https://raven.inl.gov/
Apache License 2.0

Covariance data handled incorrectly in unSupervisedLearning #857

Open joshua-cogliati-inl opened 6 years ago

joshua-cogliati-inl commented 6 years ago

Issue Description

What did you expect to see happen?

That cov(x,y) == cov(y,x), since a covariance matrix is symmetric.

What did you see instead?

cov(x,y) and cov(y,x) are very different numbers.
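A quick way to check the symmetry described above, sketched with NumPy on stand-in data. The helper name `is_symmetric` and the sample matrices are hypothetical and not part of RAVEN; the check simply compares a matrix (or a stack of matrices) with its transpose.

```python
import numpy as np

def is_symmetric(cov, tol=1e-10):
    """Return True if cov (or every matrix in a stack) equals its transpose."""
    cov = np.asarray(cov, dtype=float)
    # Swapping the last two axes transposes a single matrix or each matrix in a stack.
    return bool(np.allclose(cov, np.swapaxes(cov, -1, -2), atol=tol))

# Stand-in data: a sample covariance, which is symmetric by construction.
rng = np.random.default_rng(0)
samples = rng.normal(size=(100, 3))
good = np.cov(samples, rowvar=False)

# Perturb a single off-diagonal entry to mimic cov(x,y) != cov(y,x).
bad = good.copy()
bad[0, 1] += 1.0
```

Printing `is_symmetric(...)` on the covariance reported by the post-processor would flag the problem without comparing entries by hand.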

Do you have a suggested fix for the development team?
Please attach the input file(s) that generate this error. The simpler the input, the faster we can find the issue.

Note that you have to add a print statement to see both covariances: framework/PostProcessors/TemporalDataMiningPostProcessor/Clustering.GaussianMixture and framework/PostProcessors/DataMiningPostProcessor/Clustering.GaussianMixture


For Change Control Board: Issue Review

This review should occur before any development is performed as a response to this issue.


For Change Control Board: Issue Closure

This review should occur when the issue is imminently going to be closed.

joshua-cogliati-inl commented 3 years ago

Note that this was partially fixed for covariance type full, but not for type diag. See the XXX comment in framework/unSupervisedLearning.py:

            #if covariance type == full, the shape is (n_components, n_features, n_features)
            if len(covariance.shape) == 3:
              covariance[:,row,col] = covariance[:,row,col] * rowSigma * colSigma
            else:
              #XXX if covariance type == diag, this will be wrong.
              covariance[row,col] = covariance[row,col] * rowSigma * colSigma
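One possible way to handle the diag case, as a minimal sketch rather than the RAVEN implementation. It assumes the snippet's intent is to undo a per-feature scaling by sigma, and it relies on the array shapes scikit-learn documents for `GaussianMixture.covariances_`: (n_components, n_features, n_features) for full, (n_features, n_features) for tied, and (n_components, n_features) for diag. The function name `rescale_covariances` and its arguments are hypothetical.

```python
import numpy as np

def rescale_covariances(covariances, sigma, covariance_type):
    """Hypothetical helper: multiply covariances by the per-feature scale factors."""
    cov = np.asarray(covariances, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    if covariance_type in ("full", "tied"):
        # Entry (row, col) is scaled by sigma[row] * sigma[col]; the outer
        # product broadcasts over the leading component axis for "full".
        return cov * np.outer(sigma, sigma)
    if covariance_type == "diag":
        # Only variances are stored, one row per component, so the second
        # index is a feature and each entry is scaled by sigma[col] ** 2.
        return cov * sigma**2
    # "spherical" stores one variance per component and is not handled here.
    raise ValueError(f"unsupported covariance_type: {covariance_type!r}")
```

Because the full and tied branches multiply by a symmetric outer product, a symmetric input stays symmetric. In the diag layout an index pair (row, col) means (component, feature) rather than (feature, feature), which is one plausible reading of why the `else` branch above gives wrong numbers.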