olivertomic / hoggorm

Explorative multivariate statistics in Python
BSD 2-Clause "Simplified" License

Zero-division error not occurring, leading to nipalsPCA never converging and running until stopped #51

Open martinbo94 opened 2 years ago

martinbo94 commented 2 years ago

After some debugging I found out that a feature in my dataset contained only zeros, which led to errors in the standardisation step of nipalsPCA when it tried to divide by a standard deviation of 0. Perhaps there should be some handling of zero values, or at least a check for columns that contain only zeros or some other constant (so that the standard deviation is 0), followed by an error. In my case, the code just ran until manually stopped instead of throwing a zero-division error.

When Xstand = True, the code runs until manually stopped and gives the warning: RuntimeWarning: invalid value encountered in true_divide self.arrX = (self.arrX_input - self.Xmeans) / self.Xstd

When Xstand = False, the code runs until manually stopped and gives the warning: RuntimeWarning: invalid value encountered in true_divide p = num / denom
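A likely explanation for both symptoms, sketched with plain NumPy rather than hoggorm's own code: dividing NumPy arrays by zero does not raise ZeroDivisionError; 0/0 yields nan and only emits a RuntimeWarning. Any comparison against nan is False, so if the NIPALS loop stops on a test of the form `diff < tol` (an assumption about the implementation, as are all variable names below), that test can never pass once nan enters the data.

```python
import warnings
import numpy as np

# Toy data: the second column is constant (all zeros).
X = np.array([[1.0, 0.0],
              [2.0, 0.0],
              [3.0, 0.0]])

means = X.mean(axis=0)
std = X.std(axis=0, ddof=1)  # std of the second column is exactly 0.0

# NumPy does not raise here; 0/0 becomes nan and a RuntimeWarning is emitted.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", RuntimeWarning)
    X_std = (X - means) / std

has_nan = bool(np.isnan(X_std[:, 1]).all())

# Comparisons with nan are always False, so a threshold-based
# stopping rule (assumed form: diff < tol) is never satisfied.
diff = np.nan
converged = bool(diff < 1e-8)

print(has_nan, converged)
```

The first column is standardised normally; only the constant column turns into nan, which is enough to poison every later dot product that touches it.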

olivertomic commented 2 years ago

Thank you for reporting this issue, @martinbo94. I agree, there should be more specific feedback that standardisation will not work if the data has one or more columns with zero variance. We'll try to update this as soon as we can.
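A guard of the kind discussed in this thread could be run on the data before fitting. The following is a minimal sketch in plain NumPy, not part of hoggorm; the function names `find_constant_columns` and `check_no_constant_columns` are hypothetical.

```python
import numpy as np

def find_constant_columns(X):
    """Return the indices of columns whose values are all identical."""
    X = np.asarray(X, dtype=float)
    # A column is constant exactly when its max equals its min.
    return np.flatnonzero(X.max(axis=0) == X.min(axis=0))

def check_no_constant_columns(X):
    """Raise ValueError if any column has zero variance."""
    const = find_constant_columns(X)
    if const.size > 0:
        raise ValueError(
            "Columns with zero variance cannot be standardised: "
            f"{const.tolist()}"
        )

X = np.array([[1.0, 0.0, 5.0],
              [2.0, 0.0, 5.0],
              [3.0, 0.0, 5.0]])

constant = find_constant_columns(X)

# Drop the constant columns before passing the data on.
X_clean = np.delete(X, constant, axis=1)
```

Comparing max and min avoids relying on a computed standard deviation being exactly zero, which floating-point rounding does not always guarantee for constant non-zero columns.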