eriklindernoren / ML-From-Scratch

Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.
MIT License
23.56k stars 4.55k forks

Linear Discriminant Analysis #88

Open realamirhe opened 3 years ago

realamirhe commented 3 years ago

First of all, thanks for the great reference you've created; it performs well in its current form.

However, is it acceptable to use covariance matrices instead of scatter matrices in LDA? Shouldn't it use scatter matrices? https://github.com/eriklindernoren/ML-From-Scratch/blob/a2806c6732eee8d27762edd6d864e0c179d8e9e8/mlfromscratch/supervised_learning/linear_discriminant_analysis.py#L24-L25

As we know, the relation between these two matrices is

scatter(X) = X.T.dot(X)
covariance(X) = X.T.dot(X) / N

for a mean-centered X (that is, X = X - mean(X)) and N = |X| (the number of samples).

reference
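
The relation above can be checked numerically. The following is a minimal sketch, not code from the repository: the random data, the seed, and the variable names are assumptions made for illustration, and the covariance here is normalized by N (as in the relation above) rather than by N - 1. Note that the factor N is per matrix, so when two class matrices with different sample counts are summed, the sum of scatter matrices is in general not a scalar multiple of the sum of covariance matrices.

```python
import numpy as np

# Hypothetical data: 50 samples, 3 features (not taken from the repository).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))

# Center the data, as the relation assumes X = X - mean(X).
Xc = X - X.mean(axis=0)
N = Xc.shape[0]

scatter = Xc.T.dot(Xc)        # scatter matrix
covariance = scatter / N      # covariance matrix normalized by N

# np.cov with bias=True also normalizes by N, so the two should agree.
assert np.allclose(covariance, np.cov(X, rowvar=False, bias=True))

# The scatter matrix is the covariance matrix scaled by the sample count.
assert np.allclose(scatter, N * covariance)
```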

realamirhe commented 3 years ago

In LDA, the predict method can be refactored with NumPy built-ins into a more readable and more performant version: https://github.com/eriklindernoren/ML-From-Scratch/blob/a2806c6732eee8d27762edd6d864e0c179d8e9e8/mlfromscratch/supervised_learning/linear_discriminant_analysis.py#L37-L43

The current implementation is equivalent to this:

def predict(self, X):
    # The builtin int is used because the np.int alias was removed in NumPy 1.24.
    return np.array([1 * (x.dot(self._w) < 0) for x in X], dtype=int)

which can be implemented without the Python loop like this:

def predict(self, X):
    return np.where(X.dot(self._w) < 0, 1, 0)
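
The claimed equivalence of the two predict variants can be checked outside the class. This is a minimal sketch under assumptions: `w` stands in for `self._w`, and the random data and seed are hypothetical, not taken from the repository.

```python
import numpy as np

# Hypothetical inputs: 20 samples with 4 features, and a stand-in for self._w.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
w = rng.normal(size=4)

# Per-sample loop, as in the first variant.
loop_pred = np.array([1 * (x.dot(w) < 0) for x in X], dtype=int)

# Vectorized variant: one matrix-vector product, then an elementwise choice.
vec_pred = np.where(X.dot(w) < 0, 1, 0)

# Both variants should label every sample identically.
assert np.array_equal(loop_pred, vec_pred)
```

Casting the boolean mask directly, `(X.dot(w) < 0).astype(int)`, should give the same labels and is another option if `np.where` feels indirect.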