jellAIfish / jellyfish

This repository is inspired by Quinn Liu's repository Walnut.

Non negative matrix factorization. #39

Closed markroxor closed 6 years ago

markroxor commented 6 years ago

Consider a matrix V with 10000 rows and 500 columns, where each row represents a word and each column represents a document. If we manage to factorize this matrix into two smaller non-negative matrices, say W and H, such that V ≈ WH, then text processing on the two smaller matrices takes far fewer operations than on the larger matrix V.
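To make the size argument concrete, here is a rough count of stored entries. The inner dimension `k = 20` is an assumed value chosen only for illustration; the thread does not specify one, and the saving shrinks as `k` grows.

```python
# Shapes of V come from the comment above.
n_words, n_docs = 10_000, 500

# Hypothetical inner dimension (rank) of the factorization; not from the thread.
k = 20

entries_v = n_words * n_docs            # V is n_words x n_docs
entries_wh = n_words * k + k * n_docs   # W is n_words x k, H is k x n_docs

print(entries_v)   # 5000000
print(entries_wh)  # 210000
```

With this choice of `k`, W and H together hold roughly 4% of the entries in V, at the cost of V only being approximated by the product WH.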

Related paper - https://arxiv.org/pdf/1401.5226.pdf

dikshant2210 commented 6 years ago

Is it the same as non-negative sampling in word2vec?

markroxor commented 6 years ago

No. Visit this for details.

markroxor commented 6 years ago

@dikshant2210 are you talking about negative sampling or non negative sampling?

markroxor commented 6 years ago

[Image: screenshot from 2018-01-02 13-45-55]

markroxor commented 6 years ago

The (i, j)th entry of the matrix X could for example be equal to the number of times the ith word appears in the jth document in which case each column of X is the vector of word counts of a document; in practice, more sophisticated constructions are used, e.g., the term frequency - inverse document frequency (tf-idf). This
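The raw word-count construction described in that excerpt might look like the sketch below. The helper name `term_document_matrix`, the whitespace tokenization, and the two sample documents are all assumptions made for illustration; tf-idf weighting is not shown.

```python
from collections import Counter


def term_document_matrix(documents):
    """Return (vocabulary, X) where X[i][j] is the count of word i in document j."""
    # Naive tokenization: lowercase and split on whitespace (an assumption).
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    counts = [Counter(tokens) for tokens in tokenized]
    # One row per word, one column per document; a Counter returns 0 for absent words.
    X = [[doc_counts[word] for doc_counts in counts] for word in vocabulary]
    return vocabulary, X


docs = ["the cat sat", "the dog sat on the mat"]  # toy documents
vocab, X = term_document_matrix(docs)

for word, row in zip(vocab, X):
    print(word, row)
```

Each column of `X` is then the word-count vector of one document, and every entry is non-negative, which is the precondition for a non-negative factorization.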

markroxor commented 6 years ago

Algorithms for Non-negative Matrix Factorization https://papers.nips.cc/paper/1861-algorithms-for-non-negative-matrix-factorization.pdf
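For reference, a minimal NumPy sketch of the multiplicative update rules for the squared Euclidean (Frobenius) objective from that paper. The function name `nmf_multiplicative`, the random initialization, the iteration count, and the small `eps` added to the denominators to avoid division by zero are assumptions of this sketch, not something taken from the thread or from this repository.

```python
import numpy as np


def nmf_multiplicative(V, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate non-negative V (n x m) as W @ H with W (n x k), H (k x m)."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    # Strictly positive random start; zeros would stay zero under these updates.
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # W <- W * (V H^T) / (W H H^T), element-wise
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy non-negative matrix standing in for a word-document matrix.
V = np.random.default_rng(1).random((30, 12))
W, H = nmf_multiplicative(V, k=4)
error = np.linalg.norm(V - W @ H)
print(W.shape, H.shape, error)
```

Because each update only multiplies by non-negative ratios, W and H stay non-negative throughout, so no explicit projection step is needed.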