aronwc closed this 10 years ago
The X matrix is currently computed by criterias.py.
After a lot of database manipulation, I finally have the X matrix in a 623 MB pickle file. It is a sparse lil_matrix, and it loads from file into memory in 1 min 11 s. Next, I'll work out how to use the sparse matrix in the linear regression function.
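For what it's worth, here is a minimal sketch of one way a sparse matrix could be fed to a least-squares fit without densifying it. It is not the project's actual regression code: the tiny matrix, the target vector `y`, and the choice of `scipy.sparse.linalg.lsqr` are all assumptions made for illustration.

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import lsqr

# Tiny stand-in for the real X (hypothetical values: 3 companies x 4 accounts).
X_lil = lil_matrix((3, 4), dtype=np.float64)
X_lil[0, 0] = 0.5
X_lil[0, 2] = 0.5
X_lil[1, 1] = 1.0
X_lil[2, 0] = 0.75
X_lil[2, 3] = 0.25

# Hypothetical target: one demographic value per company.
y = np.array([0.6, 0.4, 0.7])

# LIL is convenient for building the matrix cell by cell;
# CSR is the better format for the matrix-vector products a solver performs.
X_csr = X_lil.tocsr()

# lsqr minimises ||X w - y||_2 iteratively and never converts X to dense.
w = lsqr(X_csr, y)[0]

# Predictions for the training rows.
pred = X_csr @ w
```

Converting once with `tocsr()` after loading is usually worthwhile, since LIL is slow for arithmetic.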
Great - don't forget to also pickle the vocabulary, so we know which company/account pair each matrix cell refers to.
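A rough sketch of what that could look like, assuming the vocabulary is just two lists (row labels and column labels) stored next to the matrix. The names `companies`, `accounts`, `payload`, and `label` are hypothetical, and an in-memory buffer stands in for the real .pkl file.

```python
import io
import pickle
from scipy.sparse import lil_matrix

# Hypothetical vocabulary: row i is companies[i], column j is accounts[j].
companies = ["company_a", "company_b"]
accounts = ["account_x", "account_y", "account_z"]

X = lil_matrix((2, 3))
X[0, 1] = 0.2
X[1, 2] = 0.8

# Keep the matrix and its labels in one object so they cannot drift apart.
payload = {"X": X, "rows": companies, "cols": accounts}

# In-memory buffer for this sketch; real code would open a file in "wb"/"rb" mode.
buf = io.BytesIO()
pickle.dump(payload, buf, protocol=pickle.HIGHEST_PROTOCOL)
buf.seek(0)
restored = pickle.load(buf)


def label(data, i, j):
    """Return the (company, account) pair that cell (i, j) refers to."""
    return data["rows"][i], data["cols"][j]
```

With this layout, `label(restored, 1, 2)` gives the company/account pair for that cell.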
Done. See friendsMatrix.pkl and profileMatrix.pkl (and friendsMatrixFromDB.py and profileMatrixFromDb.py).
X is created by XMatrixFromDB.py
X = [company x follower proportions]
Y = [company x demographic breakdown]
E.g.
Here's a matrix of two companies, with three follower proportions each.
Here's a matrix of their demographic breakdown (%male, %female)
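To make the shapes concrete, here is a small sketch with made-up numbers (not project data): two companies, three follower proportions each, and a (%male, %female) breakdown written as fractions. Using `numpy.linalg.lstsq` for the fit is an assumption about how X and Y might be related, not a description of the project's regression.

```python
import numpy as np

# Hypothetical X: 2 companies x 3 follower proportions.
X = np.array([
    [0.2, 0.5, 0.3],
    [0.6, 0.1, 0.3],
])

# Hypothetical Y: 2 companies x (fraction male, fraction female).
Y = np.array([
    [0.55, 0.45],
    [0.40, 0.60],
])

# Least-squares weights mapping follower proportions to demographics.
# W has one row per follower account and one column per demographic.
W, residuals, rank, singular_values = np.linalg.lstsq(X, Y, rcond=None)

# Reconstructed demographics for the two companies.
Y_hat = X @ W
```

With only two companies and three accounts the system is underdetermined, so the fit reproduces Y exactly; the real matrices would have many more rows.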