tapilab / ctrosset


Create feature matrix X and target matrix Y #6

Closed aronwc closed 10 years ago

aronwc commented 10 years ago

X = [company x follower proportions]

Y = [company x demographic breakdown]

E.g.

Here's a matrix of two companies, with three follower proportions each.

X = [[0.2, 0.1, 0.7],
       [0.5, 0.4, 0.1]]

Here's a matrix of their demographic breakdowns (% male, % female):

Y = [[45.4, 54.6], 
       [46.7, 53.3]]
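
For reference, the toy example above can be written as NumPy arrays. This is only a sketch: the choice of NumPy and the sanity checks are assumptions for illustration, not part of the project code.

```python
import numpy as np

# Rows are companies; columns of X are follower proportions.
X = np.array([[0.2, 0.1, 0.7],
              [0.5, 0.4, 0.1]])

# Rows are the same companies; columns of Y are (% male, % female).
Y = np.array([[45.4, 54.6],
              [46.7, 53.3]])

# Both matrices must describe the same companies, in the same row order.
assert X.shape[0] == Y.shape[0]

# In this toy example each row of X sums to 1 and each row of Y to 100.
x_row_sums = X.sum(axis=1)
y_row_sums = Y.sum(axis=1)
```

Keeping the row order identical in X and Y is what ties a company's follower proportions to its demographic breakdown.
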
aronwc commented 10 years ago

The X matrix is currently computed by criterias.py.

cyril94440 commented 10 years ago

After a lot of database manipulation, I finally have the X matrix in a 623 MB pickle file. It is a sparse lil_matrix and takes 1 min 11 s to load from file into memory. Next, I'll work out how to use the sparse matrix in the linear regression function.
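
One possible way to fit a linear model on a sparse matrix, sketched under assumptions: the thread does not say which regression function is used, so this uses SciPy's `lsqr` least-squares solver as a stand-in, with no intercept term. The toy data and the names `X_lil`, `true_W`, and `W` are hypothetical.

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import lsqr

# Hypothetical toy data standing in for the real 623 MB matrix:
# 4 companies (rows) x 3 followed accounts (columns).
X_lil = lil_matrix((4, 3))
X_lil[0, 0] = 1.0
X_lil[1, 1] = 1.0
X_lil[2, 2] = 1.0
X_lil[3, 0] = 1.0
X_lil[3, 1] = 1.0

# Known weights, used only to generate targets we can check against.
true_W = np.array([[40.0, 60.0],
                   [50.0, 50.0],
                   [30.0, 70.0]])

# lil_matrix is convenient for building; CSR is faster for arithmetic.
X_csr = X_lil.tocsr()
Y = X_csr @ true_W  # dense (4, 2) target matrix

# lsqr solves one target column at a time, so loop over the columns of Y.
W = np.column_stack([lsqr(X_csr, Y[:, j])[0] for j in range(Y.shape[1])])
```

The conversion to CSR is the main point here: building in LIL format and converting once before fitting avoids the slow row-wise arithmetic of `lil_matrix`.
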

aronwc commented 10 years ago

Great - don't forget to also pickle the vocabulary, so we know which company/account pair each matrix cell refers to.
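
A minimal sketch of what pickling the vocabulary alongside the matrix could look like. The account names, company names, dictionary layout, and file name are all hypothetical; the actual pickle files in the repo may be organized differently.

```python
import os
import pickle
import tempfile

from scipy.sparse import lil_matrix

# Hypothetical labels: row index -> company, account name -> column index.
companies = ["company_1", "company_2"]
vocabulary = {"account_a": 0, "account_b": 1, "account_c": 2}

X = lil_matrix((len(companies), len(vocabulary)))
X[0, vocabulary["account_a"]] = 0.2
X[1, vocabulary["account_c"]] = 0.1

# Store the matrix and its labels together so they cannot drift apart.
path = os.path.join(tempfile.mkdtemp(), "x_with_vocabulary.pkl")
with open(path, "wb") as f:
    pickle.dump({"X": X, "companies": companies, "vocabulary": vocabulary},
                f, protocol=pickle.HIGHEST_PROTOCOL)

with open(path, "rb") as f:
    loaded = pickle.load(f)

# Invert the vocabulary to look up the account behind a column index.
index_to_account = {i: name for name, i in loaded["vocabulary"].items()}
```

With the labels loaded, cell `(i, j)` refers to the pair `(loaded["companies"][i], index_to_account[j])`.
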

aronwc commented 10 years ago

Done. See friendsMatrix.pkl and profileMatrix.pkl (and friendsMatrixFromDB.py and profileMatrixFromDb.py).

X is created by XMatrixFromDB.py.