lyst / lightfm

A Python implementation of LightFM, a hybrid recommendation algorithm.
Apache License 2.0

Issues with pre-trained embedding as item features #554

Closed: mohammad1234 closed this issue 3 years ago

mohammad1234 commented 4 years ago

Thanks for this great library. I'm trying to pass a pre-trained item embedding matrix (learned with word2vec) to my LightFM model as item features, in the following way:

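import numpy as np
from scipy import sparse

# Start from zeros (np.ndarray would leave the memory uninitialized); float32 is the dtype LightFM uses internally
pre_trained_item_embedding_matrix = np.zeros((n_items, num_features), dtype=np.float32)
# Assumes pre_trained_embeddings yields (item_index, vector) pairs, one vector of length num_features per item
for index, vector in pre_trained_embeddings:
    pre_trained_item_embedding_matrix[index, :] = vector
pre_trained_item_embedding_matrix = sparse.csr_matrix(pre_trained_item_embedding_matrix, dtype=np.float32)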

model = LightFM(loss='warp', no_components=number_of_latent_features)
model.fit(interactions, epochs=10, item_features=pre_trained_item_embedding_matrix)

The model seems to finish training without any issue, but when I try to get the item embeddings from the LightFM model, I get a matrix with a completely unexpected shape: (number_of_pretrained_features, number_of_latent_features) instead of (n_items, number_of_latent_features).

item_embeddings = model.get_item_representations()[1]

Am I passing the pre-trained embedding matrix to the LightFM model as item_features incorrectly? Thanks in advance.
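
For reference, here is a minimal NumPy sketch of the shape arithmetic behind the question. It does not call LightFM. It assumes, as the LightFM documentation describes, that the model learns one latent vector per item feature and that an item's representation is the feature-weighted sum of those vectors. The toy sizes and the random stand-in matrices are made up for illustration.

```python
import numpy as np

# Toy sizes, chosen only for illustration
n_items, num_features, no_components = 5, 3, 2
rng = np.random.default_rng(0)

# Stand-in for the pre-trained matrix passed as item_features: one row per item
item_features = rng.normal(size=(n_items, num_features)).astype(np.float32)

# Stand-in for what the model learns: one latent vector per FEATURE, not per item
feature_embeddings = rng.normal(size=(num_features, no_components)).astype(np.float32)

# Per-item representation: feature-weighted sum of the feature embeddings
item_representations = item_features @ feature_embeddings

print(feature_embeddings.shape)    # (3, 2)
print(item_representations.shape)  # (5, 2)
```

Under that assumption, a (number_of_pretrained_features, number_of_latent_features) matrix is what per-feature embeddings would look like. get_item_representations accepts an optional features argument, so passing the same item feature matrix there may return one row per item, though this has not been verified against the setup described above.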