dengyang17 / unicorn

The implementation of Unified Conversational Recommendation Policy Learning via Graph-based Reinforcement Learning (SIGIR 2021).

Understanding the data #5

Closed dayuyang1999 closed 2 years ago

dayuyang1999 commented 2 years ago

Hi,

I found that the pretrained graph embedding file tmp/last_fm/embed/transe.pkl contains a dictionary instead of an array.

It has 2 keys: "u_i_embed" and "feature_embed".

However, when I tried to reproduce the embedding pretraining myself using OpenKE, the output of the package after pretraining with TransE was an array with dimension (num of node, dim of embed).

How did you get the TransE data as a dictionary? And what do the 2 items in it ("u_i_embed" and "feature_embed") mean?

Thanks!

dengyang17 commented 2 years ago

Hi,

Once you have the pretrained embeddings as an array with dimension (num of node, dim of embed), you can split them by entity id into two sets: (1) "u_i_embed", which contains the user and item embeddings, and (2) "feature_embed", which contains the attribute embeddings. Then combine the two sets into one dictionary as two key-value pairs. Finally, use the "pickle" package to dump this dictionary into the "transe.pkl" file.
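The steps above can be sketched roughly as follows. This is only an illustration, not the repository's actual preprocessing script. It assumes the pretrained embeddings are a NumPy array and that you already know which entity ids belong to users/items and which to attributes. The helper name `split_and_dump`, the toy id ranges, and the output file name are hypothetical. Whether the training code expects NumPy arrays or another container as the dictionary values is not stated here, so check that against the code that loads the file.

```python
import os
import pickle
import tempfile

import numpy as np


def split_and_dump(embeddings, user_item_ids, feature_ids, out_path):
    """Split a (num_nodes, dim) array by entity id and pickle it as a dict."""
    packed = {
        # Rows for user and item entities.
        "u_i_embed": embeddings[user_item_ids],
        # Rows for attribute (feature) entities.
        "feature_embed": embeddings[feature_ids],
    }
    with open(out_path, "wb") as f:
        pickle.dump(packed, f)
    return packed


# Toy stand-in for the pretrained array: 6 user/item nodes, 4 attribute nodes.
# Assumption: users/items occupy the first ids and attributes the remaining ones;
# in practice, take the ids from your own entity-to-id mapping.
num_user_item, num_feature, dim = 6, 4, 8
embeddings = np.random.rand(num_user_item + num_feature, dim)
user_item_ids = np.arange(0, num_user_item)
feature_ids = np.arange(num_user_item, num_user_item + num_feature)

out_path = os.path.join(tempfile.mkdtemp(), "transe_demo.pkl")
packed = split_and_dump(embeddings, user_item_ids, feature_ids, out_path)

# Reload the file to confirm it holds a dictionary with the two expected keys.
with open(out_path, "rb") as f:
    loaded = pickle.load(f)
print(sorted(loaded.keys()))
print(loaded["u_i_embed"].shape, loaded["feature_embed"].shape)
```

If the entity ids are not contiguous, pass the actual id lists for each group instead of the `np.arange` ranges; the indexing step works the same way.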

Thanks and regards.