Closed: a-agmon closed this issue 3 years ago
I noticed the same issue. I tried treating this as one column with a complex
modifier, which makes the result different, but I'm not sure why. Waiting for an answer on this thread.
@a-agmon @kmichael08
Great question, thanks for asking!
Cleora does not work like Matrix Factorization and similar models. The problem you've noticed is actually a geometry-preserving feature of Cleora ;-)
Notice that your graph is bipartite - and this property is captured in the embeddings you get as a result (which can be seen on your 2D projection). Users can easily be similar to each other - if they interact with similar items. Items can easily be similar to each other - if they interact with similar users. But for an item to be similar to a user is hard, because (while they may be directly linked by an edge), their Nth degree neighborhood landscapes are entirely different. In every iteration a node's embedding is replaced by an L2 normalized average embedding of its neighbors, so you can probably imagine what happens in the case of a bipartite graph - a space swapping effect.
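The space-swapping effect described above can be seen in a toy simulation. This is a hypothetical sketch, not Cleora's actual implementation: it just applies the stated update rule (each node's embedding becomes the L2-normalized average of its neighbors' embeddings) to a tiny dense bipartite adjacency matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy bipartite adjacency: 3 users x 2 items (1 = interaction).
adj = np.array([[1, 0],
                [1, 1],
                [0, 1]], dtype=float)

dim = 4
users = rng.normal(size=(3, dim))
items = rng.normal(size=(2, dim))

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)

for _ in range(3):
    # Simultaneous update: each user's new embedding is the normalized
    # average of its items' embeddings, and vice versa.
    new_users = l2_normalize(adj @ items / adj.sum(axis=1, keepdims=True))
    new_items = l2_normalize(adj.T @ users / adj.sum(axis=0)[:, None])
    users, items = new_users, new_items
```

After each step the user vectors lie in the span of the previous item vectors and the item vectors in the span of the previous user vectors, so on a bipartite graph the two sides keep trading spaces rather than converging into one.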
To get the behavior you expect, you can pick any of these 3 options:

1. Take item embeddings from iteration K and user embeddings from iteration K+1 (users are aggregates composed of items)
2. Take item embeddings from iteration K+1 and user embeddings from iteration K (items are aggregates composed of users)
3. Take iterations K and K+1 for both users and items (something in between)

This can definitely work.
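As an illustration of option 1 above, here is a hypothetical sketch (the averaging step is a stand-in for Cleora's propagation, and the iteration count `K` is an assumed parameter): keep a snapshot of the embeddings after every iteration, then pair item embeddings from iteration K with user embeddings from iteration K+1, so both sides land in the same item-derived space and cross-type similarities become meaningful.

```python
import numpy as np

def propagate(adj, users, items):
    """One neighbor-averaging step; `adj` is users x items."""
    u = adj @ items / adj.sum(axis=1, keepdims=True)
    i = adj.T @ users / adj.sum(axis=0)[:, None]
    u /= np.linalg.norm(u, axis=1, keepdims=True)
    i /= np.linalg.norm(i, axis=1, keepdims=True)
    return u, i

rng = np.random.default_rng(1)
adj = np.array([[1, 0], [1, 1], [0, 1]], dtype=float)
users, items = rng.normal(size=(3, 4)), rng.normal(size=(2, 4))

snapshots = []
for _ in range(5):
    users, items = propagate(adj, users, items)
    snapshots.append((users, items))

K = 3  # assumed iteration count; tune per dataset
users_aligned = snapshots[K][0]      # user embeddings from iteration K+1
items_aligned = snapshots[K - 1][1]  # item embeddings from iteration K

# Cross-type cosine similarities are now comparable (rows already unit-norm).
sims = users_aligned @ items_aligned.T
```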
The approach of jointly modeling user-item interactions is an idea transplanted from MF algorithms. It sucks for a few reasons:
If you have 1K different items, you can reasonably expect a 32-dim vector to be able to capture item similarities and dissimilarities. But if a user has interacted with a K-item subset out of those 1K, you can't reasonably expect a 32-dim vector to be able to express all the possible subsets via simple addition/averaging. You'd need a strongly non-linear transform to compress this information.
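A quick back-of-envelope computation makes the capacity argument concrete: the number of K-item subsets of a 1K-item catalog grows combinatorially, far beyond what averaging into a 32-dim vector can keep apart. The numbers below are just illustration, not from the original thread.

```python
from math import comb, log2

n_items = 1000
for k in (5, 10, 20):
    subsets = comb(n_items, k)  # number of distinct K-item interaction sets
    print(f"K={k}: {subsets:.3e} subsets (~{log2(subsets):.0f} bits to distinguish)")
```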
For user-item recommendation purposes, we're feeding Cleora embeddings into EMDE to aggregate user profiles. You can check out some example code here.
General remark: for any reasonable purposes `--dimension 32` is usually too low. Cleora is fast, so `--dimension 1024` would be a reasonable starting point.
Does this clarify the issue?
Best regards,
Jacek
Thank you very much for the detailed answer @ponythewhite - your comment is very helpful. I will surely check your sample code using EMDE.
Hello, thank you very much for this work. The performance of your algorithm is stunning! We are testing Cleora for a user-item embedding task. I have run into a result and am wondering whether this is by design or my mistake. My TSV file is simple and follows the format of "user item":
```
u1 <\t> i1
u2 <\t> i1
u1 <\t> i2
u3 <\t> i2
```
As you can see, the relation between users and items is many-to-many. I'm running a simple embedding task:
```
./cleora --input ~/test.tsv --columns="users items" --dimension 32 --number-of-iterations 4
```
In the resulting embeddings it seems that users and items are "remote" from each other, as in the image below (cluster 0 is users and cluster 1 is items). That is very different from cases in which we used simple matrix factorization, where users were closer to the items they buy than to other items; here it seems that these relationships are somewhat lost. Does my question make sense? Is this result expected in this case?
Many thanks!