predictry / neumann

Neo4j based Service Provider

Check bought also bought algo #134

Closed st3wart closed 9 years ago

st3wart commented 9 years ago

On Soukai there are now hardly any recommendations for bought-also-bought. This needs review.

st3wart commented 9 years ago

@jocki do remember to check this after the deletion as Soukai is using it

jocki commented 9 years ago

I think this is because the number of collected buy events is very small. Here is what I did to check.

Currently, the number of distinct items that have recommendations is 140, based on the following query:

MATCH (s:Session:`SOUKAIMY`)-[:BUY]->(item:Item:`SOUKAIMY`),
    (s)-[:BUY]->(recommendedItem:Item:`SOUKAIMY`)
WHERE item <> recommendedItem
WITH item, recommendedItem, COUNT(recommendedItem) AS numOfRecommendedItems
WHERE numOfRecommendedItems > 0
RETURN COUNT(DISTINCT item)

Neumann looks for other items purchased within the same session. To find the number of sessions that bought more than one item:

MATCH (s:Session:`SOUKAIMY`)-[:BUY]->(item:Item:`SOUKAIMY`)
WITH s, COUNT(item) AS numOfPurchasedItems
WHERE numOfPurchasedItems > 1
RETURN COUNT(s)

The result is only 119 sessions (or buy transactions). If most of these sessions buy the same popular items repeatedly, the number of distinct recommendations will be even smaller.
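
For a quick sanity check outside Neo4j, the two counts above can be mirrored in plain Python. This is only a sketch: the `buys` list and its session and item names are made-up sample data, not Soukai's, and the helper functions are illustrative rather than Neumann's code. It also counts distinct items per session, whereas `COUNT(item)` in Cypher counts matched rows.

```python
from collections import defaultdict

def baskets_by_session(buys):
    """Group (session_id, item) BUY pairs into one set of items per session."""
    baskets = defaultdict(set)
    for session_id, item in buys:
        baskets[session_id].add(item)
    return baskets

def count_multi_item_sessions(buys):
    """Sessions that bought more than one distinct item (mirrors the second query)."""
    return sum(1 for items in baskets_by_session(buys).values() if len(items) > 1)

def items_with_recommendations(buys):
    """Items bought together with at least one other item (mirrors the first query)."""
    covered = set()
    for items in baskets_by_session(buys).values():
        if len(items) > 1:
            covered |= items
    return covered

# Hypothetical sample data: only s1 and s3 contain more than one item.
buys = [
    ("s1", "milo"), ("s1", "nescafe"),
    ("s2", "milo"),
    ("s3", "milo"), ("s3", "nescafe"),
]
print(count_multi_item_sessions(buys))           # 2
print(sorted(items_with_recommendations(buys)))  # ['milo', 'nescafe']
```

Note that two multi-item sessions still cover only two items here, which is the "same popular items repeatedly" effect in miniature.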

We have the following options to increase the number of recommendations:

  1. Keep collecting buy actions and see if the results improve.
  2. Increase the span of sessions so that separate purchases count as the same purchase (that is, the same session). While this increases the number of recommendations, it will decrease their accuracy.
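
To make option 2 more concrete, here is a rough Python sketch of what widening the session span could look like. It assumes "span" means a time gap between a user's purchases, which is my reading and not something Tapirus or Neumann is confirmed to do; `merge_sessions`, `max_gap_seconds`, and the sample purchases are all hypothetical.

```python
from collections import defaultdict

def merge_sessions(purchases, max_gap_seconds):
    """Merge each user's purchases into baskets when they are close in time.

    purchases: iterable of (user_id, timestamp_seconds, item).
    A new basket starts whenever the gap to the user's previous purchase
    exceeds max_gap_seconds.
    """
    by_user = defaultdict(list)
    for user_id, ts, item in purchases:
        by_user[user_id].append((ts, item))

    baskets = []
    for user_id, events in by_user.items():
        events.sort()  # oldest purchase first
        current, last_ts = set(), None
        for ts, item in events:
            if last_ts is not None and ts - last_ts > max_gap_seconds:
                baskets.append((user_id, current))
                current = set()
            current.add(item)
            last_ts = ts
        baskets.append((user_id, current))
    return baskets

# Hypothetical purchases by a single user.
purchases = [
    ("u1", 0, "milo"),
    ("u1", 3600, "nescafe"),      # one hour later
    ("u1", 90000, "dutch lady"),  # the next day
]
narrow = merge_sessions(purchases, max_gap_seconds=1800)
wide = merge_sessions(purchases, max_gap_seconds=7200)
print(len(narrow), len(wide))  # 3 2
```

With the 30-minute gap every purchase is its own basket, so nothing is recommended. With the 2-hour gap, milo and nescafe land in one basket, which gives more recommendations at the cost of a looser meaning of "bought together".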

st3wart commented 9 years ago

hmm,

if i buy milo and nescafe in one session

You buy milo and dutch lady in one session

Would the rec for milo be nescafe and dutch lady?

jocki commented 9 years ago

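I tested this in my local Neo4j:

[image: image] https://cloud.githubusercontent.com/assets/12497507/10013236/24ee3032-6142-11e5-9f42-2c9f3cd84e5b.png

After running the same Cypher query that Neumann issues, I got the following recommendations for Milo:

[image: image] https://cloud.githubusercontent.com/assets/12497507/10013253/53c9b642-6142-11e5-929d-21e708bf47f4.png

Looks like the recommendations for milo include nescafe and dutch lady. Btw, Gui's code uses the same Cypher query for oivt too. The only difference is that the VIEW relationship is replaced by the BUY relationship.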
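
The same check can be reproduced without Neo4j. The following is a minimal Python sketch of the co-purchase idea, not the Cypher query Neumann actually runs; the session IDs are invented and `recommend` is a hypothetical helper.

```python
from collections import Counter

def recommend(sessions, target):
    """Rank items bought in the same session as `target`, most frequent first."""
    counts = Counter()
    for items in sessions.values():
        if target in items:
            counts.update(item for item in items if item != target)
    return counts.most_common()

# The two sessions from the question above.
sessions = {
    "session-1": {"milo", "nescafe"},
    "session-2": {"milo", "dutch lady"},
}
print(sorted(recommend(sessions, "milo")))
# [('dutch lady', 1), ('nescafe', 1)]
```

Both nescafe and dutch lady come back for milo, each with a count of 1, because every session containing milo contributes its other items.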

st3wart commented 9 years ago

Are the sessions that you added from different users or from the same user?

jocki commented 9 years ago

I don't know. Neumann only imports sessions from Tapirus. Perhaps Tapirus adds the session, or the JavaScript frontend tracks it. I'll inspect the Tapirus code to find the answer; I haven't touched that repository yet.

st3wart commented 9 years ago

I mean on your local PC. Did you add the sessions for the same user or for different users?

jocki commented 9 years ago

From what I see, the current query only computes per Session node and ignores the User node entirely. Each Session is treated as a separate transaction, regardless of its associated User.

st3wart commented 9 years ago

If you traverse up to the User node instead of using the Session node, what do the recs look like?

jocki commented 9 years ago

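I'll do a quick test by issuing a simple query. The current implementation yields the following result:

[image: image] https://cloud.githubusercontent.com/assets/12497507/10014879/e9a9b9a8-614e-11e5-8e4a-45eb2259d1ba.png

If I change the query to include the User node, like this: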

MATCH (u:SOUKAIMY:User)<-[:BY]-(:SOUKAIMY:Session)-[:BUY]->(i:SOUKAIMY:Item)
WITH u,i
MATCH (u)<-[:BY]-(:SOUKAIMY:Session)-[:BUY]->(x:SOUKAIMY:Item)
WHERE x <> i
RETURN i.name, COUNT(x) AS numOfRecommendations

The result is:

[image: image] https://cloud.githubusercontent.com/assets/12497507/10014915/1fc2d506-614f-11e5-8534-ed4748f775ae.png

The number of items that have recommendations increased from 139 to 176, and the number of recommended items per item increased drastically. This is because the query now includes the user's whole purchase history (all items the user has ever purchased) rather than only the items bought together in the same cart (per transaction).
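
The effect is easy to see on a toy example. This Python sketch groups the same purchases once by session and once by user; the data, the `co_purchases` helper, and the assumption that session IDs are unique across users are all mine, for illustration only.

```python
from collections import defaultdict

def co_purchases(buys, key_index):
    """Map each item to the set of other items sharing the same grouping key.

    buys: list of (user_id, session_id, item).
    key_index: 0 to group by user, 1 to group by session.
    """
    groups = defaultdict(set)
    for row in buys:
        groups[row[key_index]].add(row[2])

    recs = defaultdict(set)
    for items in groups.values():
        for item in items:
            recs[item] |= items - {item}
    # Keep only items that actually have at least one recommendation.
    return {item: others for item, others in recs.items() if others}

# Hypothetical data: one user buys across two separate sessions.
buys = [
    ("u1", "s1", "milo"), ("u1", "s1", "nescafe"),
    ("u1", "s2", "dutch lady"),
]
by_session = co_purchases(buys, key_index=1)
by_user = co_purchases(buys, key_index=0)
print(len(by_session), len(by_user))  # 2 3
```

Grouping by session leaves dutch lady without any recommendation, while grouping by user gives every item two recommendations, including pairs that were never in the same cart.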

st3wart commented 9 years ago

OK cool let's replace that.

This one we should use for frequently bought together.

jocki commented 9 years ago

Do you mean replacing the query with one that includes the User node? OK, I'll work on that today.

jocki commented 9 years ago

After examining recommend.py, it turns out that @guidj has already created other algorithm types, oiv and oip, that include the User node. So, can we just switch from oipt to oip?

For completeness, here are the algorithm types currently known to Neumann:

  1. oivt, oipt: based on session
  2. oiv, oip: based on user
  3. anon-oiv, anon-oip: based on agent
  4. duo: similar to oivt and oipt, but treats BUY and VIEW relationships as the same.
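
The list above can be written down as a small lookup table. This is an illustrative Python sketch only; recommend.py may organise things differently. Mapping the "v" variants to VIEW and the "p" variants to BUY is my assumption, based on oivt and oipt differing only in that relationship, and the `group_by` and `relationships` keys are hypothetical names.

```python
# Illustrative lookup only; not taken from recommend.py.
ALGORITHMS = {
    "oivt": {"group_by": "session", "relationships": ("VIEW",)},
    "oipt": {"group_by": "session", "relationships": ("BUY",)},
    "oiv": {"group_by": "user", "relationships": ("VIEW",)},
    "oip": {"group_by": "user", "relationships": ("BUY",)},
    "anon-oiv": {"group_by": "agent", "relationships": ("VIEW",)},
    "anon-oip": {"group_by": "agent", "relationships": ("BUY",)},
    "duo": {"group_by": "session", "relationships": ("VIEW", "BUY")},
}

def same_relationship_for(group_by, like):
    """Find the algorithm with the same relationships as `like` but another grouping."""
    wanted = ALGORITHMS[like]["relationships"]
    for name, spec in ALGORITHMS.items():
        if spec["group_by"] == group_by and spec["relationships"] == wanted:
            return name
    return None

print(same_relationship_for("user", like="oipt"))  # oip
```

Under these assumptions, switching from oipt to oip keeps the BUY relationship and only changes the grouping node from session to user.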

st3wart commented 9 years ago

Great! OK, oip instead of oipt.

jocki commented 9 years ago

OK, I'll start scheduling recommendations for algo oip for tenant SOUKAIMY.

st3wart commented 9 years ago

ah so it's going to be a separate thing

a folder for oip in s3?

jocki commented 9 years ago

yup, a separate folder in S3. @guidj created 7 types of algo in Neumann.

st3wart commented 9 years ago

ok. let me know once you have it thanks

jocki commented 9 years ago

Recommendations for oip have been uploaded to S3.

st3wart commented 9 years ago

Thanks! Could you ask Jocki to put it on a demo page?

By the name OIP?

jocki commented 9 years ago

Do you mean @thanyawzinmin ?

st3wart commented 9 years ago

oops yea!
