grahamjenson / ger

Good Enough Recommendation (GER) Engine

Need some help with the API (docs seem outdated) #41

Closed mckapur closed 8 years ago

mckapur commented 9 years ago

I've been working with GER for around 2 days now (it's a very good fit for my project). The docs are pretty out of date, so getting it to work and figuring out each component has been mostly trial and error plus reading the ger.coffee file and the test files. This is basically my code, minus the parts that fetch the actual data:

require('coffee-script/register');

var bb = require('bluebird');
var async = require('async');
var g = require('ger');

var ns = 'Contra';
var GER = g.GER;
var esm = new g.MemESM(ns, {});
esm.initialize(ns);
var ger = new GER(esm);

var userId = 'id';
var action = 'view';
var things = []; // Pretend there's a bunch of strings here....
var events = []; // Collects what ger.event() returns, for bb.all below

// Things is an array of strings of "topics" eg. Apple, Google, Politics, Philosophy, Obama, etc.
for (var i = 0; i < things.length; i++)
    events.push(ger.event(ns, userId, action, things[i], {}));

bb.all(events).then(function() { // `events` array has 10,000 objects
    return ger.recommendations_for_person(ns, userId, action, {actions: {'view': 2, 'reply': 5, 'share': 3}});
}).then(function(recs) {
    console.log(recs);
});

OK, so the problem is that the recommendations array is empty: I get a neighbourhood object and a confidence of 0. I'm training on around 10,000 events sourced from my database. When I log the events with find_events, only 50 are returned (which I think is by design, and there is a setting somewhere to change that). But count_events returns only around 2,000, which is strange given that I submitted 10,000. I'm not sure this is what's causing the lack of recommendations (2,000 is still a lot to train on); it's just another issue I've run into. A lot of things are also ambiguous, e.g. how and where to set the action weights. I set them based on your test code, not your docs.

Thanks!

grahamjenson commented 8 years ago

Hey,

So I have updated the README and added an example to work from in the examples directory.

There are two reasons the above code doesn't work:

  1. A bug in the in-memory ESM (now fixed in 0.0.83)
  2. A new assumption that all events that are also recommendations must have an expiry date (see reasoning in the README)

If you go over the example it will become clear how to fix the above code :)
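As a minimal sketch of the second point, the expiry date can be attached to each event before it is submitted. The `withExpiry` helper below is hypothetical (it is not part of GER), and it assumes that an ISO 8601 timestamp is an acceptable value for `expires_at`; the thread itself only shows date strings such as '2020-02-02'.

```javascript
// Hypothetical helper, not part of GER: returns copies of the events with an
// `expires_at` value set `days` days after `now` (defaults to the current time).
// Events that already carry an `expires_at` are left unchanged.
function withExpiry(events, days, now) {
  var base = now || new Date();
  var expiresAt = new Date(base.getTime() + days * 24 * 60 * 60 * 1000).toISOString();
  return events.map(function (event) {
    var copy = {};
    Object.keys(event).forEach(function (key) { copy[key] = event[key]; });
    if (!copy.expires_at) copy.expires_at = expiresAt;
    return copy;
  });
}

var sample = withExpiry(
  [{namespace: 'Contra', person: 'id', action: 'view', thing: 'Apple'}],
  30,
  new Date('2015-07-11T00:00:00Z')
);
console.log(sample[0].expires_at); // 2015-08-10T00:00:00.000Z
```

The fixed `now` argument is only there to make the example reproducible; in real use it can be omitted.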

In the in-memory ESM, duplicate events are not stored; only the most recent event (by created_at) is kept. This may be the cause of the difference between 2,000 and 10,000.
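To check whether deduplication explains the gap, the number of distinct events can be counted before anything is sent to GER. This is only a sketch: `countDistinctEvents` is a hypothetical helper, and it assumes that two events count as duplicates when they share the same person, action and thing, which the thread does not spell out.

```javascript
// Hypothetical helper, not part of GER. Assumption: events are duplicates
// when person, action and thing are all equal.
function countDistinctEvents(events) {
  var seen = {};
  events.forEach(function (event) {
    // JSON.stringify of the triple gives an unambiguous lookup key.
    seen[JSON.stringify([event.person, event.action, event.thing])] = true;
  });
  return Object.keys(seen).length;
}

var sampleEvents = [
  {person: 'id', action: 'view', thing: 'Apple'},
  {person: 'id', action: 'view', thing: 'Apple'},  // duplicate of the first
  {person: 'id', action: 'view', thing: 'Google'},
  {person: 'id', action: 'reply', thing: 'Apple'}
];
console.log(countDistinctEvents(sampleEvents)); // 3
```

If this count is close to what the event store reports, the difference is likely down to repeated events rather than lost ones.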

Also, the in-memory ESM is mostly built as a sanity check, so events are not persisted. I would recommend moving to the Postgres ESM, or implementing a different ESM, before you get to production.

mckapur commented 8 years ago

require('coffee-script/register');
var g = require('ger');

var ns = 'Contra';
var GER = g.GER;
var esm = new g.MemESM();
var ger = new GER(esm);

var userId = 'id';
var action = 'view';
var things = []; // Pretend there's a bunch of strings here....
var events = []; // Event objects passed to ger.events() below

// Things is an array of strings of "topics" eg. Apple, Google, Politics, Philosophy, Obama, etc.
for (var i = 0; i < things.length; i++)
    events.push({namespace: ns, person: userId, action: action, thing: things[i], expires_at: '2020-02-02'});

ger.initialize_namespace(ns).then(function() {
    return ger.events(events);
}).then(function() {
    return ger.recommendations_for_person(ns, userId, {actions: {'view': 5, 'like': 10}});
}).then(function(recommendations) {
    console.log(recommendations);
});

This follows your new documentation but still returns an empty recommendations array. When I count events I get around 900 (so I'm training on 900). Any suggestions?

grahamjenson commented 8 years ago

Hey,

I just ran this script:

var g = require('../ger');
var ns = 'Contra';
var GER = g.GER;
var esm = new g.MemESM();
var ger = new GER(esm);

var userId = 'id';
var action = 'view';
var things = ["Apple", "google", "politics"]; // Pretend there's a bunch of strings here....

var events = [];
// Things is an array of strings of "topics" eg. Apple, Google, Politics, Philosophy, Obama, etc.
for (var i = 0; i < things.length; i++) {
  var thing = things[i];
  events.push({namespace: ns, person: userId, action: action, thing: thing, expires_at: '2020-02-02'});
}

ger.initialize_namespace(ns).then(function() {
    return ger.events(events);
}).then(function() {
    return ger.recommendations_for_person(ns, userId, {actions: {'view': 5, 'like': 10}});
}).then(function(recommendations) {
    console.log(recommendations);
});

and I get back:

{ recommendations: 
   [ { thing: 'google',
       weight: 1,
       last_actioned_at: '2015-07-11T10:18:48+01:00',
       last_expires_at: '2020-02-02T00:00:00+00:00',
       people: [Object] },
     { thing: 'politics',
       weight: 1,
       last_actioned_at: '2015-07-11T10:18:48+01:00',
       last_expires_at: '2020-02-02T00:00:00+00:00',
       people: [Object] },
     { thing: 'Apple',
       weight: 1,
       last_actioned_at: '2015-07-11T10:18:48+01:00',
       last_expires_at: '2020-02-02T00:00:00+00:00',
       people: [Object] } ],
  neighbourhood: { id: 1 },
  confidence: 0 }

So I think there is a problem with your actual code, not in the above code.
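One way to narrow down a problem in the surrounding code is to check the event objects before they are submitted. The sketch below is an assumption-laden example rather than anything GER provides: `findIncompleteEvents` is a hypothetical helper, and the list of required fields is simply taken from the event objects used in this thread.

```javascript
// Field list taken from the events in this thread; treat it as an assumption.
var REQUIRED_FIELDS = ['namespace', 'person', 'action', 'thing', 'expires_at'];

// Hypothetical helper, not part of GER: reports which events are missing
// a required field (undefined, null or an empty string).
function findIncompleteEvents(events) {
  var problems = [];
  events.forEach(function (event, index) {
    var missing = REQUIRED_FIELDS.filter(function (field) {
      return event[field] === undefined || event[field] === null || event[field] === '';
    });
    if (missing.length > 0) problems.push({index: index, missing: missing});
  });
  return problems;
}

var checked = findIncompleteEvents([
  {namespace: 'Contra', person: 'id', action: 'view', thing: 'Apple', expires_at: '2020-02-02'},
  {namespace: 'Contra', person: 'id', action: 'view', thing: undefined, expires_at: '2020-02-02'}
]);
console.log(checked); // one entry: index 1, missing 'thing'
```

An undefined `thing`, for example from a loop variable that was never assigned, would show up here immediately.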