dotnet / infer

Infer.NET is a framework for running Bayesian inference in graphical models
https://dotnet.github.io/infer/
MIT License

Recommender system tutorial evidence #449

Closed: makobot-sh closed this issue 4 months ago

makobot-sh commented 8 months ago

Hi everyone, I've been trying to compute the evidence for the model from the Recommender System Tutorial in the docs, but I haven't been able to figure out how. I tried mixture modelling as described in Computing model evidence for model selection, but the resulting evidence is vanishingly small (near 0). I'm not sure I'm doing this right, so any feedback on the code would be enormously appreciated. Here is what I did (see the RecommenderTutorialFromRepository.Evidence function).

I'm getting:

evidence        0
log(evidence)  -2.57E+10

Despite using the following arguments for the data generation:

static int numUsers = 50;
static int numItems = 10;
static int numTraits = 2;
static int numObs = 100;
static int numLevels = 2;
I was able to reproduce the results from the tutorial with those arguments:

true parameters    learned parameters
 1.00   0.00        1.00   0.00
 0.00   1.00        0.00   1.00
-0.42   0.73       -0.23  -0.07
-0.06  -0.03       -0.42  -0.04
 0.80  -0.92        0.04   0.86

As an aside, I haven't been able to reproduce the results at the end of the tutorial (20000 observations instead of 100). The arguments for that run are the following (you can copy-paste them into my code at RecommenderTutorialFromRepository.cs:17 to test):

static int numUsers = 200;
static int numItems = 200;
static int numTraits = 2;
static int numObs = 20000;
static int numLevels = 2;

The estimated item traits don't match the ground truth or the tutorial's results at all. Instead, I'm getting the following estimates:

true parameters    learned parameters
 1.00   0.00        1.00   0.00
 0.00   1.00        0.00   1.00
 0.44  -1.07        3.01   2.80
-0.38  -0.83        3.01   2.80
 0.11   0.68        3.03   2.82

Any ideas why this could be?

tminka commented 7 months ago
  1. To compute evidence correctly, you must put the If block around the whole model, especially the parameter declarations. The linked code doesn't do that.
  2. You are right, the results aren't good anymore. To debug this problem, I tried a few things and finally I set engine.Compiler.OptimiseInferenceCode = false; and it worked. So it seems there is a bug in how Infer.NET is optimizing the inference code. Thanks for bringing this to my attention.
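For readers following along, the two fixes tminka describes can be sketched as below. This is a minimal illustration, not the tutorial's actual model; the Gaussian variable and its observed value are placeholders for the recommender model's parameter declarations and observations.

```csharp
using Microsoft.ML.Probabilistic.Models;
using Microsoft.ML.Probabilistic.Distributions;

// Evidence switch. Crucially, the entire model -- including all
// parameter declarations -- must live inside the If block.
Variable<bool> evidence = Variable.Bernoulli(0.5);
IfBlock block = Variable.If(evidence);

// ... declare priors, parameters, and observations here ...
Variable<double> x = Variable.GaussianFromMeanAndVariance(0, 1);
x.ObservedValue = 0.3;  // placeholder observation

block.CloseBlock();

InferenceEngine engine = new InferenceEngine();
// Workaround for the optimisation bug discussed above:
engine.Compiler.OptimiseInferenceCode = false;

// log P(data | model) is the log-odds of the evidence variable.
double logEvidence = engine.Infer<Bernoulli>(evidence).LogOdds;
```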
makobot-sh commented 7 months ago

Thank you so much for the response! Turning off code optimization improved the estimates in both cases, and putting the If block around the whole model has definitely improved the geometric mean of the evidence for the 100-observation case, which is now 0.65, computed as $\exp\left(\frac{\log(\text{evidence})}{\text{numObs}}\right)$.
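That per-observation geometric mean is a convenient way to compare evidence across datasets of different sizes; a value near 1 means each observation is well explained, while a value near 0 means a very poor fit. A small sketch (the logEvidence value here is illustrative, chosen to roughly match the 0.65 reported above):

```csharp
using System;

// Geometric mean of the evidence per observation:
// exp(log(evidence) / numObs).
double logEvidence = -43.0;  // illustrative, not from the actual run
int numObs = 100;
double geoMean = Math.Exp(logEvidence / numObs);
Console.WriteLine(geoMean);  // roughly 0.65
```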

For the 20k-observation case, however, the geometric mean of the evidence (and the evidence itself) is still 0. Do you have any idea why this could be? (Here's the amended code; there's also a fork with the reproduction code on branch tutorial_prints, in case that's more convenient, and a diff against the original infer repo code.)

On a different note, would the bug affect the full Matchbox Recommender implementation as well? If so, how can we change that setting for it?

tminka commented 6 months ago

It seems that toggling OptimiseInferenceCode doesn't fix everything, it just makes some of the estimates better. I will look into a better solution. The full Matchbox Recommender implementation is not affected.

makobot-sh commented 6 months ago

I see; if you find one, please let me know! It would be nice to get the evidence for a trained Matchbox Recommender, but as far as I can tell it's not possible. Thanks for everything!

tminka commented 5 months ago

This is fixed by #457

makobot-sh commented 5 months ago

Thank you Tom! The tutorial works great now for both values of largeData. I was playing around with it a bit more and found that with largeData=true and 5+ traits instead of 2, the geometric mean of the evidence drops back down to 0 (and the "Evidence is too low" exception triggers). Is this expected behaviour? With largeData=false, 5 traits seem to work well (the geometric mean of the evidence is below randomness, but this could be attributed to the low number of observations).

tminka commented 4 months ago

Damping is required for large numbers of traits. The attached PR shows how to do this.
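The linked PR has the authoritative change; as a rough illustration of what damping looks like in Infer.NET (assuming the Damp.Backward factor; the step size of 0.5 is arbitrary and would need tuning per model):

```csharp
using Microsoft.ML.Probabilistic.Models;
using Microsoft.ML.Probabilistic.Factors;

// Damping slows message-passing updates to help convergence when
// the number of traits is large. Damp.Backward damps messages
// flowing backward (from downstream factors toward x); a step
// size of 1.0 means no damping, smaller values damp more.
Variable<double> x = Variable.GaussianFromMeanAndVariance(0, 1);
double stepSize = 0.5;  // illustrative value
Variable<double> xDamped =
    Variable<double>.Factor(Damp.Backward<double>, x, stepSize);
// Use xDamped in place of x in the downstream factors.
```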