Refractor Inference methods to not expose the Factor class

jpoeppel commented 7 years ago

Returning Factors when one is expecting probabilities might be confusing (especially since the actual distribution needs to be queried with "get_potentials" not get_probability). Adding functions that imply that the factor contains probabilities is problematic since a factor cannot always know what kind of "probability" it represents (joint or conditional), so we cannot even normalize the potentials in order to get a valid probability in every case.

One solution might be to consider the Factor class as strictly internal for computations and change the interfaces of all inference methods to return different objects. One could consider returning low level np.arrays, or alternatively create a wrapper object, structurally similar to the current Factor class, but which exposes an interface implying actual probabilities and convenience functions. Upside of this approach would be easier for naive users to understand the returned results, and the option to specialize factors more for computational efficiency. Downside of this approach is the loss of flexibility with low level np.arrays or the complexity added by another class which needs to be created (should be ok since it is only done once per query at the end)

hbuschme commented 7 years ago

I like the approach with the wrapper object. This could also return the low level np.array upon request.

I think that it is both useful to be able to request the probability of a value of a random variable without knowing anything about their ordering (similarly to requesting a dictionary value), as well as to be able to get a posterior marginal distribution as an np.array.

jpoeppel commented 7 years ago

Ok so we would have a new class (maybe named Marginals?) which contains a distribution (for now np.array, might need to change when we reintroduce continuous variables) and managing structure for the included variables and their values (similar to the Factor class).

An interface could look like this:

get_marginal_probabilities(variable): Returns a dictionary containing the value:probability pairs of that variable (if it is included in this marginal)
[TBD]: Basically the same but as a numpy array, maybe using an optional parameter?
get_(marginal)_probability(variable, value): Returns just the probability for a certain value of the specified variable

Since the class could be named marginals, we could omit the "marginal" in the function names as well.

Any other public methods the class should expose?

hbuschme commented 7 years ago

See pull request #10.

SocialCognitiveSystems / PRIMO

Refractor Inference methods to not expose the Factor class #7