Explaining test cases in the tutorial

pierre-rouanet commented 10 years ago

Also warns (ValueError?) user when they want to use the same test cases with different environments.

jgrizou commented 8 years ago

I was about to open the same issue :) I spent a couple of hours trying to find out how it is made in the code and still I am not sure I understand. Here are a couple (a lot) of questions:

testcases are basically points to be evaluated?
they can either be in the motor (evaluating forward model) or sensory space (inverse model) ?
what is the 'delta' case in https://github.com/flowersteam/explauto/blob/master/explauto/evaluation/evaluation.py ?
an evaluation returns/logs the prediction error for each point in the testcases ?
in an experiment, the function evaluate_at does not take the mode as argument. It is set automatically to be either 'inverse' or 'delta', never forward. Is there a reason for that?
I understand that 'forward' seems to be still in beta? Is that still the case?
What about adding mode as an argument, defaulting to None. If None automatically choose.
What would be the best practise to use my own test function? For example a function that takes as input the environment, the sm_model at evaluate time and maybe the im_model at evaluate time? maybe adding a custom_evaluate_at() in experiment?
Finally, the uniform_sensor_testcases is really cool and useful for testcases. What is the difference between this https://github.com/flowersteam/explauto/blob/master/explauto/environment/testcase.py#L66 and this https://github.com/flowersteam/explauto/blob/master/explauto/environment/environment.py#L119 ? Which one do you recommend using?

Just pinging everyone here :) @sebastien-forestier @pierre-rouanet @clement-moulin-frier

sebastien-forestier commented 8 years ago

The 'delta' functionality was added to allow the learning of actions depending on contexts i.e. learning the forward model (m, s, dm) -> ds and inverse model (m, s, ds_goal) -> dm. I've written a notebook about that here. If we use delta actions, the evaluation is not the same (thus the 'delta' evaluation mode).

So I think giving mode as an argument to 'evaluate_at' is a good idea, with the current behavior as the default behavior.

clement-moulin-frier commented 8 years ago

Hi @jgrizou ,

Sorry for the late reply. Trying to answer your questions (in the same order you used):

1) Yes 2) Yes 3) See @sebastien-forestier answer, 4) It returns the errors. 5, 6, 7) At that time we only evaluated our models for inverse prediction, we were not concerned by forward ones. So we indeed didn't test well the forward evaluation. It's probably for this reason that there is no mode argument and it is certainly a good idea to add it. 8) I would rather code a method in your specific Environment class that generates the test set, then passing this test set to Evaluation as usual (see also my last paragraph below) 9) Hard to remember and this is weird indeed :/ I would rather use the one in Environment. Or test both to see if they are equivalent.

In summary, we have never been very happy with how the evaluation is coded, but never came with a better solution neither. Don't hesitate to recode this if you feel inspired, ideally trying to keep back-compatibility.

The fact is that the way one wants to generate testsets can be very different according to the situation: one might want to use a predefined testset, another one to generate it by calling the Environment, or another one using geometrical calculation (as I remember it was the case for a very high-dimensional arm where the best guess was that reachable points are within the unit circle). This is why we thought the best input to Evaluation is simply an array of test points, leaving the responsibility to generate it to the user.

flowersteam / explauto

Explaining test cases in the tutorial #19