lisa-lab / pylearn2

Warning: This project does not have any current developer. See below.
BSD 3-Clause "New" or "Revised" License

Implement tests for get_fixed_var_descr #359

Open lamblin opened 11 years ago

lamblin commented 11 years ago

In particular, the logic for merging different fixed variable descriptors for composite costs, such as SumOfCosts, is tricky. See https://github.com/lisa-lab/pylearn2/pull/355/files#L3R311

We should test that behavior itself, independently from the use that BGD makes of it.
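As a starting point for such a test, here is a minimal plain-Python sketch of the merging behavior, loosely modeled on pylearn2's `FixedVarDescr` (which holds a `fixed_vars` dict and `on_load_batch` callbacks). The class and `merge_fixed_var_descrs` helper here are illustrative stand-ins, not the library's exact API; the collision check marks the tricky case the tests should exercise.

```python
class FixedVarDescr:
    def __init__(self):
        # Maps a name to the variable that must stay fixed
        # for the lifetime of a minibatch.
        self.fixed_vars = {}
        # Callbacks run once per minibatch to (re)set those variables.
        self.on_load_batch = []


def merge_fixed_var_descrs(descrs):
    """Union the descriptors of sub-costs (e.g. for a SumOfCosts).

    Raises on a name collision, since two sub-costs silently claiming
    the same fixed variable is the case that needs explicit testing.
    """
    merged = FixedVarDescr()
    for d in descrs:
        for name, var in d.fixed_vars.items():
            if name in merged.fixed_vars and merged.fixed_vars[name] is not var:
                raise ValueError("conflicting fixed var: %s" % name)
            merged.fixed_vars[name] = var
        merged.on_load_batch.extend(d.on_load_batch)
    return merged
```

A test could then build two descriptors, merge them, and check that both the variables and the callbacks survive the union, independently of BGD.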

lamblin commented 11 years ago

@vdumoulin: I'm keeping the milestone to August for that one because it will probably be longer than other tickets, but you can start working on it as soon as your other ticket is finished.

vdumoulin commented 11 years ago

I'm not sure I understand exactly what a fixed variable descriptor is; could you help me familiarize myself with that?

vdumoulin commented 11 years ago

In particular, could you give me an example of when you'd want some variable to stay the same across mini-batches?

lamblin commented 11 years ago

I'm not that familiar with it either; maybe @goodfeli can chime in to correct or clarify. What I understand is that some training algorithms, for instance conjugate gradient descent, have to evaluate the model on the same mini-batch more than once with different parameters, for instance when doing a line search. In that case, we do not always want shared variables of the model (or the cost) to be updated automatically. For instance, if there is a stochastic component, we would want the random state to stay the same across consecutive calls on the same mini-batch. Likewise, a counter for the number of examples seen during training should not be incremented at each point of the line search. For the moment, only two Costs actually have fixed var descriptors, and they are test costs defined in pylearn2/training_algorithms/tests/test_bgd.py.
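A toy illustration of the behavior described above, in plain Python rather than Theano (all names here are made up for the example): per-minibatch state is set once via an `on_load_batch`-style hook and then reused for every evaluation during the line search, so the example counter and the stochastic component advance once per minibatch, not once per function call.

```python
import random


class ToyModel:
    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.examples_seen = 0  # should grow once per minibatch only
        self.noise = None       # stochastic component, fixed per batch

    def on_load_batch(self, batch):
        # Runs ONCE when a new minibatch is loaded.
        self.examples_seen += len(batch)
        self.noise = self.rng.random()

    def evaluate(self, batch, params):
        # May be called many times per minibatch (e.g. line search);
        # must not touch the fixed per-batch state.
        return sum(batch) * params + self.noise


model = ToyModel()
batch = [1.0, 2.0, 3.0]
model.on_load_batch(batch)
# Several line-search points, same minibatch, same fixed state:
values = [model.evaluate(batch, p) for p in (0.1, 0.2, 0.3)]
```

If `evaluate` incremented the counter or resampled `noise` itself, three line-search points would count as nine examples and each point would see different noise, which is exactly what the fixed var machinery is meant to prevent.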

goodfeli commented 10 years ago

We need to solve #629 before it makes sense to write unit tests for this.

goodfeli commented 10 years ago

@vdumoulin, it's not across minibatches; it's for the lifetime of a minibatch, which might be longer than the lifetime of a Theano function call. For example, suppose you wanted to use dropout with the BGD class, which does line searches. You don't want the dropout mask to change at every point on the line search, because you'd probably just end up picking the point on the line where you happened to sample the dropout mask with the fewest zeros in it.
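The dropout scenario above can be sketched as follows (a hedged, pure-Python illustration; function names are made up for the example). With a fixed mask, the loss at each line-search point differs only through the step size, whereas resampling the mask per point lets the random draw, not the step size, drive which point wins.

```python
import random


def sample_mask(rng, n, p_drop=0.5):
    # Bernoulli dropout mask: each unit kept with probability 1 - p_drop.
    return [0.0 if rng.random() < p_drop else 1.0 for _ in range(n)]


def loss(inputs, mask, alpha):
    # Toy objective evaluated at line-search point `alpha`.
    return sum(x * m for x, m in zip(inputs, mask)) * alpha


rng = random.Random(42)
inputs = [1.0, 1.0, 1.0, 1.0]

# Correct: one mask for the whole line search over this minibatch,
# so the loss scales predictably with alpha.
mask = sample_mask(rng, len(inputs))
fixed = [loss(inputs, mask, a) for a in (0.5, 1.0, 1.5)]

# Buggy: resampling per point makes the comparison between alphas
# depend on which draw had the fewest zeros.
resampled = [loss(inputs, sample_mask(rng, len(inputs)), a)
             for a in (0.5, 1.0, 1.5)]
```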