lmjohns3 / theanets

Neural network toolkit for Python
http://theanets.rtfd.org
MIT License
328 stars 73 forks

trial for pull request #66

Closed hknerdgn closed 9 years ago

hknerdgn commented 9 years ago

Here is the pull request for the masking branch.

lmjohns3 commented 9 years ago

I've been mulling this over and had a thought this week that might be interesting to discuss. Instead of making a separate masked version of each of the recurrent networks, what about adding a "mask" flag to all network classes? This would allow, e.g., matrix completion with missing values in the case of a non-recurrent autoencoder (i.e., the loss would be computed only on the non-missing values), and probably some other interesting masked applications -- in addition to supporting masks for recurrent models. It would also make the interface for feedforward and recurrent networks the same, and prevent the need for a duplicated module.
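The masked-loss idea described above can be sketched in a few lines of numpy (the names here are illustrative, not the theanets API): the loss is computed only where the mask is 1, so missing entries contribute nothing.

```python
import numpy as np

def masked_mse(output, target, mask):
    """Mean squared error over observed (mask == 1) entries only."""
    err = mask * (output - target) ** 2
    return err.sum() / mask.sum()

target = np.array([[1.0, 2.0], [3.0, 4.0]])
output = np.array([[1.0, 0.0], [3.0, 9.0]])  # wrong only where data is "missing"
mask = np.array([[1.0, 0.0], [1.0, 0.0]])    # second column is missing

print(masked_mse(output, target, mask))  # errors on masked-out entries are ignored
```

For matrix completion with a non-recurrent autoencoder, the mask would mark the observed entries; for recurrent models, it would zero out padded time steps.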

hknerdgn commented 9 years ago

I think this may be a good idea. I'm not sure how often masks are needed in general; if the flag slows things down a lot, it may not be worth it. Otherwise, it's a nice feature to have.

The way you implemented it, as a sub-indexing mask, may make it run faster in Theano, but it's worth testing just to be sure.

Hakan


lmjohns3 commented 9 years ago

Thanks so much for working on this!

After getting another issue about weighting different examples, I've gone and updated the branch to use your approach of using weights (instead of sub-indexing). It might be a bit slower but seems more useful (sub-indexing is just a special case where weights are restricted to the set {0, 1}).
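The point that sub-indexing is a special case of weighting can be checked directly; a small numpy sketch (illustrative names, not the theanets API):

```python
import numpy as np

def weighted_mse(output, target, weights):
    """Per-element weighted mean squared error."""
    return (weights * (output - target) ** 2).sum() / weights.sum()

output = np.array([1.5, 2.0, 3.0, 4.0])
target = np.array([1.0, 0.0, 3.0, 0.0])
w01 = np.array([1.0, 0.0, 1.0, 0.0])  # weights restricted to {0, 1} = a mask

# Weighting with {0, 1} weights gives the same loss as sub-indexing
# to the selected entries.
subset_mse = ((output - target)[w01 == 1] ** 2).mean()
assert np.isclose(weighted_mse(output, target, w01), subset_mse)
```

General real-valued weights additionally allow emphasizing or de-emphasizing individual examples, which sub-indexing cannot express.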

I've also incorporated your batch-extraction code into lstm_chime.py; it seems to be running perfectly!

I think I'm not going to merge the rest of this PR. Weighting is available for all models by specifying weighted=True when creating a network, and I'd rather keep the currennt model comparison as a sort of experiment rather than a theanets example. Feel free to reopen if you think we really ought to include more of this PR.

hknerdgn commented 9 years ago

Thanks a lot for incorporating these things into theanets. This weighting feature is very nice to have.

I think it would be nice to have an easy-to-use "gradient checking" option in theanets, where one can verify the accuracy of the analytic gradients against a numerical approximation computed from forward passes only. I think Theano makes this easy to do.

I mentioned this in one of my earlier posts. Should I add it as a feature request?

Thanks again.

Hakan
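The gradient check suggested here is straightforward to sketch with finite differences (this is not an existing theanets feature, just the idea): compare an analytic gradient against a central-difference estimate that uses only forward evaluations of the loss.

```python
import numpy as np

def numerical_grad(f, x, eps=1e-6):
    """Central-difference estimate of the gradient of f at x."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e.flat[i] = eps
        g.flat[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return g

# Toy loss with a known gradient (2x), standing in for a network's loss.
f = lambda x: (x ** 2).sum()
x = np.array([0.5, -1.0, 2.0])
analytic = 2 * x
assert np.allclose(numerical_grad(f, x), analytic, atol=1e-5)
```

In practice one would run such a check on a tiny network and a few random parameter vectors, since the finite-difference loop costs two forward passes per parameter.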
