IndicoDataSolutions / Passage

A little library for text analysis with RNNs.
MIT License

Generic input layer #3

Closed kjancsi closed 9 years ago

kjancsi commented 9 years ago

While I understand that the library's main purpose is text processing, it would be great to have a generic input layer as well for sequences of real-valued input vectors.

Newmu commented 9 years ago

Yeah, that makes a lot of sense. I already have this locally as a dummy input layer that exposes the input directly to the model.

An update this weekend will include this and a linear iterator to go through datasets like that!

kjancsi commented 9 years ago

Thanks, sounds good. Any estimate when this would be available?

kjancsi commented 9 years ago

Hello, just checking on progress: any idea when this feature will land in the code? Thanks.

Newmu commented 9 years ago

Sorry for the delay; I was on other projects. The feature was added in https://github.com/IndicoDataSolutions/Passage/pull/22. Example usage applying an RNN to MNIST (reading left to right) is here - it achieves accuracy comparable to, or slightly better than, a fully connected network.
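The linked example isn't reproduced in the thread, but the "reading left to right" idea can be sketched with plain numpy: each 28x28 image becomes a sequence of 28 column vectors, which is the shape a generic real-valued input layer would consume. The array names here are illustrative, not Passage API:

```python
import numpy as np

# Illustrative batch of MNIST-shaped images: (n_samples, 28, 28), values in [0, 1].
images = np.random.rand(5, 28, 28)

# "Reading left to right": treat each image as a sequence of 28 column
# vectors of 28 features each, so the RNN steps across the image.
sequences = images.transpose(0, 2, 1)  # (n_samples, n_steps, n_features)

# The first step of the first sequence is the first column of the first image.
print(sequences.shape)                               # (5, 28, 28)
print(np.array_equal(sequences[0, 0], images[0, :, 0]))  # True
```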

kjancsi commented 9 years ago

No problem at all. Thanks a lot for adding this feature and including the MNIST example. Passage is shaping up to be a really nifty tool for RNNs. One question regarding the implementation: does this work for variable-length sequences as well?

Newmu commented 9 years ago

Thanks!

Currently Passage needs the iteration dimension to be the same length for all sequences being trained on in a minibatch. For text we handle that by padding sequences with a "PAD" token so the model can just learn to deal with it. For real-valued data, the simplest option is to zero-pad the beginning of all sequences out to the same length. Would this work for you?
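The zero-padding suggested above can be sketched in numpy; `pad_front` is a hypothetical helper, not part of Passage:

```python
import numpy as np

def pad_front(sequences, n_features):
    """Zero-pad the beginning of variable-length sequences to a common length.

    sequences: list of arrays, each shaped (timesteps_i, n_features).
    Returns one array shaped (n_sequences, max_len, n_features).
    """
    max_len = max(len(s) for s in sequences)
    batch = np.zeros((len(sequences), max_len, n_features))
    for i, s in enumerate(sequences):
        batch[i, max_len - len(s):] = s  # right-align: zeros go at the front
    return batch

seqs = [np.ones((3, 2)), np.ones((5, 2))]  # lengths 3 and 5
batch = pad_front(seqs, n_features=2)
print(batch.shape)   # (2, 5, 2)
print(batch[0, 0])   # [0. 0.]  (padding step)
print(batch[0, 2])   # [1. 1.]  (real data starts after 2 pad steps)
```

Padding at the front rather than the back means the final hidden state of the RNN is always computed right after real data, which is usually what you want for non-sequence prediction.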

kjancsi commented 9 years ago

Thanks, I'll give it a go. For text, does the padding contribute to the loss function, or do you use a mask in the loss calculation?

Newmu commented 9 years ago

When doing non-sequence prediction, padding the input won't affect the loss calculation. For sequence prediction, we currently pad, which is definitely not optimal. Mask support is a significant refactoring that's in progress, and one of the main reasons why sequence prediction support is on its own branch and not pushed to master yet.
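For reference, the masking idea discussed here (not yet in Passage per the comment above) amounts to zeroing out the per-timestep losses at padded positions and normalizing by the number of real steps. A minimal numpy sketch with made-up numbers:

```python
import numpy as np

# Per-timestep losses for 2 sequences of 4 steps each (illustrative values).
losses = np.array([[0.5, 0.2, 0.1, 0.3],
                   [0.4, 0.6, 0.0, 0.0]])

# Mask: 1 for real timesteps, 0 for padded ones
# (the second sequence ends with 2 padded steps).
mask = np.array([[1, 1, 1, 1],
                 [1, 1, 0, 0]], dtype=float)

# Padded positions contribute nothing; average over real steps only.
masked_loss = (losses * mask).sum() / mask.sum()
print(masked_loss)  # ~0.35, i.e. (0.5+0.2+0.1+0.3+0.4+0.6) / 6
```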