sdobber / FluxArchitectures.jl

Complex neural network examples for Flux.jl
MIT License

Size of Input in LeNet5 of LSTNet #2

Closed by luboshanus 4 years ago

luboshanus commented 4 years ago

Hi, I didn't find any other way to contact you.

Would you be willing to explain how the network can accept batches of different lengths without calling reset!? Is it because LeNet5() does some reshaping? I am not very familiar with convolutional networks. I'd like to forecast a basic AR model using just an LSTM, for example.

I am aware of the many examples on Discourse, and it makes sense to call reset! on hidden states. But you do not call it in your function, which is why I am asking.

Thanks!

sdobber commented 4 years ago

Hi! When training networks with batches of different sizes, you need to call Flux.reset! whenever the batch size changes, provided the network contains hidden states somewhere. In this repository, there will usually be a GRU or LSTM layer that needs a reset. DSAnet does not have any hidden states, and thus should work without reset! - I just copied the example file and forgot to delete the lines ;-)
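To illustrate why the batch size matters, here is a minimal sketch in plain Julia. It is not Flux's implementation and not code from this repository: ToyRecur and toy_reset! are hypothetical names made up for this example. The toy layer stores its hidden state between calls, so after one call the state has one column per sample in the batch, and a later batch of a different size no longer fits until the state is reset.

```julia
# Toy stateful recurrent layer (hypothetical; plain Julia, not Flux's actual code).
mutable struct ToyRecur
    Wx::Matrix{Float64}   # input-to-hidden weights, size (hidden, in)
    Wh::Matrix{Float64}   # hidden-to-hidden weights, size (hidden, hidden)
    h0::Vector{Float64}   # initial hidden state, length hidden
    h::VecOrMat{Float64}  # stored state; becomes (hidden, batch) after a call
end

# Convenience constructor with small random weights and a zero initial state.
ToyRecur(nin::Int, nhidden::Int) =
    ToyRecur(0.1 .* randn(nhidden, nin), 0.1 .* randn(nhidden, nhidden),
             zeros(nhidden), zeros(nhidden))

# One recurrent step on a batch x of size (in, batch).
function (m::ToyRecur)(x::AbstractMatrix)
    # Wx * x is (hidden, batch); Wh * h is (hidden,) or (hidden, previous batch).
    # The broadcasted sum only works if the two batch sizes agree.
    m.h = tanh.(m.Wx * x .+ m.Wh * m.h)
    return m.h
end

# Restore the initial state, which broadcasts against any batch size.
toy_reset!(m::ToyRecur) = (m.h = copy(m.h0); m)

layer = ToyRecur(3, 5)
y1 = layer(randn(3, 4))        # batch of 4: stored state is now 5×4

# A batch of 2 clashes with the stored 5×4 state.
failed = try
    layer(randn(3, 2))
    false
catch err
    err isa DimensionMismatch
end

toy_reset!(layer)              # back to the initial state
y2 = layer(randn(3, 2))        # batch of 2 now works
```

A layer without stored state, such as a convolutional or dense layer, has nothing that carries over between calls, which is why no reset is needed there.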

I hope this helps!

luboshanus commented 4 years ago

Hi, thanks a lot. I understand now. Also, my apologies: I was reading a lot of code and wrongly thought that LeNet5 was one of your functions. It is in the model-zoo for Flux and it contains only convolutional layers. Thanks once more!