jacobkimmel / pytorch_convgru

Convolutional Gated Recurrent Units implemented in PyTorch
MIT License
191 stars, 41 forks

hidden size and kernel size #2

Closed bragilee closed 5 years ago

bragilee commented 5 years ago

Thanks for your work. I'm wondering whether the hidden size and kernel size in your examples were found to be optimal through experiments, or whether they depend entirely on the task?

Thanks. :)

jacobkimmel commented 5 years ago

hey, thanks for the kind words.

The hidden size and kernel size in the example were chosen arbitrarily. I haven't run any experiments to optimize them for any particular data set.

In the literature, you'll find that these sorts of hyper-parameter choices are often made by rule of thumb (e.g. picking some nice round numbers). You want to ensure your model has enough capacity to represent the data you're feeding in, so there are some reasonable lower bounds you can set on possible values.
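To get a rough feel for how capacity scales with these two settings, here is a minimal sketch that counts the learnable parameters in a single ConvGRU cell. It assumes the common three-gate layout (reset, update, candidate), where each gate is one 2D convolution with bias over the concatenated input and hidden state. The function name `convgru_cell_param_count` and the channel counts in the loop are illustrative choices, not values from this repo.

```python
def convgru_cell_param_count(input_channels, hidden_channels, kernel_size):
    """Count learnable parameters in one ConvGRU cell (assumed 3-gate layout)."""
    # Each gate convolves the concatenated [input, hidden] tensor.
    in_channels = input_channels + hidden_channels
    weights_per_gate = in_channels * hidden_channels * kernel_size * kernel_size
    biases_per_gate = hidden_channels
    # Three gates: reset, update, and candidate (output).
    return 3 * (weights_per_gate + biases_per_gate)


# Illustrative sweep: the input channel count and the grids are arbitrary.
for hidden in (16, 32, 64):
    for kernel in (3, 5):
        n = convgru_cell_param_count(
            input_channels=8, hidden_channels=hidden, kernel_size=kernel
        )
        print(f"hidden={hidden:3d} kernel={kernel} params={n}")
```

The count grows roughly quadratically in both the hidden size and the kernel size, so doubling either one is a large jump in capacity and compute.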

That said, it's nearly impossible to determine a priori which parameter combinations will lead to optimal performance.
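In practice that usually means searching over a few candidate values on your own data. Below is a minimal grid-search sketch. The `evaluate` callback is hypothetical: you would supply a function that trains a model with the given hidden size and kernel size and returns a validation loss (lower is better). Nothing here is specific to this repo.

```python
from itertools import product


def grid_search(hidden_sizes, kernel_sizes, evaluate):
    """Score every (hidden_size, kernel_size) pair and return the best one.

    `evaluate` is a caller-supplied (hypothetical) function that trains a
    model with the given settings and returns a validation loss.
    """
    results = {}
    for hidden_size, kernel_size in product(hidden_sizes, kernel_sizes):
        results[(hidden_size, kernel_size)] = evaluate(hidden_size, kernel_size)
    # Lower loss is better, so pick the key with the minimum score.
    best = min(results, key=results.get)
    return best, results
```

A grid is the simplest option. For larger search spaces, random search over the same ranges is a common alternative.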

bragilee commented 5 years ago

Got it. Thanks for your explanations.