Condition vector size different than LSTM hidden states?

Hi there,

Thank you for this wonderful repo. I have a theoretical question. Suppose the condition vector is small (say, size 10) and the LSTM layer is larger (say, 64 hidden units). How can the condition vector be used to initialize the LSTM state at the very first time step? I think that mapping the vector as in $\vec{v} = \mathbf{W}\vec{x} + \vec{b}$ gives you matching dimensions, but wouldn't that cause a lot of zeros in the matrix?

Warm Regards.
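
To make the dimension question concrete, here is a minimal sketch, assuming the mapping is an ordinary dense (fully connected) layer. It is not taken from the repo's code, and the names `dense_projection`, `cond_dim`, and `hidden_dim` are hypothetical. With a condition of size 10 and 64 hidden units, W has shape 64 x 10 and every entry is a free parameter, so the matrix itself has no structural zeros. Zeros only appear if the condition is zero-padded up to 64 instead of projected, which is shown for contrast.

```python
import random


def dense_projection(x, W, b):
    """Compute v = W x + b, with W given as a list of rows."""
    return [
        sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
        for row, b_i in zip(W, b)
    ]


random.seed(0)
cond_dim, hidden_dim = 10, 64  # sizes from the question

# W has shape (hidden_dim, cond_dim). Random values stand in for learned weights.
W = [[random.gauss(0.0, 0.1) for _ in range(cond_dim)] for _ in range(hidden_dim)]
b = [0.0] * hidden_dim

# A made-up condition vector of size 10.
x = [random.random() for _ in range(cond_dim)]

# Projected initial state: 64 values, each a weighted mix of all 10 inputs.
h0 = dense_projection(x, W, b)

# Contrast: zero-padding the condition to size 64 leaves 54 entries at zero.
padded = x + [0.0] * (hidden_dim - cond_dim)

print(len(h0), len(padded))
```

In this sketch each of the 64 entries of `h0` depends on all 10 condition values, whereas the zero-padded vector carries information in its first 10 entries only.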

philipperemy / cond_rnn