I saw 'generate a sequence from the start state s0 to maximize its expected end reward ...' in the paper. What exactly does s0 mean? In the code, I see START_TOKEN=0 and h0=zeros. Which one is the start state?
@shaomai00 I think START_TOKEN is the first input fed to the generator's LSTM, while h0 is the initial value of the LSTM's hidden state and cell state.
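To make the distinction concrete, here is a minimal sketch of how the two values could enter a generation loop. It is not the repository's code: it uses plain Python instead of TensorFlow, and the sizes (VOCAB_SIZE, EMB_DIM, HIDDEN_DIM, SEQ_LEN), the random weights, and the helper names (lstm_step, sample_next, generate) are all hypothetical. As far as I understand the paper, the state is the prefix of tokens generated so far, so s0 would be the empty prefix; START_TOKEN and the zero h0 are just how an implementation can represent "nothing generated yet".

```python
import math
import random

START_TOKEN = 0   # first input token, as in the question
VOCAB_SIZE = 5    # hypothetical sizes, chosen only for illustration
EMB_DIM = 4
HIDDEN_DIM = 3
SEQ_LEN = 6

rng = random.Random(0)


def rand_matrix(rows, cols):
    """Small random weights; a real model would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


embeddings = rand_matrix(VOCAB_SIZE, EMB_DIM)
# One weight matrix per gate: input (i), forget (f), output (o), candidate (c).
gates = {g: rand_matrix(HIDDEN_DIM, EMB_DIM + HIDDEN_DIM) for g in "ifoc"}
W_out = rand_matrix(VOCAB_SIZE, HIDDEN_DIM)


def lstm_step(token, h, c):
    """One LSTM step (biases omitted): consume a token, update (h, c)."""
    z = embeddings[token] + h  # concatenate token embedding and hidden state
    i = [sigmoid(a) for a in matvec(gates["i"], z)]
    f = [sigmoid(a) for a in matvec(gates["f"], z)]
    o = [sigmoid(a) for a in matvec(gates["o"], z)]
    g = [math.tanh(a) for a in matvec(gates["c"], z)]
    c_new = [fi * ci + ii * gi for fi, ci, ii, gi in zip(f, c, i, g)]
    h_new = [oi * math.tanh(ci) for oi, ci in zip(o, c_new)]
    return h_new, c_new


def sample_next(h):
    """Sample the next token from a softmax over the output logits."""
    logits = matvec(W_out, h)
    top = max(logits)
    weights = [math.exp(l - top) for l in logits]
    return rng.choices(range(VOCAB_SIZE), weights=weights)[0]


def generate():
    h = [0.0] * HIDDEN_DIM  # h0: zero hidden state
    c = [0.0] * HIDDEN_DIM  # zero cell state
    token = START_TOKEN     # first input to the LSTM
    sequence = []           # empty prefix: nothing generated yet
    for _ in range(SEQ_LEN):
        h, c = lstm_step(token, h, c)
        token = sample_next(h)
        sequence.append(token)
    return sequence


sequence = generate()
print(sequence)
```

In this sketch START_TOKEN is only ever an input and never appears as a position in the returned sequence, while h0 is simply the memory the LSTM starts from before it has read anything.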