Closed: Ouya-Bytes closed this issue 8 years ago
Hi @OuYag
Yes, l_{t,i} is a scalar. It represents the attention weight of the i-th location at timestep t.
For the paper: x_t for a single data point has shape (D,). We had D=1024 since we used GoogLeNet features. For one data point it is a vector: the average of the feature cube's location vectors, weighted by the attention over locations.
In the code: Since we use batches, this becomes (batchsize,D) for each minibatch of examples. It is represented by the variable `ctx` https://github.com/kracwarlock/action-recognition-visual-attention/blob/master/src/actrec.py#L278
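To make the shapes concrete, here is a minimal NumPy sketch of the weighted average described above (not the repo's Theano code; the batch size and the K=49 locations for a 7x7 feature map are illustrative assumptions, D=1024 as stated):

```python
import numpy as np

batch, K, D = 2, 49, 1024   # assumed: 7x7 = 49 locations; D=1024 GoogLeNet features
X = np.random.randn(batch, K, D)   # feature cube: one D-dim vector per location
scores = np.random.randn(batch, K)

# softmax over locations -> attention weights l_{t,i}, each a scalar, summing to 1
l = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)

# x_t = sum_i l_{t,i} * X_{t,i}  -> one vector of shape (D,) per example,
# so (batch, D) for a minibatch, like `ctx` in the code
ctx = (l[:, :, None] * X).sum(axis=1)
print(ctx.shape)  # (2, 1024)
```

So per data point x_t is a vector of shape (D,), and the batched version `ctx` has shape (batchsize, D).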
Hello @kracwarlock. I want to know: does l_{t,i} represent a scalar? And what is the shape of x_t = sum_{i=1}^{K} l_{t,i} X_{t,i}? Is x_t a vector, a matrix, or a cube?