TrungThanhTran opened this issue 4 years ago
@TranTony I found that the function expands `visual` (i.e., repeats the tensor `beam_size` times) at the first step, and for subsequent steps the `visual` fed as input is the same as the output of the function call.
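For reference, here is a minimal sketch of what such an expansion step could look like; the function name, body, and argument conventions are my assumptions for illustration, not the repository's actual code:

```python
import torch

def expand_visual_sketch(visual, beam_size, cur_beam_size, selected_beam):
    """Hypothetical re-implementation, for illustration only."""
    b_s = visual.shape[0] // cur_beam_size
    seq_len, d_input = visual.shape[1], visual.shape[2]
    if cur_beam_size == 1:
        # First step: repeat each sample's features beam_size times,
        # (b_s, seq_len, d_input) -> (b_s * beam_size, seq_len, d_input).
        visual = visual.unsqueeze(1).expand(b_s, beam_size, seq_len, d_input)
        visual = visual.reshape(b_s * beam_size, seq_len, d_input)
    else:
        # Later steps: reorder rows by the surviving beams. Because every
        # beam of a sample carries identical visual features, this gather
        # returns a tensor equal to its input.
        visual = visual.view(b_s, cur_beam_size, seq_len, d_input)
        idx = selected_beam.view(b_s, cur_beam_size, 1, 1)
        idx = idx.expand(b_s, cur_beam_size, seq_len, d_input)
        visual = visual.gather(1, idx)
        visual = visual.view(b_s * cur_beam_size, seq_len, d_input)
    return visual
```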
Example:
If I feed in `visual` as a FloatTensor of size (4, 50, 2048) (b_s, seq_len, d_input) with beam_size=5, then `self._expand_visual` returns a FloatTensor of size (20, 50, 2048) (b_s * beam_size, seq_len, d_input) at the first step of beam search. For subsequent steps of beam search, `visual` of shape (20, 50, 2048), as expected, is fed as an argument to `self._expand_visual`, and the output tensor it generates is the same as the input:
```python
old_visual = visual
visual = self._expand_visual(visual, cur_beam_size, selected_beam)
print(torch.equal(old_visual, visual))
# >> True
```
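That equality is exactly what you would expect if every beam of a sample holds identical visual features; here is a toy check (made-up shapes, independent of the repo):

```python
import torch

# Toy check: when all beams of a sample hold the same rows, gathering
# rows by any per-beam index reproduces the input exactly.
b_s, beam_size, seq_len, d = 2, 3, 4, 5
visual = torch.randn(b_s, 1, seq_len, d).expand(b_s, beam_size, seq_len, d)
selected_beam = torch.randint(0, beam_size, (b_s, beam_size))
idx = selected_beam.view(b_s, beam_size, 1, 1).expand(b_s, beam_size, seq_len, d)
print(torch.equal(visual.gather(1, idx), visual))  # True
```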
Did I miss anything? Also, how did you intend to speed up beam search?
I reduced beam_size to 1 or 2 and found that it achieves the same result. However, I applied it to an auto-annotation problem which generates about 50 words at a time, so I don't think you need to reduce it. I also reduced the number of connections and layers of the encoder and decoder. About `visual`: yes, its outcome is the same as your output.
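If the call really is a no-op after the first step, one possible micro-optimization (a sketch under that assumption, with a hypothetical helper name, not the repo's API) is to expand `visual` once before the decoding loop and skip the per-step gather:

```python
import torch

def expand_once(visual, beam_size):
    # Hypothetical helper: repeat each sample's features beam_size times a
    # single time, before the decoding loop, instead of re-gathering them
    # at every beam-search step.
    b_s, seq_len, d_input = visual.shape
    return (visual.unsqueeze(1)
                  .expand(b_s, beam_size, seq_len, d_input)
                  .reshape(b_s * beam_size, seq_len, d_input))
```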
Hi @baraldilorenzo,
I'm trying to improve the speed of beam_search. While doing so, I found this call in the `iter` function of beam_search.py:

```python
visual = self._expand_visual(visual, cur_beam_size, selected_beam)
```

Please tell me what this means.
T.T.T