v-iashin / MDVC

PyTorch implementation of Multi-modal Dense Video Captioning (CVPR 2020 Workshops)
https://v-iashin.github.io/mdvc

Bi-SST implementation #3

Closed by amanchadha 4 years ago

amanchadha commented 4 years ago

Hello again!

From the paper:

Firstly, we obtain the temporal event locations. For this task, we employ the Bidirectional Single-Stream Temporal action proposals network (Bi-SST) proposed in [48]

Per [48], Bi-SST runs a forward pass and a backward pass over the video and then combines the two sets of results with a fusion operation (typically multiplication). Could you please point me to where this is implemented in "models/transformer.py"? Thanks!
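For readers unfamiliar with the fusion step mentioned above, here is a minimal sketch of multiplicative fusion in plain Python. It is not code from this repository or from [48]: the function name `fuse_proposals`, the `(start, end)` keys, and the score values are all hypothetical, and it assumes the forward and backward passes have already been aligned to the same candidate segments.

```python
def fuse_proposals(forward, backward):
    """Fuse confidence scores from two passes by multiplication.

    forward, backward: dicts mapping a (start, end) segment to a
    confidence in [0, 1]. Both the layout and the name are assumptions
    made for this illustration only.
    """
    fused = {}
    for span, fwd_conf in forward.items():
        # Keep only segments that both passes scored.
        if span in backward:
            fused[span] = fwd_conf * backward[span]
    return fused


# Hypothetical scores for three candidate segments (in seconds).
forward = {(0, 4): 0.75, (2, 6): 0.5, (5, 9): 0.25}
backward = {(0, 4): 0.5, (2, 6): 0.25}

fused = fuse_proposals(forward, backward)

# Rank the surviving segments by fused confidence, highest first.
ranked = sorted(fused.items(), key=lambda kv: kv[1], reverse=True)
```

Multiplying the two scores means a segment ranks highly only when both passes are confident about it, which is one reason a product can be preferred over, say, an average.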

v-iashin commented 4 years ago

Hi! Please see the response to Issue #2.