Closed by csyanbin 5 years ago
Thanks for the reply.
In the P module:
H-stream -- res-block -- 2 FCs
O-stream -- res-block -- 2 FCs
SP-stream -- conv -- 2 FCs
In the C module:
H-stream -- res-block -- 4 FCs
O-stream -- res-block -- 4 FCs
S-stream -- conv -- 2 FCs
Is this structure used in the paper?
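To make the comparison easier to check, the layout listed above can be written down as a small data structure. This is only a sketch of the stream layout as stated in this thread, not the authors' code: the `Stream` class and the names `P_MODULE`, `C_MODULE`, `describe`, and `total_fcs` are hypothetical, and layer widths are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stream:
    """One stream of a module, as listed in this thread (hypothetical helper)."""
    name: str      # stream label: "H", "O", "SP", or "S"
    backbone: str  # "res-block" or "conv"
    num_fcs: int   # number of fully connected layers after the backbone


# Layout as described in the comment above; widths are not encoded here.
P_MODULE = (
    Stream("H", "res-block", 2),
    Stream("O", "res-block", 2),
    Stream("SP", "conv", 2),
)
C_MODULE = (
    Stream("H", "res-block", 4),
    Stream("O", "res-block", 4),
    Stream("S", "conv", 2),
)


def describe(module_name, streams):
    """Return one human-readable line per stream."""
    return [
        f"{module_name}: {s.name}-stream -- {s.backbone} -- {s.num_fcs} FCs"
        for s in streams
    ]


def total_fcs(streams):
    """Total number of FC layers across all streams of a module."""
    return sum(s.num_fcs for s in streams)


if __name__ == "__main__":
    for line in describe("P", P_MODULE) + describe("C", C_MODULE):
        print(line)
```

Under this layout, the only difference between P and C is the FC depth of the H and O streams (2 versus 4); the spatial stream keeps 2 FCs in both.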
Yeah. Notably, if you use a larger C and train P and C jointly, their convergence may not be synchronous. Usually, C needs more time to reach better performance.
Really wonderful work!
I have 2 questions about the architecture details. (1) In Sec 5.2, "Relatively, the spatial stream is composed of two convolutional layers with max pooling, and two 1024 sized FCs", so two 1024-sized FCs are used in the spatial stream of C. In Figure 3, it seems four are used, as in the H and O streams.
Which one is correct?
(2) In the iCAN paper, the Residual Block has 2048 channels. In my understanding, you used 1024 channels in your paper instead, followed by four 1024-sized FCs. Am I right?
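For a rough sense of what the channel count in question (2) changes, here is a back-of-the-envelope parameter count for the FC stack. It is a sketch under assumptions that are not confirmed in this thread: the residual block output is pooled to a vector of `in_features` values, all four FCs are 1024 wide, and each FC has a bias. The function name `fc_stack_params` is hypothetical.

```python
def fc_stack_params(in_features, widths):
    """Count weights and biases in a stack of fully connected layers.

    Assumes each layer is a plain affine map (weight matrix plus bias),
    which is an assumption for illustration, not a detail from the paper.
    """
    total = 0
    prev = in_features
    for width in widths:
        total += prev * width + width  # weight matrix + bias vector
        prev = width
    return total


if __name__ == "__main__":
    four_fcs = [1024] * 4
    with_2048 = fc_stack_params(2048, four_fcs)  # iCAN-style 2048 channels
    with_1024 = fc_stack_params(1024, four_fcs)  # 1024 channels, as asked in (2)
    print(with_2048, with_1024, with_2048 - with_1024)
```

Under these assumptions only the first FC depends on the channel count, so going from 2048 to 1024 input channels removes 1024 × 1024 weights from the stack and leaves the other three layers unchanged.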