Open Justarrrrr opened 1 month ago
@Justarrrrr Sorry for the confusion. The key-frame extractor probably should have had 7 layers in total.
I apologize for not replying to your email earlier, and I'm pleasantly surprised to receive your response! So, if I understand correctly, the original intention was to have 7 blocks, but the code implementation only includes 5 blocks, is that correct?
The paper says the key-frame feature extractor uses 7 residual blocks, but
the code seems to use only 5.
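
For anyone who wants to check the count themselves, here is a minimal sketch of one way to do it. It assumes the model exposes a `modules()` iterator the way `torch.nn.Module` does. The names `count_blocks`, `Node`, `ResBlock`, and `KeyFrameExtractor` are hypothetical stand-ins, not the repository's actual classes, and the stand-in classes exist only so the snippet runs without PyTorch installed.

```python
def count_blocks(model, block_class_name: str) -> int:
    """Count submodules whose class name equals block_class_name.

    Expects model.modules() to yield the model itself and every nested
    submodule, which is how torch.nn.Module.modules() behaves.
    """
    return sum(1 for m in model.modules() if type(m).__name__ == block_class_name)


class Node:
    """Tiny stand-in for a module tree (assumption, not the real code)."""

    def __init__(self, *children):
        self.children = list(children)

    def modules(self):
        # Yield this node first, then recurse into its children.
        yield self
        for child in self.children:
            yield from child.modules()


class ResBlock(Node):  # hypothetical residual block class
    pass


class KeyFrameExtractor(Node):  # hypothetical extractor class
    def __init__(self, num_blocks: int):
        super().__init__(*(ResBlock() for _ in range(num_blocks)))


extractor = KeyFrameExtractor(num_blocks=5)
print(count_blocks(extractor, "ResBlock"))  # prints 5
```

With the real model, the same `count_blocks` call should work on the key-frame extractor instance once the second argument is replaced with the actual residual block class name used in the code.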