haochen-rye / HNeRV

Official PyTorch implementation for HNeRV: a hybrid video neural representation (CVPR 2023)
https://haochen-rye.github.io/HNeRV/

Architecture #13

Closed: lvhyang closed this issue 5 months ago

lvhyang commented 5 months ago

Hello, I noticed that the HNeRV architecture in your code seems to differ slightly from the one in the paper. In the paper, a ConvNeXt encoder comes first, then a learning-based embedding, and finally the decoder, as shown in the following figure.

(figure: image upload did not complete)

In the code, however, I found that an optional positional encoding is used for the embedding, followed by the ConvNeXt encoder and then the decoder, as shown in the following figure.

Why is there the line of code `img_embed = self.encoder(input)`? Which part of the code implements the learning-based small embedding?

(figure: image not preserved)
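
The two data flows described in the question can be sketched in a few lines. This is a toy illustration only, not the repository's code: `positional_encoding`, `toy_encoder`, `toy_decoder`, and `forward` are hypothetical stand-ins operating on plain Python lists, and the sketch assumes one possible reading of the quoted line, namely that the output of the encoder is itself the small embedding passed to the decoder, with positional encoding as an alternative way to produce that embedding from a frame index.

```python
import math


def positional_encoding(t, num_freqs=4, base=1.25):
    """Fixed (non-learned) sin/cos embedding of a normalized frame index t."""
    out = []
    for k in range(num_freqs):
        out.append(math.sin(base ** k * math.pi * t))
        out.append(math.cos(base ** k * math.pi * t))
    return out


def toy_encoder(frame, embed_size=4):
    """Hypothetical stand-in for a learned encoder (e.g. ConvNeXt).

    It average-pools a flat frame into a much smaller vector, so the
    embedding depends on the frame content rather than on the index.
    """
    chunk = len(frame) // embed_size
    return [sum(frame[i * chunk:(i + 1) * chunk]) / chunk
            for i in range(embed_size)]


def toy_decoder(embed, out_size=16):
    """Hypothetical stand-in for the decoder: upsample by repetition."""
    rep = out_size // len(embed)
    return [v for v in embed for _ in range(rep)]


def forward(inp, use_positional_encoding):
    """Assumed data flow: embedding first, then decoder."""
    if use_positional_encoding:
        # inp is a frame index; the embedding is a fixed function of it.
        img_embed = positional_encoding(inp)
    else:
        # inp is the frame itself; the encoder output IS the small embedding.
        img_embed = toy_encoder(inp)
    return img_embed, toy_decoder(img_embed)


frame = [float(i) for i in range(16)]
embed, recon = forward(frame, use_positional_encoding=False)
print(len(frame), "->", len(embed), "->", len(recon))
```

Under this reading there is no separate embedding module between encoder and decoder: the variable named `img_embed` holds the content-dependent embedding, and it is small simply because the encoder downsamples. Whether the actual repository composes positional encoding and the encoder or treats them as alternatives is not settled by this sketch.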