gemcollector / PIE-G

This is the repo of the NeurIPS 2022 paper: "Pre-Trained Image Encoder for Generalizable Visual Reinforcement Learning"
MIT License

Constructing the `conv` features from backbone #3

Open tungts1101 opened 6 days ago

tungts1101 commented 6 days ago

What does this part do when extracting features from the backbone? I could not find any mention of it in the paper.

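# features of every frame except the first (frames 1 .. time_step-1 along dim 1)
conv_current = conv[:, 1:, :, :, :]
# despite the name, this holds the difference between each frame and its predecessor;
# the predecessor is detached, so no gradient flows through it
conv_prev = conv_current - conv[:, :time_step - 1, :, :, :].detach()
# concatenate the features and their temporal differences along dim 1
conv = torch.cat([conv_current, conv_prev], axis=1)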
gemcollector commented 5 days ago

Hi there, we mention this in our paper: “Then, the embeddings from the second layer of the pre-trained model are fused as input features to the policy networks [64, 51].” This is the FLARE-style fusion operation, which is commonly used in many works. Please check [64, 51] for further information.
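
For readers who want to see the shapes, here is a minimal, self-contained sketch of what the quoted lines appear to compute. It is an illustration under assumptions, not the repository's exact code: the function name `fuse_latent_flow` is hypothetical, the input is assumed to be `(batch * time_step, C, H, W)` features from running the backbone on each stacked frame separately, and the reshapes before and after the three quoted lines are assumed rather than taken from the snippet above.

```python
import torch


def fuse_latent_flow(conv: torch.Tensor, time_step: int) -> torch.Tensor:
    """Fuse per-frame features with their temporal differences (hypothetical helper).

    Assumes `conv` has shape (batch * time_step, C, H, W).
    """
    # Assumed step: recover the time axis, giving (B, T, C, H, W).
    conv = conv.view(conv.size(0) // time_step, time_step, *conv.shape[1:])
    # Features of frames 1 .. T-1.
    conv_current = conv[:, 1:]
    # Difference to the preceding frame; the earlier frame is detached,
    # so gradients only flow through the later frame.
    conv_delta = conv_current - conv[:, : time_step - 1].detach()
    # Stack features and differences along the time axis: (B, 2*(T-1), C, H, W).
    fused = torch.cat([conv_current, conv_delta], dim=1)
    # Assumed step: fold the time axis into channels for the downstream head.
    return fused.view(fused.size(0), -1, fused.size(3), fused.size(4))


# Toy input: B=2, T=3, C=H=W=1, with values 0..5 so the result is easy to read.
feats = torch.arange(6, dtype=torch.float32).view(6, 1, 1, 1)
out = fuse_latent_flow(feats, time_step=3)
print(out.shape)
print(out.flatten(1))
```

With `time_step = 3`, each sample ends up with the features of frames 1 and 2 followed by the two frame-to-frame differences, so the policy sees both the latest embeddings and how they changed.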