jannerm / trajectory-transformer

Code for the paper "Offline Reinforcement Learning as One Big Sequence Modeling Problem"
https://trajectory-transformer.github.io
MIT License

double forward in goal gpt #5

Closed Howuhh closed 2 years ago

Howuhh commented 2 years ago

Hi! I noticed one more thing that isn't straightforward in the goal-conditioned version of GPT.

Here: https://github.com/jannerm/trajectory-transformer/blob/e0b5f12677a131ee87c65bc01179381679b3cfef/trajectory/models/transformers.py#L288-L295

After you append the goal embeddings to the main sequence, you apply self.blocks twice. Is that how it's intended to work? Shouldn't a single pass be enough, since the attention mechanism already gives every embedding the needed information about the goal?
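
For reference, here is a minimal sketch of the pattern being described. The names (`blocks`, `token_embeddings`, `goal_embeddings`) are illustrative stand-ins, not the exact identifiers from the linked `trajectory/models/transformers.py` lines:

```python
import torch
import torch.nn as nn

class GoalConditionedGPTSketch(nn.Module):
    """Illustrative sketch of the goal-conditioned forward pass in question."""

    def __init__(self, embed_dim: int = 128, n_heads: int = 4, n_layers: int = 4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(embed_dim, n_heads, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)

    def forward(self, token_embeddings: torch.Tensor, goal_embeddings: torch.Tensor) -> torch.Tensor:
        # Concatenate the goal embeddings with the main sequence so that
        # every token can attend to the goal.
        x = torch.cat([goal_embeddings, token_embeddings], dim=1)

        # Behavior reported in this issue (the blocks applied twice):
        #   x = self.blocks(self.blocks(x))
        # A single pass should suffice: within one forward call, attention
        # already propagates the goal information to every position.
        x = self.blocks(x)
        return x
```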

jannerm commented 2 years ago

Good catch! Fixed in a commit.