Open Titolasanta opened 5 months ago
Hi, while researching your paper I read that the architecture uses a hierarchical structure similar to the Hourglass transformer (https://arxiv.org/abs/2110.13711). Since your case uses an encoder-decoder transformer, how do you reshape what would usually be the decoder's input, which consists of both the encoder output (which I would expect to be downsampled) and the original input, which is still in its original shape?
Thank you

Thank you for your interest and your question. It may help to refer to the following piece of our implementation:
https://github.com/amazon-science/earth-forecasting-transformer/blob/7732b03bdb366110563516c3502315deab4c2026/src/earthformer/cuboid_transformer/cuboid_transformer.py#L3192-L3195
The decoder takes as input the outputs from each layer of the encoder, together with a dummy input named initial_z that matches the shape of the encoder's final-layer output. In other words, the original-resolution input is not fed to the decoder directly; the per-level encoder outputs carry that information at each resolution.
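To illustrate the idea (this is only a minimal NumPy shape sketch, not the actual Earthformer code: strided slicing stands in for pooling, addition stands in for cross-attention fusion, and all names like `mem` and `initial_z` mirror the description above rather than the real API):

```python
import numpy as np

def encoder(x, num_levels=3):
    """Toy hierarchical encoder: each level halves the spatial size.
    Returns the list of per-level outputs (mem), coarsest last."""
    mem = []
    for _ in range(num_levels):
        x = x[:, ::2, ::2, :]           # downsample H and W by 2 (stand-in for pooling)
        mem.append(x)
    return mem

def decoder(mem, initial_z):
    """Toy hierarchical decoder: starts from initial_z (same shape as the
    coarsest encoder output) and moves coarse-to-fine, fusing the
    matching-resolution encoder output at each level."""
    z = initial_z
    for skip in reversed(mem):
        assert z.shape == skip.shape    # resolutions line up level by level
        z = z + skip                    # stand-in for cross-attention fusion
        if skip is not mem[0]:          # upsample 2x until the finest level
            z = np.repeat(np.repeat(z, 2, axis=1), 2, axis=2)
    return z

x = np.ones((1, 16, 16, 8))             # (batch, H, W, channels)
mem = encoder(x)                        # shapes: 8x8, 4x4, 2x2
initial_z = np.zeros_like(mem[-1])      # dummy input matching the coarsest output
y = decoder(mem, initial_z)             # shape (1, 8, 8, 8), the finest skip level
```

Because each decoder level consumes the encoder output at its own resolution, no reshaping between the original input shape and the downsampled encoder output is ever needed.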