I think it should be pretty simple: get rid of the Perceiver. Instead, just use a stack of nn.TransformerEncoderLayer modules (stacked on top of each other using the same mechanism as in Perceiver; maybe create a separate nn.Module called MultiLayerTransformerEncoder that abstracts this out).
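A minimal sketch of what that MultiLayerTransformerEncoder wrapper could look like (the dims and layer count here are illustrative, not fixed by the notes above):

```python
import torch
import torch.nn as nn


class MultiLayerTransformerEncoder(nn.Module):
    """A plain stack of nn.TransformerEncoderLayer modules applied in sequence."""

    def __init__(self, d_model: int, num_heads: int, num_layers: int):
        super().__init__()
        self.layers = nn.ModuleList(
            [
                nn.TransformerEncoderLayer(
                    d_model=d_model,
                    nhead=num_heads,
                    batch_first=True,  # inputs are (batch, seq, d_model)
                )
                for _ in range(num_layers)
            ]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Each layer consumes the previous layer's output, same as stacking
        # encoder layers inside the Perceiver.
        for layer in self.layers:
            x = layer(x)
        return x
```

(PyTorch also ships nn.TransformerEncoder, which does essentially this stacking for you; the explicit ModuleList version just makes it easy to customize per-layer behavior later.)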
Stack (concatenate) (byte_array, query) along the sequence dimension and feed the result through the encoder.
Then the final output = attn_output[:, len(byte_array):]
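The stack-then-slice flow above, as a runnable sketch (all dims are illustrative; note that on a batched (batch, seq, dim) tensor the sequence length is shape[1], not len(), which would give the batch size):

```python
import torch
import torch.nn as nn

d_model, num_heads, num_layers = 64, 8, 2
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d_model, nhead=num_heads, batch_first=True),
    num_layers=num_layers,
)

batch = 4
byte_array = torch.randn(batch, 100, d_model)  # (batch, byte_len, d_model)
query = torch.randn(batch, 16, d_model)        # (batch, query_len, d_model)

# Stack along the sequence dimension, encode, then keep only the
# positions that correspond to the query.
x = torch.cat([byte_array, query], dim=1)
attn_output = encoder(x)
output = attn_output[:, byte_array.shape[1]:]

print(output.shape)  # torch.Size([4, 16, 64])
```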
Still TODO:
[x] Somehow allow us to use more heads. Need to give byte_array an embedding dimension that is exactly divisible by num_heads.
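One way to satisfy that divisibility constraint (a sketch; byte_dim and num_heads here are made-up values): project the raw byte embeddings up to the nearest dimension that num_heads divides evenly, since nn.TransformerEncoderLayer requires d_model % nhead == 0.

```python
import torch
import torch.nn as nn

byte_dim, num_heads = 100, 8

# Round byte_dim up to the nearest multiple of num_heads.
d_model = ((byte_dim + num_heads - 1) // num_heads) * num_heads  # 104
proj = nn.Linear(byte_dim, d_model)

byte_array = torch.randn(4, 50, byte_dim)  # (batch, seq, byte_dim)
x = proj(byte_array)                       # (batch, seq, d_model)
assert x.shape[-1] % num_heads == 0
```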