Raphaaal / fieldy

Fine-grained attention in hierarchical transformers for tabular time-series.

Regarding Position Encoding #1

Open isCopyman opened 1 month ago

isCopyman commented 1 month ago

The paper does not mention a specific positional encoding method. I took a brief look at the code. May I ask if a learnable encoding method is being used?

Raphaaal commented 1 month ago

Hi,

Thank you for your interest in this work.

Yes, we used two standard learnable positional embeddings (BERT-like), one for the row position and one for the column index.

That is to say, a vector is learned for each position throughout training and added element-wise to the token embeddings.
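
For reference, here is a minimal sketch of what such BERT-style learnable row and column positional embeddings could look like in PyTorch. This is not the repository's actual implementation; the class and dimension names (`max_rows`, `max_cols`, `d_model`) are illustrative assumptions.

```python
import torch
import torch.nn as nn


class TabularPositionalEmbedding(nn.Module):
    """Sketch: one learnable embedding table for the row (time-step) position
    and one for the column (field) index, added element-wise to token embeddings."""

    def __init__(self, max_rows: int, max_cols: int, d_model: int):
        super().__init__()
        self.row_embedding = nn.Embedding(max_rows, d_model)
        self.col_embedding = nn.Embedding(max_cols, d_model)

    def forward(self, token_embeddings: torch.Tensor) -> torch.Tensor:
        # token_embeddings: (batch, n_rows, n_cols, d_model)
        _, n_rows, n_cols, _ = token_embeddings.shape
        row_ids = torch.arange(n_rows, device=token_embeddings.device)
        col_ids = torch.arange(n_cols, device=token_embeddings.device)
        # Broadcast row embeddings across columns and column embeddings across rows.
        row_pe = self.row_embedding(row_ids)[None, :, None, :]  # (1, n_rows, 1, d_model)
        col_pe = self.col_embedding(col_ids)[None, None, :, :]  # (1, 1, n_cols, d_model)
        return token_embeddings + row_pe + col_pe
```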

These positional embeddings are initialized at the following locations: