Is the depth embedding used at every transformer block of TreeGen or only at the input?

zysszy / TreeGen

A Tree-Based Transformer Architecture for Code Generation. (AAAI'20)

MIT License

90 stars, 27 forks

Is the depth embedding used at every transformer block of TreeGen or only at the input? #17

Open brando90 opened 3 years ago

brando90 commented 3 years ago

Hi Zeyu,

I was wondering: do you use a depth embedding that is a function of the transformer block index b? Or, failing that, do you apply the same depth embedding at the input of every block, or only once at the very beginning?

Thanks in advance for your patience!

zysszy commented 3 years ago

Hello,

We apply the depth embedding at the beginning of every input block.

Zeyu
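
To make the two options in this thread concrete, here is a minimal sketch in plain Python. It is not taken from the TreeGen code base: `add_depth`, `toy_block`, `encode`, and the `every_block` flag are hypothetical names, the block itself is a stand-in that only halves its input, and the depth table is a hand-written list rather than a learned embedding. The only thing it illustrates is where the depth embedding gets added: once before the first block, or again at the input of every block (the behaviour described in the reply above).

```python
def add_depth(hidden, depths, table):
    """Add each node's depth embedding to its hidden vector.

    hidden: list of vectors (one per tree node)
    depths: list of integer depths, one per node
    table:  table[d] is the embedding vector for depth d (assumed given)
    """
    return [
        [h + e for h, e in zip(vec, table[d])]
        for vec, d in zip(hidden, depths)
    ]


def toy_block(hidden):
    """Hypothetical stand-in for a transformer block: halves every value."""
    return [[0.5 * h for h in vec] for vec in hidden]


def encode(hidden, depths, table, num_blocks, every_block=True):
    """Run num_blocks toy blocks over the node vectors.

    every_block=True  -> depth embedding is re-added at the input of each block
    every_block=False -> depth embedding is added only once, before block 1
    """
    if not every_block:
        hidden = add_depth(hidden, depths, table)
    for _ in range(num_blocks):
        if every_block:
            hidden = add_depth(hidden, depths, table)
        hidden = toy_block(hidden)
    return hidden


if __name__ == "__main__":
    table = [[0.0], [1.0], [2.0]]   # illustrative 1-d embeddings for depths 0..2
    hidden = [[0.0], [0.0]]         # two nodes with zero initial features
    depths = [1, 2]                 # their depths in the tree
    print(encode(hidden, depths, table, num_blocks=2, every_block=True))
    print(encode(hidden, depths, table, num_blocks=2, every_block=False))
```

With depth re-injected at every block, the depth signal is refreshed before each layer instead of being attenuated by the layers that precede it, which is why the two settings give different outputs even in this toy setup.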