zysszy / TreeGen

A Tree-Based Transformer Architecture for Code Generation. (AAAI'20)
MIT License
90 stars 27 forks

What does W_c[c1; ...; cM] notation in the paper mean for character embeddings? #23

Open brando90 opened 2 years ago

brando90 commented 2 years ago

The paper does not explain what the concatenation of vectors with the semicolon operation means. Can this be clarified?

[Image: Screen Shot 2021-12-22 at 3 50 25 PM]

If we have `n_i = W_c[c1; ...; cM]`, does this mean that `W_c` is a `[1, M]` vector and `[c1; ...; cM]` is an `[M, D]` matrix?

zysszy commented 2 years ago

If we have `n_i = W_c[c1; ...; cM]`, this means that `W_c` is a `[D, M D]` matrix and `[c1; ...; cM]` is a `[M D, 1]` vector (implemented by `torch.cat` or `tf.concat`), where `D` is the hidden size.
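
To make the shapes concrete, here is a minimal sketch of that reading of the notation. It is not taken from the TreeGen code: the sizes `M = 3` and `D = 4`, the random values, and the use of plain Python lists in place of tensors are all assumptions for illustration. With real tensors, the concatenation step is what `torch.cat` or `tf.concat` would do.

```python
import random

# Hypothetical sizes: M characters per token, hidden size D.
M, D = 3, 4

random.seed(0)

# One D-dimensional embedding per character: c_1, ..., c_M (random stand-ins).
chars = [[random.random() for _ in range(D)] for _ in range(M)]

# [c_1; ...; c_M]: join the M vectors end to end into one vector of length M*D.
concat = [x for c in chars for x in c]

# W_c has shape [D, M*D] (random stand-in for the learned weight matrix).
W_c = [[random.random() for _ in range(M * D)] for _ in range(D)]

# n_i = W_c [c_1; ...; c_M] is a matrix-vector product, giving a vector of length D.
n_i = [sum(w * x for w, x in zip(row, concat)) for row in W_c]

print(len(concat), len(W_c), len(W_c[0]), len(n_i))  # 12 4 12 4
```

So the semicolon stacks the character embeddings into a single long vector, and `W_c` projects that vector back down to the hidden size `D`.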

brando90 commented 2 years ago

Thanks for the clarification. I think that is non-standard notation; perhaps you could explain what it means in the paper next time, e.g. in the appendix if the paper is getting too long?

Thanks!

zysszy commented 2 years ago

Thanks for your advice~