Question about the size [N, L, W, D]

JubSteven commented 11 months ago

Hi, thanks for your great work! I have a question about the tensor sizes in the pipeline. According to the paper, a sentence (or a phrase) like "美国人民" is first matched into words like "美国", "美国人", "国人", etc. The vector of each matched word is then attached to every character that the word covers.

But in that case, how do we make sure that the dimension W is fixed? For example, the character "美" has two matched words, "美国" and "美国人", while "民" has only one, "人民".

Thanks for your patience!

huskydoge commented 11 months ago

same issue

liuwei1206 commented 11 months ago

Hi,

I am sorry to say that I have forgotten some of the details. As far as I remember, W is a hyper-parameter. If a character has fewer than W matched words, the remaining slots are filled with padding.
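To make the padding idea concrete, here is a minimal sketch in plain Python. It is not taken from the LEBERT code: the names `matched_words`, `pad_to_width`, `PAD`, and `max_word_len` are hypothetical, the lexicon is just the four words from the example above, and truncating to the first W words when a character has more than W matches is an assumption.

```python
PAD = "<pad>"  # hypothetical placeholder for an empty word slot


def matched_words(sentence, lexicon, max_word_len=4):
    """For each character, collect the lexicon words that cover it."""
    per_char = [[] for _ in sentence]
    for start in range(len(sentence)):
        # Only consider multi-character spans of up to max_word_len characters.
        last_end = min(start + max_word_len, len(sentence))
        for end in range(start + 2, last_end + 1):
            word = sentence[start:end]
            if word in lexicon:
                for i in range(start, end):
                    per_char[i].append(word)
    return per_char


def pad_to_width(per_char, W):
    """Truncate or pad every character's word list to exactly W entries."""
    padded = []
    for words in per_char:
        kept = words[:W]  # assumption: keep the first W matches
        padded.append(kept + [PAD] * (W - len(kept)))
    return padded


lexicon = {"美国", "美国人", "国人", "人民"}
sentence = "美国人民"

per_char = matched_words(sentence, lexicon)
padded = pad_to_width(per_char, W=3)

# True where a slot holds a real word, False where it is padding.
mask = [[word != PAD for word in row] for row in padded]

for char, row in zip(sentence, padded):
    print(char, row)
```

With every character holding exactly W slots, looking up a D-dimensional vector per slot gives an [L, W, D] array for one sentence, and batching N sentences (with L padded in the same way) gives [N, L, W, D]. A mask like the one above is one common way to keep padded slots from contributing to later computation.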

Hope it helps.

JubSteven commented 11 months ago

Ok, thanks!

liuwei1206 / LEBERT

Question about the size [N, L, W, D] #64