yoonkim / CNN_sentence

CNNs for sentence classification

Why initialize W[0] with all 0s? #6

Open allenanie opened 9 years ago

allenanie commented 9 years ago

Hi, I'm having some trouble understanding the process_data.py file, especially why it reserves a special W[0] entry and starts idx_map at 1. What's the purpose of doing that?

csong27 commented 9 years ago

Because he needs to pad sentences with zero vectors. See the function get_idx_from_sent.
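To make the point above concrete, here is a minimal sketch of an embedding matrix whose row 0 is reserved for padding. It is not copied from process_data.py: the function name build_embedding_matrix and the toy vectors are assumptions made up for illustration.

```python
import numpy as np

def build_embedding_matrix(word_vecs, k=300):
    """Hypothetical helper: row 0 stays all zeros, real words start at index 1."""
    vocab_size = len(word_vecs)
    # One extra row so that index 0 can serve as the padding vector.
    W = np.zeros((vocab_size + 1, k), dtype="float32")
    word_idx_map = {}
    for i, (word, vec) in enumerate(word_vecs.items(), start=1):
        W[i] = vec
        word_idx_map[word] = i
    return W, word_idx_map

# Toy 4-dimensional vectors, invented for this example.
toy_vecs = {
    "good": np.ones(4, dtype="float32"),
    "movie": np.full(4, 2.0, dtype="float32"),
}
W, word_idx_map = build_embedding_matrix(toy_vecs, k=4)

# A padded sentence is a list of indices; every 0 looks up the zero vector W[0].
padded_indices = [0, 0, word_idx_map["good"], word_idx_map["movie"], 0]
embedded = W[padded_indices]  # shape (5, 4); rows for index 0 are all zeros
```

Because no real word is ever assigned index 0, a 0 in the index list can only mean padding.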

Imane0 commented 7 years ago

Why is the beginning of every sentence padded with the same number of zeros (filter_h=5 in the function get_idx_from_sent), and how is this number set? I guess it is related to the maximum region size (filter height) used? Another question: why does get_idx_from_sent keep adding 0 until a length of max_l + 2*pad is reached, where pad = filter_h - 1? Why not only until max_l + pad?

csong27 commented 7 years ago

Each sentence has a different number of words, so the inputs to the CNN would have different sizes. Padding is needed to ensure all inputs have the same size.
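As a rough illustration of the fixed-size padding discussed in this thread, the sketch below pads every sentence to max_l + 2*pad indices, with pad = filter_h - 1. It is written from the description in the comments above, not copied from the repository; the name pad_sentence_indices and the toy vocabulary are assumptions.

```python
def pad_sentence_indices(sent, word_idx_map, max_l, filter_h=5):
    """Hypothetical sketch: map words to indices and pad with 0 to a fixed length."""
    pad = filter_h - 1
    x = [0] * pad  # leading padding
    # Words missing from the vocabulary are skipped in this sketch.
    x.extend(word_idx_map[w] for w in sent.split() if w in word_idx_map)
    # Trailing padding up to the common length max_l + 2*pad.
    x.extend([0] * (max_l + 2 * pad - len(x)))
    return x

toy_map = {"a": 1, "b": 2, "c": 3}  # invented vocabulary
short = pad_sentence_indices("a b", toy_map, max_l=3, filter_h=3)
full = pad_sentence_indices("a b c", toy_map, max_l=3, filter_h=3)
```

Both results have the same length (3 + 2*2 = 7), so sentences of different lengths can be stacked into one input matrix.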

Imane0 commented 7 years ago

I got that. I'm asking why the length of all sentences is extended to max_l + 2*pad and not just to max_l.
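One way to explore this question is to count how many convolution windows of height filter_h cover each real word. The sketch below is only an arithmetic illustration under the assumption that filters slide over the padded sequence without any further padding; the thread does not confirm that this is the author's reason, and the function name windows_covering_each_word is made up.

```python
def windows_covering_each_word(n_words, total_len, filter_h):
    """Count, for each real word, the windows of height filter_h that contain it.

    Assumes the words start right after filter_h - 1 leading zeros.
    """
    lead = filter_h - 1
    n_windows = total_len - filter_h + 1  # valid window start positions
    counts = []
    for pos in range(lead, lead + n_words):
        # A window starting at s covers positions s .. s + filter_h - 1.
        counts.append(sum(1 for s in range(n_windows) if s <= pos < s + filter_h))
    return counts

max_l, filter_h = 4, 3
pad = filter_h - 1

# Longest sentence (max_l words), padded on both sides: length max_l + 2*pad.
both_sides = windows_covering_each_word(max_l, max_l + 2 * pad, filter_h)
# Same sentence with leading padding only: length max_l + pad.
leading_only = windows_covering_each_word(max_l, max_l + pad, filter_h)
```

With padding on both sides, every word of the longest sentence falls in filter_h windows ([3, 3, 3, 3] here). With leading padding only, the last words are covered by fewer windows ([3, 3, 2, 1]).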