Open allenanie opened 9 years ago
Because he needs to pad each sentence with zero vectors. See the function get_idx_from_sent.
Why is the beginning of every sentence padded with the same number of zeros (filter_h=5 in get_idx_from_sent)? And how is this number set? I guess it is related to the maximum region size / filter height used? Another question: why does get_idx_from_sent keep appending 0 until a length of max_l + 2*pad is reached, where pad = filter_h - 1? Why not only until max_l + pad?
Each sentence has a different number of words, so the inputs to the CNN would have different sizes. Padding is needed to ensure all inputs have the same size.
I got that. I'm asking why the length of every sentence is extended to max_l + 2*pad and not just to max_l.
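To make the lengths concrete, here is a minimal sketch of the padding as it is described in this thread (pad = filter_h - 1 zeros in front, then zeros appended until the length is max_l + 2*pad). It is a reconstruction from the description above, not a copy of the repo's code, and the names pad_sentence_indices and windows_containing are made up for illustration. One plausible reading of the 2*pad: with only max_l + pad, the longest sentence would get no trailing zeros, so its last word would fall inside fewer filter windows than its first word.

```python
def pad_sentence_indices(sent, word_idx_map, max_l, filter_h=5):
    """Turn a sentence into word indices, zero-padded on both sides.

    Hypothetical helper that follows the behaviour described in the thread.
    """
    pad = filter_h - 1
    x = [0] * pad                        # leading zeros
    for word in sent.split():
        if word in word_idx_map:         # unknown words are skipped
            x.append(word_idx_map[word])
    while len(x) < max_l + 2 * pad:      # trailing zeros up to a fixed length
        x.append(0)
    return x


def windows_containing(position, length, filter_h):
    """Count the windows of height filter_h, over a sequence of the given
    length, that cover the given position."""
    return sum(
        1
        for start in range(length - filter_h + 1)
        if start <= position < start + filter_h
    )


# Toy vocabulary; index 0 is reserved for padding.
toy_map = {"a": 1, "b": 2, "c": 3}
padded = pad_sentence_indices("a b c", toy_map, max_l=3, filter_h=3)
print(padded)                            # [0, 0, 1, 2, 3, 0, 0]

# Last word "c" sits at position 4.
print(windows_containing(4, len(padded), 3))  # padded to max_l + 2*pad -> 3
print(windows_containing(4, 5, 3))            # padded to max_l + pad   -> 1
```

With max_l + 2*pad the first and last words of the longest sentence are each covered by filter_h windows, and every shorter sentence still ends up with the same total length.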
Hi, I'm having some trouble understanding the process_data.py file, especially why it saves a special W[0] word and initializes idx_map starting at 1. What's the purpose of doing that?
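For anyone else reading, a small sketch of how a reserved row 0 and an index map starting at 1 fit together with zero padding. This is an illustration under assumptions, not the actual process_data.py code: build_embedding_table is a made-up name, and plain Python lists stand in for the NumPy matrix so the snippet runs on its own.

```python
def build_embedding_table(word_vecs, k):
    """Build an embedding table whose row 0 is an all-zero padding vector.

    Hypothetical helper: real words get indices 1, 2, ... so that index 0
    never collides with a vocabulary word.
    """
    W = [[0.0] * k]                      # row 0: zero vector for padding
    word_idx_map = {}
    for i, (word, vec) in enumerate(word_vecs.items(), start=1):
        W.append(list(vec))
        word_idx_map[word] = i
    return W, word_idx_map


# Toy 2-dimensional vectors, invented for this example.
word_vecs = {"good": [0.1, 0.2], "movie": [0.3, 0.4]}
W, word_idx_map = build_embedding_table(word_vecs, k=2)

# A padded sentence is just a list of indices; 0 means "padding".
padded = [0, 0, word_idx_map["good"], word_idx_map["movie"], 0, 0]
embedded = [W[i] for i in padded]
print(word_idx_map)                      # {'good': 1, 'movie': 2}
print(embedded[0], embedded[2])          # [0.0, 0.0] [0.1, 0.2]
```

Because the padding index 0 looks up a zero vector, the padded positions contribute nothing but zeros to the input, which matches the "pad with zero vectors" answer in this thread.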