Dear yongye:
Thank you for providing such excellent code.
I encountered a problem when I ran the word2id.py file. The function get_id() uses index 1 for unknown words. However, index 0 is used for unknown words when building the sr_word2id variable. Is something wrong?
@Irvinglove Thanks for your question and the reminder. I made some mistakes in the previous code, though they did not affect normal running. I have now corrected them, so please check it out.
In this code, I use two special symbols, '\<PAD>' and '\<UNK>'. The index for '\<PAD>' is 0, and the index for '\<UNK>' is 1. You can find these in embed2ndarray.py.
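To illustrate the indexing described above, here is a minimal sketch, not the actual code from word2id.py or embed2ndarray.py. It assumes a plain dict in place of sr_word2id, and build_word2id is a hypothetical helper; only the convention ('\<PAD>' is 0, '\<UNK>' is 1, unknown words map to 1) comes from this thread.

```python
PAD, UNK = "<PAD>", "<UNK>"

def build_word2id(words):
    """Hypothetical helper: reserve 0 for <PAD> and 1 for <UNK>, then number the rest."""
    word2id = {PAD: 0, UNK: 1}
    for word in words:
        if word not in word2id:
            word2id[word] = len(word2id)
    return word2id

def get_id(word, word2id):
    """Return the id of a word, falling back to the <UNK> id (1) if it is not in the vocabulary."""
    return word2id.get(word, word2id[UNK])

word2id = build_word2id(["the", "cat", "sat"])
ids = [get_id(w, word2id) for w in ["the", "dog"]]  # "dog" is not in the vocabulary
```

With this convention, the vocabulary builder and get_id() agree on the same '\<UNK>' index, which is the mismatch the question points out.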
For each special word (the same applies to chars), I add an embedding vector (1 x 256) to the front of W_embedding. This code is in the embed2ndarray.py file.
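A rough sketch of what prepending the special vectors could look like, using plain Python lists instead of an ndarray so it runs without extra packages. The function name prepend_special_vectors and the random initialisation are assumptions for illustration; the thread does not say how the vectors in embed2ndarray.py are initialised.

```python
import random

EMBED_DIM = 256  # vector size mentioned above (1 x 256)

def prepend_special_vectors(embedding_rows, n_special=2, dim=EMBED_DIM, seed=0):
    """Hypothetical helper: put one new vector per special symbol in front of the matrix.

    Random small values are an assumed initialisation, not the author's confirmed choice.
    """
    rng = random.Random(seed)
    special = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_special)]
    return special + [list(row) for row in embedding_rows]

# Three stand-in "pretrained" word vectors.
pretrained = [[0.5] * EMBED_DIM for _ in range(3)]
w_embedding = prepend_special_vectors(pretrained)
# Rows 0 and 1 now belong to <PAD> and <UNK>; original row i sits at row i + 2.
```

Because two rows are added at the front, the ids of ordinary words have to start at 2 so that each id still points at the right row.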