nicolas-ivanov / debug_seq2seq

[unmaintained] Make seq2seq for keras work

Process with end-of-sentence symbol #19

Open tungpv85 opened 7 years ago

tungpv85 commented 7 years ago

Hi guys. I'm running Nicolas's code and I have some questions about the end-of-sentence symbol, i.e. "$$$". As I understand it, Nicolas puts "$$$" at the end of each sentence. So my questions are:

1) If I understand correctly, Nicolas's code seems to treat "$$$" like a normal word. That means the word2vec model will generate a vector for "$$$" just as it does for other words. Is the symbol important, i.e. can I remove it from each sentence without affecting performance?

2) If the symbol is important, how can I add it to speech data, where each "sentence" is a sequence of feature vectors? Do I just define a fixed, arbitrary vector as "$$$" and append it to the end of each "sentence"?

Thank you in advance
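
For illustration only, the two cases in the questions above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the repository's actual preprocessing: the helper names `append_eos_token`, `make_eos_frame`, and `append_eos_frame` are hypothetical, and filling the fixed end-of-sequence frame with the constant 1.0 is an arbitrary choice made for the example.

```python
from typing import List, Sequence

EOS_TOKEN = "$$$"  # end-of-sentence marker quoted in the question


def append_eos_token(tokens: Sequence[str], eos: str = EOS_TOKEN) -> List[str]:
    """Text case: return a copy of the sentence with the marker as its last token."""
    return list(tokens) + [eos]


def make_eos_frame(dim: int, value: float = 1.0) -> List[float]:
    """Hypothetical helper: build one fixed frame, reused for every utterance.

    The constant fill value is an assumption made for this sketch.
    """
    return [value] * dim


def append_eos_frame(
    frames: Sequence[Sequence[float]], eos_frame: Sequence[float]
) -> List[List[float]]:
    """Speech case: append the fixed frame to a sequence of feature vectors."""
    for frame in frames:
        if len(frame) != len(eos_frame):
            raise ValueError("every frame must have the same dimension as eos_frame")
    # Copy the input frames so the caller's data is left unchanged.
    return [list(frame) for frame in frames] + [list(eos_frame)]


if __name__ == "__main__":
    print(append_eos_token(["how", "are", "you"]))
    utterance = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]  # two frames, three features each
    print(append_eos_frame(utterance, make_eos_frame(3)))
```

In the text case the marker stays an ordinary token, so any embedding model trained on the marked sentences would assign it a vector like any other word. In the speech case the same fixed frame is appended to every utterance, which is the approach question 2 describes.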