Open meanmee opened 8 years ago
Other approaches like [Show and Tell] use a We matrix for word embedding which optimize the We , But in neuraltalk I found that it direcly optimize the Ws in which each raw represent a word. So Why do this way? or which way performs better?
Other approaches like [Show and Tell] use a We matrix for word embedding which optimize the We , But in neuraltalk I found that it direcly optimize the Ws in which each raw represent a word. So Why do this way? or which way performs better?