ryankiros / visual-semantic-embedding

Implementation of the image-sentence embedding method described in "Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models"

Bow model encodes sentence with an extra word #4

Closed xthan closed 6 years ago

xthan commented 8 years ago

Thanks for providing such nice and interesting code.

I noticed that when using the bag-of-words encoder, the code still adds an 'end of sentence' word to the matrix x in the encode_sentences function:

        x = numpy.zeros((k+1, len(caption))).astype('int64')
        x_mask = numpy.zeros((k+1, len(caption))).astype('float32')
        for idx, s in enumerate(seqs):
            x[:k,idx] = s
            x_mask[:k+1,idx] = 1.

While in function build_sentence_encoder:

# Word embedding
emb = tparams['Wemb'][x.flatten()].reshape([n_timesteps, n_samples, options['dim_word']])

# Encode sentences
if options['encoder'] == 'bow':
    sents = (emb * mask[:,:,None]).sum(0)

will encode each sentence with an extra word, corresponding to the first row of tparams['Wemb'], which I think should be ignored.
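To make the effect concrete, here is a minimal numpy sketch of the two snippets above (toy vocabulary size and embedding dimension are hypothetical, not the real model's): because x_mask is set to 1 over all k+1 rows while the last row of x is left at 0, the bag-of-words sum picks up Wemb[0] once per sentence.

```python
import numpy as np

# Toy embedding table: 5 words, dimension 3 (hypothetical sizes).
Wemb = np.arange(15, dtype='float32').reshape(5, 3)

s = [2, 3]            # one caption of k = 2 word indices
k = len(s)

# As in encode_sentences: x gets k+1 rows, the extra row left at 0
# (the end-of-sentence slot), but the mask covers all k+1 rows.
x = np.zeros((k + 1, 1), dtype='int64')
x[:k, 0] = s
x_mask = np.zeros((k + 1, 1), dtype='float32')
x_mask[:k + 1, 0] = 1.

# As in the 'bow' branch of build_sentence_encoder:
emb = Wemb[x.flatten()].reshape(k + 1, 1, 3)
sents = (emb * x_mask[:, :, None]).sum(0)

# The sum includes Wemb[0] once because of the extra masked-in row:
assert np.allclose(sents[0], Wemb[2] + Wemb[3] + Wemb[0])
```

Zeroing the last mask row (x_mask[:k, idx] = 1.) would drop that extra Wemb[0] term from the sum.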

Also, in the encode_sentences function:

        seqs = []
        for i, cc in enumerate(caption):
            seqs.append([model['worddict'][w] if d[w] > 0 and model['worddict'][w] < model['options']['n_words'] else 1 for w in cc])

I think the line inside the for loop should be changed to:

seqs.append([model['worddict'][w] - 2 if d[w] > 0 and model['worddict'][w] < model['options']['n_words']

since the two tokens '' and 'UNK' are not in params['Wemb'].

However, these issues did not affect the final performance much.

xthan commented 8 years ago

Maybe two rows should be added to params['Wemb'], corresponding to '' and 'UNK', by setting n_words = len(worddict) + 2?
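A quick sketch of what that would look like (the vocabulary and sizes here are hypothetical): if real words in worddict start at index 2, with 0 and 1 reserved for the end-of-sentence token and 'UNK', then giving Wemb len(worddict) + 2 rows means every index has an embedding row and no -2 shift is needed.

```python
import numpy as np

# Hypothetical vocabulary: real words start at index 2,
# indices 0 and 1 are reserved for the specials.
worddict = {'cat': 2, 'dog': 3}
dim_word = 3
n_words = len(worddict) + 2   # two extra rows for the special tokens

Wemb = 0.01 * np.random.randn(n_words, dim_word).astype('float32')

# Captions can then be indexed straight through worddict, with
# out-of-vocabulary words mapped to the UNK row (index 1):
seq = [worddict.get(w, 1) for w in ['cat', 'fish']]
assert seq == [2, 1]
assert Wemb.shape == (4, 3)
```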