nyu-mll / PRPN-Analysis

This repo contains the analysis results reported in the paper "Grammar Induction with Neural Language Models: An Unusual Replication"
MIT License

Almost constant gate values. #1

Open mojesty opened 6 years ago

mojesty commented 6 years ago

Hello. I am using Python 2.7 and PyTorch 0.3.1/0.3.0. When I run the following code:

import numpy
import torch
from torch.autograd import Variable  # PyTorch 0.3.x API

# `corpus` and `model` are the vocabulary and pretrained model loaded beforehand.
sens = 'i like orange apples .'
words = sens.strip().split()
# Map each word to its index in the training vocabulary.
x = numpy.array([corpus.dictionary[w] for w in words])
input = Variable(torch.LongTensor(x[:, None]))  # shape: (seq_len, batch=1)

hidden = model.init_hidden(1)
_, hidden = model(input, hidden)

# Gate activations (syntactic distances) recorded during the forward pass.
gates = model.gates.squeeze().data.numpy()

I receive the following gate activations: array([0.07, 0.07, 0.07, 0.07, 0.06], dtype=float32), so the parse tree is built incorrectly. The inputs look like this: array([3277, 262, 35, 3339, 11]). It behaves the same on different sentences and different versions of PyTorch (I tried 0.3.0 and 0.3.1; on 0.4.1 the code fails because of changes to BatchNorm1d), and almost all of the gates have the value 0.07, so the parse trees look incorrect. Please explain to me what I am doing wrong.
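For reference, this is how I am turning the gate values into a tree: a recursive split at the position with the largest gate (syntactic distance). build_tree below is my own helper sketch, so it may differ from your evaluation code, but it shows why near-constant gates give a degenerate bracketing:

import numpy

def build_tree(depth, sen):
    """Recursively split the sentence at the word with the largest
    gate value; the left and right spans become subtrees."""
    assert len(depth) == len(sen)
    if len(sen) == 1:
        return sen[0]
    idx_max = int(numpy.argmax(depth))
    parse_tree = []
    if len(sen[:idx_max]) > 0:
        parse_tree.append(build_tree(depth[:idx_max], sen[:idx_max]))
    right = sen[idx_max]
    if len(sen[idx_max + 1:]) > 0:
        right = [right, build_tree(depth[idx_max + 1:], sen[idx_max + 1:])]
    return right if parse_tree == [] else parse_tree + [right]

# With near-constant gates the argmax is essentially arbitrary, so the
# tree collapses into a purely right-branching structure:
print(build_tree([0.07, 0.07, 0.07, 0.07, 0.06],
                 ['i', 'like', 'orange', 'apples', '.']))
# ['i', ['like', ['orange', ['apples', '.']]]]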

phu-pmh commented 6 years ago

Are you training the model on your own? Which version of the model is this? In any case, the following is my result using our best model:

Input array: array([ 8336,  6795, 18385,  3176,   373])
Output gates: array([ 0.27,  0.13,  0.02,  0.13,  0.17], dtype=float32)
Tree: ( i ( ( ( like orange ) apples ) . ) )

It seems like we are using different dictionaries for the input tokens. I did not make any modifications to the code of Shen et al. (2018), so I'm afraid I have little idea of what could go wrong once the original code has been modified.
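One quick check is to print the indices your Corpus assigns to the same words and compare them against the IDs above. A rough sketch, assuming the Dictionary in data.py exposes word2idx/idx2word (adapt to your copy if it does not):

import data

corpus = data.Corpus(args.data)  # same call as in main.py
for w in 'i like orange apples .'.split():
    idx = corpus.dictionary.word2idx.get(w)
    print(w, idx, corpus.dictionary.idx2word[idx] if idx is not None else '<unk>')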

mojesty commented 6 years ago

I used your pretrained model and simply ran the code that builds the vocabulary: corpus = data.Corpus(args.data). I did not make any modifications to the original code; I just ran it. Which version of PyTorch do you have?
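For reference, here is how I check my environment and load your checkpoint on 0.3.x (the file name model.pt is just a guess at what the released checkpoint is called; map_location is a precaution in case it was saved on a GPU):

import torch
print(torch.__version__)  # e.g. '0.3.1'

with open('model.pt', 'rb') as f:
    model = torch.load(f, map_location=lambda storage, loc: storage)
model.eval()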