sherjilozair / char-rnn-tensorflow

Multi-layer Recurrent Neural Networks (LSTM, RNN) for character-level language models in Python using Tensorflow
MIT License
2.64k stars, 960 forks

Char sequence probability #42

Open nournia opened 8 years ago

nournia commented 8 years ago

Hi, I'm using char-rnn to compute sentence probability, which is the main function of a language model. This piece of code feeds the sentence's characters in one by one and records the probability the model assigns to the correct next character:

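# Start from the zero state for a batch of one.
state = self.cell.zero_state(1, tf.float32).eval(session=session)
char_probas = []
input = np.zeros((1, 1))
for c, char in enumerate(sentence[:-1]):
    # Feed one character and carry the RNN state over to the next step.
    input[0, 0] = vocab[char]
    feed = {self.input_data: input, self.initial_state: state}
    [probs, state] = session.run([self.probs, self.final_state], feed)
    # Probability the model assigned to the character that actually follows.
    char_probas.append(probs[0][vocab[sentence[c+1]]])
# Score the sentence by the mean per-character probability.
probability = np.mean(char_probas)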

It works fine and prefers well-written sentences to malformed ones, but I don't think it is optimized for performance. Is it possible to feed in a whole sequence of chars at once and get back the probability of each one given the preceding chars? Currently, data transfer between host and device seems to be a major bottleneck.
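One possible way to avoid the per-character round trip is to build the model with a sequence length of len(sentence) - 1, so that a single session.run returns one distribution per input position. Whether self.probs comes back with exactly one row per position is an assumption here, not something confirmed in this thread. The sketch below shows only the indexing step that would follow, on a hand-made table; the helper name next_char_probs and the toy values are hypothetical.

```python
def next_char_probs(probs, char_ids):
    """Pick out the probability assigned to each actual next character.

    probs[t] is assumed to be the distribution over the vocabulary that the
    model produced after reading char_ids[t] (one row per input position).
    """
    targets = char_ids[1:]  # the character that follows each input position
    if len(probs) != len(targets):
        raise ValueError("expected one distribution per input character")
    return [row[target] for row, target in zip(probs, targets)]


# Toy example: a three-symbol vocabulary and the sentence "abc".
vocab = {"a": 0, "b": 1, "c": 2}
char_ids = [vocab[ch] for ch in "abc"]
probs = [
    [0.1, 0.7, 0.2],  # distribution after reading "a"
    [0.3, 0.3, 0.4],  # distribution after reading "ab"
]
print(next_char_probs(probs, char_ids))  # [0.7, 0.4]
```

With a NumPy array of shape (T, vocab_size), the same selection can be written as probs[np.arange(T), targets].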

pranjaldaga commented 8 years ago

@nournia Does 'log_probability' here give you the sentence probability?

nournia commented 8 years ago

I've updated that piece of code and removed the log from the probability computation.

pranjaldaga commented 8 years ago

It returns 'nan' for any sentence when I try this piece of code. Do you get floating-point probabilities with this, @nournia?

nournia commented 8 years ago

It works for me. Are you getting proper results from the sample.py script?

spectrometerHBH commented 6 years ago

I got the error 'tuple' object has no attribute 'eval' when running your code. The line that goes wrong is 'state = self.cell.zero_state(1, tf.float32).eval(session=session)'. Can you help me fix this?
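This error is consistent with zero_state returning a tuple of per-layer states rather than a single tensor, which is what happens when the cell keeps its state as tuples. A plain tuple has no eval method. In TensorFlow 1.x, session.run accepts nested tuples as fetches, so session.run(self.cell.zero_state(1, tf.float32)) is probably the simplest replacement. The sketch below only illustrates the cause without TensorFlow: FakeTensor, LSTMState, and eval_nested are hypothetical stand-ins, not library types.

```python
from collections import namedtuple

# Hypothetical stand-in for a per-layer LSTM state pair (cell state, hidden state).
LSTMState = namedtuple("LSTMState", ["c", "h"])


class FakeTensor:
    """Hypothetical stand-in for a tensor: something that has .eval()."""

    def __init__(self, value):
        self.value = value

    def eval(self):
        return self.value


def eval_nested(structure):
    """Evaluate every leaf of a nested tuple while keeping its shape."""
    if isinstance(structure, tuple):
        values = [eval_nested(item) for item in structure]
        if hasattr(structure, "_fields"):  # rebuild namedtuples as namedtuples
            return type(structure)(*values)
        return tuple(values)
    return structure.eval()


# A two-layer state: the outer tuple itself cannot be evaluated directly.
state = (
    LSTMState(FakeTensor(0.0), FakeTensor(0.0)),
    LSTMState(FakeTensor(0.0), FakeTensor(0.0)),
)
print(hasattr(state, "eval"))  # False
print(eval_nested(state))
```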

abhirut commented 5 years ago

Why is probability the mean of the character probabilities? Shouldn't it be the product?
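By the chain rule, the joint probability of a sequence is the product of the per-character conditional probabilities, so the mean used above is a different, length-insensitive score rather than a probability of the sentence. The product shrinks with length and underflows quickly, which is why it is usually accumulated as a sum of logs. The geometric mean (or its reciprocal, perplexity) is the usual length-normalized form. A minimal sketch comparing the four quantities, assuming every probability is strictly positive; the function name sequence_scores is hypothetical:

```python
import math


def sequence_scores(char_probas):
    """Compare ways of combining per-character probabilities.

    Assumes every probability is > 0 (math.log raises ValueError on 0).
    """
    n = len(char_probas)
    total_log_prob = sum(math.log(p) for p in char_probas)
    return {
        # What the snippet in this thread computes.
        "arithmetic_mean": sum(char_probas) / n,
        # Chain-rule probability of the whole sequence.
        "joint_probability": math.exp(total_log_prob),
        # Length-normalized version of the joint probability.
        "geometric_mean": math.exp(total_log_prob / n),
        # Reciprocal of the geometric mean; lower is better.
        "perplexity": math.exp(-total_log_prob / n),
    }


for name, value in sequence_scores([0.5, 0.5, 0.5, 0.5]).items():
    print(name, round(value, 4))
```

For sentences of very different lengths, comparing joint probabilities favors the shorter one, so a normalized score is often the more useful ranking signal.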