karpathy / char-rnn

Multi-layer Recurrent Neural Networks (LSTM, GRU, RNN) for character-level language models in Torch

Question: Single line seq_length? #165

Open drohack opened 8 years ago

drohack commented 8 years ago

I'm attempting to run this code over a JSON-formatted dataset, where each line is a new JSON object. Would it be possible to run the code so that seq_length only looks at the current line?

Each JSON object has roughly the same format, and training has done a good job of learning that format, but I don't need a given object's data to depend on the previous object.

For example:

{"name":"red","type":"color","text":"The color red is hot to the touch."}
{"name":"pipe","type":"object","text":"A long tube made out of metal."}

If I understand seq_length correctly, setting it to 150 would make backpropagation look at both of these lines. In my actual data each object is a line between 100 and 500 characters long, and my GPU can handle a seq_length of up to 500 depending on batch size. But I don't need the "pipe" object to depend on the "red" object.

I'm not too familiar with how a given batch (or mini-batch?) is created, and whether this would affect anything, or whether making the data line-delimited would screw up the training on the JSON object structure.
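For context on why a window can span two objects: char-rnn treats the corpus as one contiguous character stream and cuts it into fixed-size seq_length chunks, ignoring line boundaries. A minimal Python sketch of that chunking (not the actual Lua loader, which lives in util/CharSplitLMMinibatchLoader.lua, but the same idea):

```python
def make_windows(text, seq_length):
    """Cut a contiguous character stream into consecutive seq_length chunks.

    Window boundaries ignore newlines, so a single window can straddle
    two JSON objects that sit on adjacent lines.
    """
    return [text[i:i + seq_length]
            for i in range(0, len(text) - seq_length + 1, seq_length)]

# Two toy one-line JSON objects, shorter than the real data:
data = '{"name":"red"}\n{"name":"pipe"}\n'
windows = make_windows(data, 10)

# The middle window contains the newline, mixing the "red" and "pipe" objects:
assert any("\n" in w for w in windows)
```

So with seq_length=150 and two ~75-character lines, gradients for the second object really do flow back through characters of the first.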

HaraldKorneliussen commented 8 years ago

char-rnn has no such feature at present, but for the same use case I've simply shuffled the lines in the file (with the command-line tool shuf). Since there is then no correlation across lines, the network should learn to effectively drop its accumulated state when it encounters a newline.
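Concretely, that preprocessing step looks like this (file names here are just examples):

```shell
# Toy stand-in for a one-JSON-object-per-line training file:
printf '{"name":"red"}\n{"name":"pipe"}\n{"name":"cup"}\n' > input.txt

# shuf randomly permutes the lines; the output contains the same lines
# in a new order, so adjacent lines carry no information about each other.
shuf input.txt > input_shuffled.txt

wc -l < input_shuffled.txt
```

Point char-rnn at the shuffled file instead of the original and retrain.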

AlekzNet commented 8 years ago

BTW, what would be the correct solution? Resetting/clearing the model state on each newline?
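If one were to build that into the training loop, the shape of it would be: step through characters as usual, but re-initialize the hidden state whenever a newline is consumed. A hedged Python sketch, where rnn_step is a dummy stand-in for the real LSTM/GRU forward step (here it just counts characters since the last reset, so the reset is visible):

```python
def rnn_step(state, ch):
    # Stand-in for the real recurrent forward step: the actual model
    # would update LSTM/GRU cell and hidden tensors here.
    return state + 1

def run_with_resets(text, init_state=0):
    """Feed characters one at a time, clearing state after each newline.

    This confines dependencies to a single line, so one JSON object's
    characters never condition on the previous object.
    """
    state = init_state
    states = []
    for ch in text:
        state = rnn_step(state, ch)
        states.append(state)
        if ch == "\n":
            state = init_state  # drop accumulated state at object boundary
    return states

# The counter climbs within a line and restarts after the newline:
print(run_with_resets("ab\ncd"))  # [1, 2, 3, 1, 2]
```

In char-rnn itself this would mean zeroing the init_state tensors mid-batch when a newline is hit, rather than only between batches; backpropagation would likewise have to stop at those boundaries.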