deepgram / kur

Descriptive Deep Learning
Apache License 2.0

Problem with Sentence generation #61

Closed: nakulcr7 closed this 7 years ago

nakulcr7 commented 7 years ago

Hi,

I'm new to Kur and I'm trying to use it for Sentence Generation. Here's my flow:

1. Input data

# Imports
from __future__ import print_function  # keeps the script runnable under both Python 2 and 3
from keras.utils import np_utils
import numpy
import pickle

# Load data
filename = "shakespeare.txt"
raw_text = open(filename).read()
raw_text = raw_text.lower()

# create mapping of unique chars to integers
chars = sorted(list(set(raw_text)))
char_to_int = dict((c, i) for i, c in enumerate(chars))

# summarize the loaded data
n_chars = len(raw_text)
n_vocab = len(chars)
print "Total Characters: ", n_chars
print "Total Vocab: ", n_vocab

# prepare the dataset of input to output pairs encoded as integers
seq_length = 100
dataX = []
dataY = []
for i in range(0, n_chars - seq_length, 1):
    seq_in = raw_text[i:i + seq_length]
    seq_out = raw_text[i + seq_length]
    dataX.append([char_to_int[char] for char in seq_in])
    dataY.append(char_to_int[seq_out])
n_patterns = len(dataX)
print "Total Patterns: ", n_patterns

# reshape X to be [samples, time steps, features]
X = numpy.reshape(dataX, (n_patterns, seq_length, 1))
# normalize
X = X / float(n_vocab)
# one hot encode the output variable
y = np_utils.to_categorical(dataY)

# Write data to file
output_file = 'shakespeare_data'
with open(output_file, 'wb') as fh:
    pickle.dump({'in': X, 'out': y}, fh)

# Sanity-check the pickled data
with open('shakespeare_data', 'rb') as fh:
    data = pickle.load(fh)
print(data.keys())

# in
print('IN')
print(data['in'][:1], type(data['in']))

# out
print('OUT')
print(data['out'][:1], type(data['out']))

print('Input shape: {} {}'.format(X.shape[1], X.shape[2]))
print('Output shape: {}'.format(y.shape[1]))

2. Kurfile

---
model:
  - input:
      shape: [100, 1]
    name: in
  - recurrent:
      size: 256
      type: lstm
  - dropout: 0.2
  - dense: 65
  - activation: softmax
    name: out

train:
  data: 
    - pickle: shakespeare_data
  epochs: 10
  log: shakespeare-log
  optimizer: adam

loss:
  - target: out
    name: categorical_crossentropy
...
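
To double-check that the pickled arrays and the Kurfile agree before training, I can load the file back and compare shapes (a quick sketch of the check):

import pickle

with open('shakespeare_data', 'rb') as fh:
    data = pickle.load(fh)

# The Kurfile declares shape [100, 1] for "in"; "out" holds one one-hot
# vector per 100-character window.
print(data['in'].shape)   # expected: (n_patterns, 100, 1)
print(data['out'].shape)  # expected: (n_patterns, n_vocab)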

3. Train with optimizer

kur -vv train shakespeare.yml
[INFO 2017-04-09 19:35:44,036 kur.kurfile:754] Parsing source: shakespeare.yml, included by top-level.
[INFO 2017-04-09 19:35:44,040 kur.kurfile:87] Parsing Kurfile...
[DEBUG 2017-04-09 19:35:44,041 kur.kurfile:905] Parsing Kurfile section: settings
[DEBUG 2017-04-09 19:35:44,041 kur.kurfile:905] Parsing Kurfile section: train
[DEBUG 2017-04-09 19:35:44,046 kur.kurfile:905] Parsing Kurfile section: validate
[DEBUG 2017-04-09 19:35:44,046 kur.kurfile:905] Parsing Kurfile section: test
[DEBUG 2017-04-09 19:35:44,046 kur.kurfile:905] Parsing Kurfile section: evaluate
[DEBUG 2017-04-09 19:35:44,048 kur.kurfile:905] Parsing Kurfile section: loss
[INFO 2017-04-09 19:35:44,050 kur.loggers.binary_logger:71] Loading log data: shakespeare-log
[DEBUG 2017-04-09 19:35:44,050 kur.loggers.binary_logger:78] Loading old-style binary logger.
[DEBUG 2017-04-09 19:35:44,050 kur.loggers.binary_logger:187] Loading binary column: training_loss_total
[DEBUG 2017-04-09 19:35:44,050 kur.loggers.binary_logger:195] No such log column exists: shakespeare-log/training_loss_total
[DEBUG 2017-04-09 19:35:44,050 kur.loggers.binary_logger:187] Loading binary column: training_loss_batch
[DEBUG 2017-04-09 19:35:44,050 kur.loggers.binary_logger:195] No such log column exists: shakespeare-log/training_loss_batch
[DEBUG 2017-04-09 19:35:44,050 kur.loggers.binary_logger:187] Loading binary column: training_loss_time
[DEBUG 2017-04-09 19:35:44,051 kur.loggers.binary_logger:195] No such log column exists: shakespeare-log/training_loss_time
[DEBUG 2017-04-09 19:35:44,051 kur.loggers.binary_logger:187] Loading binary column: validation_loss_total
[DEBUG 2017-04-09 19:35:44,051 kur.loggers.binary_logger:195] No such log column exists: shakespeare-log/validation_loss_total
[DEBUG 2017-04-09 19:35:44,051 kur.loggers.binary_logger:187] Loading binary column: validation_loss_batch
[DEBUG 2017-04-09 19:35:44,051 kur.loggers.binary_logger:195] No such log column exists: shakespeare-log/validation_loss_batch
[DEBUG 2017-04-09 19:35:44,051 kur.loggers.binary_logger:187] Loading binary column: validation_loss_time
[DEBUG 2017-04-09 19:35:44,051 kur.loggers.binary_logger:195] No such log column exists: shakespeare-log/validation_loss_time
[WARNING 2017-04-09 19:35:44,686 kur.supplier.pickle_supplier:67] We needed to explicitly set a "latin1" encoding to properly load the pickled data. This is probably because the pickled data was created in Python 2. You really should switch over to Python 3 in order to ensure future compatibility.
[DEBUG 2017-04-09 19:35:44,706 kur.providers.batch_provider:57] Batch size set to: 32
[DEBUG 2017-04-09 19:35:44,707 kur.backend.backend:270] Using backend: keras
[DEBUG 2017-04-09 19:35:44,707 kur.backend.backend:69] No execution device indicated to backend. Checking available devices...
[DEBUG 2017-04-09 19:35:44,707 kur.utils.cuda:161] Loading NVIDIA ML library.
[DEBUG 2017-04-09 19:35:44,708 kur.backend.backend:75] Failed to initialize CUDA. Falling back to CPU.
[INFO 2017-04-09 19:35:44,708 kur.backend.backend:154] Creating backend: keras
[INFO 2017-04-09 19:35:44,708 kur.backend.backend:157] Backend variants: none
[INFO 2017-04-09 19:35:44,708 kur.backend.keras_backend:124] No particular backend for Keras has been requested.
[DEBUG 2017-04-09 19:35:44,709 kur.backend.keras_backend:126] Using the system-default Keras backend.
[INFO 2017-04-09 19:35:44,709 kur.backend.keras_backend:175] Requesting CPU
[DEBUG 2017-04-09 19:35:44,710 kur.backend.keras_backend:186] Overriding environmental variables: {'KERAS_BACKEND': None, 'THEANO_FLAGS': 'force_device=true,device=cpu', 'CUDA_VISIBLE_DEVICES': '100', 'TF_CPP_MIN_LOG_LEVEL': '1'}
[INFO 2017-04-09 19:35:46,222 kur.backend.keras_backend:192] Keras is loaded. The backend is: tensorflow
[INFO 2017-04-09 19:35:46,223 kur.model.model:261] Enumerating the model containers.
[INFO 2017-04-09 19:35:46,223 kur.model.model:266] Assembling the model dependency graph.
[DEBUG 2017-04-09 19:35:46,223 kur.model.model:273] Assembled Node: in
[DEBUG 2017-04-09 19:35:46,223 kur.model.model:275]   Uses:
[DEBUG 2017-04-09 19:35:46,223 kur.model.model:277]   Used by: ..recurrent.0
[DEBUG 2017-04-09 19:35:46,223 kur.model.model:278]   Aliases: in
[DEBUG 2017-04-09 19:35:46,223 kur.model.model:273] Assembled Node: ..recurrent.0
[DEBUG 2017-04-09 19:35:46,223 kur.model.model:275]   Uses: in
[DEBUG 2017-04-09 19:35:46,224 kur.model.model:277]   Used by: ..dropout.0
[DEBUG 2017-04-09 19:35:46,224 kur.model.model:278]   Aliases: ..recurrent.0
[DEBUG 2017-04-09 19:35:46,224 kur.model.model:273] Assembled Node: ..dropout.0
[DEBUG 2017-04-09 19:35:46,224 kur.model.model:275]   Uses: ..recurrent.0
[DEBUG 2017-04-09 19:35:46,224 kur.model.model:277]   Used by: ..dense.0
[DEBUG 2017-04-09 19:35:46,224 kur.model.model:278]   Aliases: ..dropout.0
[DEBUG 2017-04-09 19:35:46,224 kur.model.model:273] Assembled Node: ..dense.0
[DEBUG 2017-04-09 19:35:46,224 kur.model.model:275]   Uses: ..dropout.0
[DEBUG 2017-04-09 19:35:46,224 kur.model.model:277]   Used by: out
[DEBUG 2017-04-09 19:35:46,225 kur.model.model:278]   Aliases: ..dense.0
[DEBUG 2017-04-09 19:35:46,225 kur.model.model:273] Assembled Node: out
[DEBUG 2017-04-09 19:35:46,225 kur.model.model:275]   Uses: ..dense.0
[DEBUG 2017-04-09 19:35:46,225 kur.model.model:277]   Used by:
[DEBUG 2017-04-09 19:35:46,225 kur.model.model:278]   Aliases: out
[INFO 2017-04-09 19:35:46,225 kur.model.model:281] Connecting the model graph.
[DEBUG 2017-04-09 19:35:46,226 kur.model.model:312] Building node: in
[DEBUG 2017-04-09 19:35:46,226 kur.model.model:313]   Aliases: in
[DEBUG 2017-04-09 19:35:46,226 kur.model.model:314]   Inputs:
[DEBUG 2017-04-09 19:35:46,226 kur.containers.layers.placeholder:161] Creating placeholder for "in" with data type "float32".
[DEBUG 2017-04-09 19:35:46,226 kur.model.model:126] Trying to infer shape for input "in"
[DEBUG 2017-04-09 19:35:46,226 kur.model.model:144] Inferred shape for input "in": (100, 1)
[DEBUG 2017-04-09 19:35:46,249 kur.model.model:394]   Value: Tensor("in:0", shape=(?, 100, 1), dtype=float32)
[DEBUG 2017-04-09 19:35:46,250 kur.model.model:312] Building node: ..recurrent.0
[DEBUG 2017-04-09 19:35:46,250 kur.model.model:313]   Aliases: ..recurrent.0
[DEBUG 2017-04-09 19:35:46,250 kur.model.model:314]   Inputs:
[DEBUG 2017-04-09 19:35:46,250 kur.model.model:316]   - in: Tensor("in:0", shape=(?, 100, 1), dtype=float32)
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
[DEBUG 2017-04-09 19:35:46,713 kur.model.model:394]   Value: Tensor("..recurrent.0/transpose_1:0", shape=(?, ?, 256), dtype=float32)
[DEBUG 2017-04-09 19:35:46,713 kur.model.model:312] Building node: ..dropout.0
[DEBUG 2017-04-09 19:35:46,713 kur.model.model:313]   Aliases: ..dropout.0
[DEBUG 2017-04-09 19:35:46,713 kur.model.model:314]   Inputs:
[DEBUG 2017-04-09 19:35:46,713 kur.model.model:316]   - ..recurrent.0: Tensor("..recurrent.0/transpose_1:0", shape=(?, ?, 256), dtype=float32)
[DEBUG 2017-04-09 19:35:46,736 kur.model.model:394]   Value: Tensor("..dropout.0/cond/Merge:0", shape=(?, ?, 256), dtype=float32)
[DEBUG 2017-04-09 19:35:46,736 kur.model.model:312] Building node: ..dense.0
[DEBUG 2017-04-09 19:35:46,736 kur.model.model:313]   Aliases: ..dense.0
[DEBUG 2017-04-09 19:35:46,736 kur.model.model:314]   Inputs:
[DEBUG 2017-04-09 19:35:46,736 kur.model.model:316]   - ..dropout.0: Tensor("..dropout.0/cond/Merge:0", shape=(?, ?, 256), dtype=float32)
[DEBUG 2017-04-09 19:35:46,786 kur.model.model:394]   Value: Tensor("..dense.0/add:0", shape=(?, 100, 65), dtype=float32)
[DEBUG 2017-04-09 19:35:46,786 kur.model.model:312] Building node: out
[DEBUG 2017-04-09 19:35:46,786 kur.model.model:313]   Aliases: out
[DEBUG 2017-04-09 19:35:46,786 kur.model.model:314]   Inputs:
[DEBUG 2017-04-09 19:35:46,786 kur.model.model:316]   - ..dense.0: Tensor("..dense.0/add:0", shape=(?, 100, 65), dtype=float32)
[DEBUG 2017-04-09 19:35:46,800 kur.model.model:394]   Value: Tensor("out/truediv:0", shape=(?, 100, 65), dtype=float32)
[INFO 2017-04-09 19:35:46,800 kur.model.model:285] Model inputs:  in
[INFO 2017-04-09 19:35:46,800 kur.model.model:286] Model outputs: out
Traceback (most recent call last):
  File "/Users/nakul/.virtualenvs/kur/bin/kur", line 11, in <module>
    sys.exit(main())
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/__main__.py", line 385, in main
    sys.exit(args.func(args) or 0)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/__main__.py", line 62, in train
    func = spec.get_training_function()
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/kurfile.py", line 373, in get_training_function
    trainer = self.get_trainer()
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/kurfile.py", line 502, in get_trainer
    optimizer=self.get_optimizer() if with_optimizer else None
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/kurfile.py", line 517, in get_optimizer
    spec = dict(self.data['train'].get('optimizer', {}))
ValueError: dictionary update sequence element #0 has length 1; 2 is required

4. Train without optimizer

kur -vv train shakespeare.yml

[INFO 2017-04-09 21:13:32,555 kur.kurfile:754] Parsing source: shakespeare.yml, included by top-level.
[INFO 2017-04-09 21:13:32,560 kur.kurfile:87] Parsing Kurfile...
[elided: Kurfile parsing, log loading, backend setup, and model-assembly output identical to the run above, including the same TensorFlow CPU-feature warnings]
[INFO 2017-04-09 21:13:35,559 kur.model.model:285] Model inputs:  in
[INFO 2017-04-09 21:13:35,559 kur.model.model:286] Model outputs: out
[INFO 2017-04-09 21:13:35,559 kur.model.executor:591] No historical training loss available from logs.
[INFO 2017-04-09 21:13:35,560 kur.model.executor:599] No historical validation loss available from logs.
[INFO 2017-04-09 21:13:35,560 kur.model.executor:612] No previous epochs.
[DEBUG 2017-04-09 21:13:35,560 kur.model.executor:108] Recompiling the model.
[DEBUG 2017-04-09 21:13:35,560 kur.backend.keras_backend:592] Instantiating a Keras model.
[DEBUG 2017-04-09 21:13:35,561 kur.backend.keras_backend:603] _________________________________________________________________
[DEBUG 2017-04-09 21:13:35,561 kur.backend.keras_backend:603] Layer (type)                 Output Shape              Param #
[DEBUG 2017-04-09 21:13:35,561 kur.backend.keras_backend:603] =================================================================
[DEBUG 2017-04-09 21:13:35,561 kur.backend.keras_backend:603] in (InputLayer)              (None, 100, 1)            0
[DEBUG 2017-04-09 21:13:35,561 kur.backend.keras_backend:603] _________________________________________________________________
[DEBUG 2017-04-09 21:13:35,561 kur.backend.keras_backend:603] ..recurrent.0 (LSTM)         (None, 100, 256)          264192
[DEBUG 2017-04-09 21:13:35,561 kur.backend.keras_backend:603] _________________________________________________________________
[DEBUG 2017-04-09 21:13:35,561 kur.backend.keras_backend:603] ..dropout.0 (Dropout)        (None, 100, 256)          0
[DEBUG 2017-04-09 21:13:35,561 kur.backend.keras_backend:603] _________________________________________________________________
[DEBUG 2017-04-09 21:13:35,562 kur.backend.keras_backend:603] ..dense.0 (Dense)            (None, 100, 65)           16705
[DEBUG 2017-04-09 21:13:35,562 kur.backend.keras_backend:603] _________________________________________________________________
[DEBUG 2017-04-09 21:13:35,562 kur.backend.keras_backend:603] out (Activation)             (None, 100, 65)           0
[DEBUG 2017-04-09 21:13:35,562 kur.backend.keras_backend:603] =================================================================
[DEBUG 2017-04-09 21:13:35,562 kur.backend.keras_backend:603] Total params: 280,897.0
[DEBUG 2017-04-09 21:13:35,562 kur.backend.keras_backend:603] Trainable params: 280,897.0
[DEBUG 2017-04-09 21:13:35,562 kur.backend.keras_backend:603] Non-trainable params: 0.0
[DEBUG 2017-04-09 21:13:35,562 kur.backend.keras_backend:603] _________________________________________________________________
[DEBUG 2017-04-09 21:13:35,562 kur.backend.keras_backend:603]
[DEBUG 2017-04-09 21:13:35,562 kur.backend.keras_backend:645] Assembling a training function from the model.
[DEBUG 2017-04-09 21:13:35,580 kur.backend.keras_backend:575] Adding additional inputs: out
[DEBUG 2017-04-09 21:13:37,026 kur.backend.keras_backend:668] Additional inputs for log functions: out
[DEBUG 2017-04-09 21:13:37,026 kur.backend.keras_backend:685] Expected input shapes: in=(None, 100, 1), out=(None, None, None)
[DEBUG 2017-04-09 21:13:37,027 kur.backend.keras_backend:703] Compiled model: {'func': <keras.backend.tensorflow_backend.Function object at 0x11146de48>, 'names': {'input': ['in', 'out'], 'output': ['out', 'out']}, 'shapes': {'input': [(None, 100, 1), (None, None, None)]}}
[INFO 2017-04-09 21:13:37,754 kur.backend.keras_backend:736] Waiting for model to finish compiling...
[DEBUG 2017-04-09 21:13:37,754 kur.providers.batch_provider:139] Preparing next batch of data...
[DEBUG 2017-04-09 21:13:37,754 kur.providers.batch_provider:204] Next batch of data has been prepared.
[ERROR 2017-04-09 21:13:38,016 kur.model.executor:293] Exception raised during training.
Traceback (most recent call last):
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1022, in _do_call
    return fn(*args)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1004, in _run_fn
    status, run_metadata)
  File "/Users/nakul/.pyenv/versions/3.6.1/lib/python3.6/contextlib.py", line 89, in __exit__
    next(self.gen)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 469, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [2,65,1] vs. [2,100,65]
     [[Node: gradients/mul_grad/BroadcastGradientArgs = BroadcastGradientArgs[T=DT_INT32, _class=["loc:@mul"], _device="/job:localhost/replica:0/task:0/cpu:0"](gradients/mul_grad/Shape, gradients/mul_grad/Shape_1)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/model/executor.py", line 290, in train
    **kwargs
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/model/executor.py", line 725, in wrapped_train
    self.compile('train', with_provider=provider)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/model/executor.py", line 114, in compile
    **kwargs
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/backend/keras_backend.py", line 708, in compile
    self.wait_for_compile(model, key)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/backend/keras_backend.py", line 738, in wait_for_compile
    self.run_batch(model, batch, key, False)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/backend/keras_backend.py", line 780, in run_batch
    outputs = compiled['func'](inputs)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2073, in __call__
    feed_dict=feed_dict)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 767, in run
    run_metadata_ptr)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 965, in _run
    feed_dict_string, options, run_metadata)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1015, in _do_run
    target_list, options, run_metadata)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1035, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [2,65,1] vs. [2,100,65]
     [[Node: gradients/mul_grad/BroadcastGradientArgs = BroadcastGradientArgs[T=DT_INT32, _class=["loc:@mul"], _device="/job:localhost/replica:0/task:0/cpu:0"](gradients/mul_grad/Shape, gradients/mul_grad/Shape_1)]]

Caused by op 'gradients/mul_grad/BroadcastGradientArgs', defined at:
  File "/Users/nakul/.virtualenvs/kur/bin/kur", line 11, in <module>
    sys.exit(main())
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/__main__.py", line 385, in main
    sys.exit(args.func(args) or 0)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/__main__.py", line 63, in train
    func(step=args.step)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/kurfile.py", line 414, in func
    return trainer.train(**defaults)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/model/executor.py", line 290, in train
    **kwargs
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/model/executor.py", line 725, in wrapped_train
    self.compile('train', with_provider=provider)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/model/executor.py", line 114, in compile
    **kwargs
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/backend/keras_backend.py", line 653, in compile
    compiled.trainable_weights, total_loss
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/optimizer/optimizer.py", line 47, in optimize
    return keras_optimizer.get_updates(weights, [], loss)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/keras/optimizers.py", line 381, in get_updates
    grads = self.get_gradients(loss, params)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/keras/optimizers.py", line 47, in get_gradients
    grads = K.gradients(loss, params)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2108, in gradients
    return tf.gradients(loss, variables, colocate_gradients_with_ops=True)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/ops/gradients_impl.py", line 482, in gradients
    in_grads = grad_fn(op, *out_grads)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/ops/math_grad.py", line 610, in _MulGrad
    rx, ry = gen_array_ops._broadcast_gradient_args(sx, sy)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 411, in _broadcast_gradient_args
    name=name)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
    op_def=op_def)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2395, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1264, in __init__
    self._traceback = _extract_stack()

...which was originally created as op 'mul', defined at:
  File "/Users/nakul/.virtualenvs/kur/bin/kur", line 11, in <module>
    sys.exit(main())
[elided 5 identical lines from previous traceback]
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/model/executor.py", line 114, in compile
    **kwargs
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/backend/keras_backend.py", line 650, in compile
    self.process_loss(model, loss)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/backend/keras_backend.py", line 566, in process_loss
    self.find_compiled_layer_by_name(model, target)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/loss/categorical_crossentropy.py", line 36, in get_loss
    return keras_wrap(model, target, output, 'categorical_crossentropy')
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/loss/loss.py", line 37, in keras_wrap
    out = loss(ins[0][1], output)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/keras/losses.py", line 37, in categorical_crossentropy
    return K.categorical_crossentropy(y_pred, y_true)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2552, in categorical_crossentropy
    return - tf.reduce_sum(target * tf.log(output),
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 884, in binary_op_wrapper
    return func(x, y, name=name)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 1105, in _mul_dispatch
    return gen_math_ops._mul(x, y, name=name)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/ops/gen_math_ops.py", line 1625, in _mul
    result = _op_def_lib.apply_op("Mul", x=x, y=y, name=name)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
    op_def=op_def)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2395, in create_op
    original_op=self._default_original_op, op_def=op_def)

InvalidArgumentError (see above for traceback): Incompatible shapes: [2,65,1] vs. [2,100,65]
     [[Node: gradients/mul_grad/BroadcastGradientArgs = BroadcastGradientArgs[T=DT_INT32, _class=["loc:@mul"], _device="/job:localhost/replica:0/task:0/cpu:0"](gradients/mul_grad/Shape, gradients/mul_grad/Shape_1)]]

[elided: the same InvalidArgumentError traceback as above, repeated as the exception propagates to the top level]

I'm unsure how to fix these errors. The Kurfile above is not my final model; ideally, I would like to implement the LSTM model below:

[image: screenshot of the desired Keras LSTM model]

Thanks

ajsyp commented 7 years ago

Hi! There are a few things going on here.

Train with Optimizer

You're right that the optimizer should have been parsed properly. This is a bug that was fixed in 7e2d3f6, which you can get with a bleeding-edge git checkout of Kur. If you don't want to update, a workaround is to replace this:

optimizer: adam

with this:

optimizer:
  name: adam
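
The crash comes from the line shown in your traceback, spec = dict(self.data['train'].get('optimizer', {})): with the bare-string form, dict() iterates the characters of 'adam', which is exactly the ValueError you hit. A quick illustration in a Python shell:

>>> dict('adam')            # what `optimizer: adam` hands to dict()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: dictionary update sequence element #0 has length 1; 2 is required
>>> dict({'name': 'adam'})  # what the mapping form hands to dict()
{'name': 'adam'}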

Train without Optimizer

This looks like a shape issue with your data. Your inputs have shape (1115294, 100, 1): just over a million samples, each of shape (100, 1). Perfect; that matches what you explicitly specified in your Kurfile. Now let's watch how the shape changes through the model. The first layer is the RNN, with size 256, which turns each (100, 1) sample into a (100, 256) sample. Next you pass it through a dropout layer, which doesn't affect shapes. Then you hand it to a dense layer of size 65, which maps each (100, 256) sample to a (100, 65) sample. Finally, the activation layer doesn't change shapes either. The problem? This output, (100, 65), does not match the target in your data, which is a single one-hot vector of shape (65, ) per sample; that mismatch is exactly the "Incompatible shapes: [2,65,1] vs. [2,100,65]" in your traceback.
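
Here is the same shape walk sketched directly in Keras, purely as an illustration (sizes copied from your Kurfile; this is not literally what Kur builds internally):

from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense, Activation

model = Sequential()
model.add(LSTM(256, return_sequences=True, input_shape=(100, 1)))  # (None, 100, 256)
model.add(Dropout(0.2))                                            # (None, 100, 256)
model.add(Dense(65))                                               # (None, 100, 65), applied per time step
model.add(Activation('softmax'))                                   # (None, 100, 65)
model.summary()  # the model emits (None, 100, 65), but each target in the data is (65,)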

The Solution

Looking at your preparation script, you are trying to predict the next character given the preceding 100 characters. You probably wanted the RNN to emit only the last vector of the sequence (equivalent to Keras' return_sequences=False); your dense layer of size 65 already matches the one-hot output dimension (n_vocab = 65), so that part can stay. If that's true, then this is the model section you wanted:

model:
  - input:
      shape: [100, 1]
    name: in
  # Sample shape is currently (100, 1)
  - recurrent:
      size: 256
      type: lstm
      sequence: no
  # Sample shape is currently (256, )
  - dropout: 0.2
  # Sample shape is currently (256, )
  - dense: 65
  # Sample shape is currently (65, )
  - activation: softmax
    name: out

There! Now everything is as you want it.

Even More

You mentioned that this wasn't the model you really wanted, so let me help you out by writing the Keras model as a Kur model:

model:
  - input:
      shape: [100, 1]
    name: in
  - recurrent:
      type: lstm
      size: 512
  - dropout: 0.2
  - recurrent:
      size: 512
      type: lstm
      sequence: no
  - dropout: 0.2
  - dense: 65
  - activation: softmax
  - output: out

There! Note that Kur can infer shapes, so you could have simplified the input section down to input: in (dropping the name and shape entries).
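
For reference, here is my best guess at the Keras code in your screenshot, reconstructed from the Kur translation above (the exact layer sizes are assumptions):

from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense, Activation

model = Sequential()
model.add(LSTM(512, return_sequences=True, input_shape=(100, 1)))  # pass the full sequence onward
model.add(Dropout(0.2))
model.add(LSTM(512))                                               # return_sequences=False: last step only
model.add(Dropout(0.2))
model.add(Dense(65))
model.add(Activation('softmax'))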

nakulcr7 commented 7 years ago

Thank you so much for the clear explanation. I'm a beginner and this makes much more sense now. I'll continue working with this over the weekend :)