keras-team / keras

Deep Learning for humans
http://keras.io/

How does keras build batches depending on the batch-size? #8627

Closed: stoney95 closed this issue 6 years ago

stoney95 commented 6 years ago

I'm trying to implement a sentiment classifier using Keras, but I've run into problems with the batch_size parameter. It might be a rather specific problem, and I didn't find anything online that helped me.

Here is an example of how the model could look:

Input_1                 Input_2
(None, 200)             (None, 200)
     |                       |
     |                       |
Embedding_1             Embedding_2
(None, 200, 200)        (None, 200, 200)
     |                       |
     |                       |
Reshape_1               Reshape_2
(32, 200, 200)          (32, 200, 200)
      \                     /
       \                   /
        \                 /
          Concatenate
         (32, 200, 400)
               |
               |
              ...

Reshaping is done by a Lambda layer using the reshape function from keras.backend (imported here as bd):

output = Lambda(lambda x: bd.reshape(x, (batch_size, 200, em_dim)), name='Reshape_Batch_size')(embedded)
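
Put together, the relevant part of the model is built like this (condensed; bd is my alias for keras.backend, and vocab_size stands in for my real vocabulary size):

from keras import backend as bd
from keras.layers import Input, Embedding, Lambda, Concatenate

batch_size, seq_len, em_dim = 32, 200, 200
vocab_size = 20000  # placeholder for the real vocabulary size

input_1 = Input(shape=(seq_len,), name='Input_1')
input_2 = Input(shape=(seq_len,), name='Input_2')

embedded_1 = Embedding(vocab_size, em_dim, name='Embedding_1')(input_1)
embedded_2 = Embedding(vocab_size, em_dim, name='Embedding_2')(input_2)

# Fix the batch dimension to batch_size so Concatenate sees fully
# defined, identical shapes on both inputs.
reshaped_1 = Lambda(lambda x: bd.reshape(x, (batch_size, seq_len, em_dim)),
                    name='Reshape_1')(embedded_1)
reshaped_2 = Lambda(lambda x: bd.reshape(x, (batch_size, seq_len, em_dim)),
                    name='Reshape_2')(embedded_2)

# (32, 200, 200) + (32, 200, 200) -> (32, 200, 400)
merged = Concatenate(axis=-1, name='Concatenate')([reshaped_1, reshaped_2])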

I reshape the outputs of both embeddings (also using the batch_size) because the Keras Concatenate layer needs tensors of equal shape on every axis except the one you concatenate on. To make the reshape work, I adjust the number of samples to a multiple of the batch_size. Training then runs fine until the last batch of the epoch, where I get an error that the number of input values does not match the expected number:

InvalidArgumentError (see above for traceback): Input to reshape is a tensor with 320000 values, but the requested shape has 1280000 [[Node: Reshape_Batch_size/Reshape = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Embedding_basic_em/Gather, Reshape_Batch_size/Reshape/shape)]]

320000 = 8 * 200 * 200
1280000 = 32 * 200 * 200

This looks like Keras is cutting 24 samples from the last batch, leaving only 8 of the 32. But the training output tells me there are 32 samples remaining:

Epoch 1/8 17440/17472 [============================>.] - ETA: 0s - loss: 5.1635 - acc: 0.6796
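
My best guess is that fit slices the sample indices into consecutive chunks of batch_size, with the last chunk simply taking whatever is left over, roughly like this sketch (my guess, not the actual Keras code):

def make_batches(n_samples, batch_size):
    # Consecutive (start, end) index pairs; the final pair is clipped
    # to n_samples, so the last batch can be smaller than batch_size.
    return [(start, min(start + batch_size, n_samples))
            for start in range(0, n_samples, batch_size)]

print(make_batches(100, 32))
# -> [(0, 32), (32, 64), (64, 96), (96, 100)]: the last batch has only 4 samples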

Does anybody know how Keras builds batches from the given batch_size parameter, or what I'm missing here? Or maybe someone has a hint on how I can do the concatenation without reshaping the outputs.

My configuration: I'm working on macOS Sierra 10.12.6, using Python 3.5.3 from Anaconda 4.4.0 (x86_64). As the Keras backend I'm using TensorFlow 1.4.0; Keras is on version 2.1.1.

Thanks in advance

dillongardner commented 6 years ago

Can you provide the code for the full model?

stoney95 commented 6 years ago

Already figured it out. I just had to delete the reshape layers.
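
For completeness, the working part now looks roughly like this: the embeddings are concatenated directly, and Concatenate handles the undefined (None) batch dimension just fine (vocab_size again stands in for my real vocabulary size):

from keras.layers import Input, Embedding, Concatenate

seq_len, em_dim = 200, 200
vocab_size = 20000  # placeholder for the real vocabulary size

input_1 = Input(shape=(seq_len,), name='Input_1')
input_2 = Input(shape=(seq_len,), name='Input_2')

embedded_1 = Embedding(vocab_size, em_dim, name='Embedding_1')(input_1)
embedded_2 = Embedding(vocab_size, em_dim, name='Embedding_2')(input_2)

# Both tensors are (None, 200, 200); concatenating on the last axis
# yields (None, 200, 400) without ever fixing the batch dimension.
merged = Concatenate(axis=-1, name='Concatenate')([embedded_1, embedded_2])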