It appears for the placeholders `model_1_sample_weights` and `model_1_target`. It seems all calls to `K.placeholder` have `shape` fully specified, while in these two cases only `ndim` is given, which results in shape `(None,)` or `(None, None)`.
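As a small illustration of the difference (not from the original report), creating a placeholder with only `ndim` leaves every dimension dynamic:

```python
from keras import backend as K

# shape fully specified except the batch dimension:
p1 = K.placeholder(shape=(None, 32))   # -> shape (None, 32)

# only the rank given, as for targets/sample weights in compile():
p2 = K.placeholder(ndim=2)             # -> shape (None, None)
p3 = K.placeholder(ndim=1)             # -> shape (None,)
```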
A hypothesis is that it might be caused by a mismatch between the sizes of predictions and targets within each replica (tower). Currently we provide inputs and targets of full mini-batch size, but extract slices and compute tower predictions of sub-batch size. Since we compute the loss within each tower (in contrast to the baseline solution, `make_parallel()`), the sizes of predictions and targets might differ. We would have to slice the targets/sample weights as well (see the sketch below). Another option would be to perform the sub-batch slicing in Keras and feed slices to each tower separately.
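A minimal sketch of per-tower slicing, assuming a TF 1.x graph; the `get_slice` helper is hypothetical, mirroring the input slicing that `make_parallel()` does, here applied to targets and sample weights as well:

```python
import tensorflow as tf

def get_slice(data, index, parts):
    """Return the sub-batch of `data` belonging to tower `index` out of `parts`."""
    shape = tf.shape(data)
    batch_size = shape[:1]
    input_shape = shape[1:]
    step = batch_size // parts
    if index == parts - 1:
        # the last tower takes any remainder of the batch
        size = batch_size - step * index
    else:
        size = step
    size = tf.concat([size, input_shape], axis=0)
    stride = tf.concat([step, input_shape * 0], axis=0)
    start = stride * index
    return tf.slice(data, start, size)

# Each tower i would then see consistently sized sub-batches, e.g.:
#   tower_targets = get_slice(targets, i, num_towers)
#   tower_weights = get_slice(sample_weights, i, num_towers)
```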
It seems that there are some placeholders that are not assigned in `session.run()` via `feed_dict`.
Placeholders:
```
>>> [op for op in g.get_operations() if op.type == 'Placeholder']
[<tf.Operation 'input_1' type=Placeholder>,
<tf.Operation 'dropout_1/keras_learning_phase' type=Placeholder>,
<tf.Operation 'replica_0_1/model_1_sample_weights' type=Placeholder>,
<tf.Operation 'replica_0_1/model_1_target' type=Placeholder>,
<tf.Operation 'replica_1_1/model_1_sample_weights' type=Placeholder>,
<tf.Operation 'replica_1_1/model_1_target' type=Placeholder>,
<tf.Operation 'concatenate_1_sample_weights' type=Placeholder>,
<tf.Operation 'concatenate_1_target' type=Placeholder>]
```
The above error is raised when a placeholder with dynamic dimensions (marked as `?` or `None`) is not assigned a value. The error from incompatible shapes looks different (see the small experiment below).
In `Model.compile()`, placeholders for `sample_weights` and `targets` are created. Since we call `compile()` both for the replicas and for the wrapping model, we create several sets of these placeholders. However, during training we call `fit()` only on the wrapper model and thus never feed values to the placeholders in the replica models.
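This is easy to observe in isolation: every `compile()` call creates its own set of target/sample-weight placeholders (illustrative sketch, assuming the TF backend; layer/op names may differ):

```python
import tensorflow as tf
from keras.layers import Dense, Input
from keras.models import Model

inputs = Input(shape=(10,))
model = Model(inputs, Dense(1, name='out')(inputs))
model.compile(optimizer='sgd', loss='mse')

graph = tf.get_default_graph()
print([op.name for op in graph.get_operations()
       if op.type == 'Placeholder'])
# -> e.g. ['input_1', 'out_sample_weights', 'out_target']
# Compiling each replica *and* the wrapper repeats this, but fit() on the
# wrapper only feeds the wrapper's placeholders; the replicas' stay unfed.
```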
Running on 2 GPUs (GTX 1070):