rossumai / keras-multi-gpu

Multi-GPU data-parallel training in Keras

loss stuck when using multi_gpu #4

Open · burgalon opened this issue 7 years ago

burgalon commented 7 years ago

I'm trying to use make_parallel() with Keras Xception and a generator that yields samples from two classes, with batch_size=2.
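Roughly, my setup looks like this (a simplified sketch; the classification head, optimizer, and generator are placeholders, and I'm assuming the kuza55-style make_parallel(model, gpu_count) signature):

```python
# Sketch of the setup described above; head, optimizer, and generator
# names are placeholders, not the actual training script.
from keras.applications.xception import Xception
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model

from multi_gpu import make_parallel  # make_parallel() from this repo

base = Xception(weights=None, include_top=False, input_shape=(299, 299, 3))
features = GlobalAveragePooling2D()(base.output)
predictions = Dense(2, activation='softmax')(features)  # two classes
model = Model(inputs=base.input, outputs=predictions)

model = make_parallel(model, gpu_count=2)  # replicate across 2 GPUs
model.compile(optimizer='sgd', loss='categorical_crossentropy',
              metrics=['accuracy'])

# train_generator yields (images, one-hot labels) in batches of size 2
model.fit_generator(train_generator, steps_per_epoch=100, epochs=2)
```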

When using one GPU without make_parallel, the model reaches loss=0 and acc=1 within 2 epochs. However, when using make_parallel with gpus=2, the model gets stuck at acc=0.5 with loss=8.0591.

I'm guessing this is somehow related to the loss being aggregated from only one GPU instead of both, but I'm not sure why.
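For context, my understanding of the kuza55-style pattern this repo follows is that each input batch is sliced across the GPUs and the per-tower outputs are concatenated back on the CPU, so a single loss is computed on the merged predictions; with batch_size=2 and gpus=2 each tower would only ever see one sample per step. A simplified sketch of that pattern (illustrative only, not the exact code in this repo):

```python
# Simplified sketch of kuza55-style data parallelism (illustrative,
# not the exact implementation in this repo).
import tensorflow as tf
from keras.layers import Input, Lambda, concatenate
from keras.models import Model

def make_parallel_sketch(model, gpu_count):
    with tf.device('/cpu:0'):
        x = Input(shape=model.input_shape[1:])
    towers = []
    for g in range(gpu_count):
        with tf.device('/gpu:%d' % g):
            def slice_batch(t, parts=gpu_count, idx=g):
                # Each tower gets batch_size // gpu_count samples, so
                # batch_size=2 on 2 GPUs means one sample per tower.
                size = tf.shape(t)[0] // parts
                return t[idx * size:(idx + 1) * size]
            towers.append(model(Lambda(slice_batch)(x)))
    with tf.device('/cpu:0'):
        # Per-tower predictions are concatenated, so the loss is
        # computed once on the merged batch, not averaged per tower.
        merged = concatenate(towers, axis=0)
    return Model(inputs=x, outputs=merged)
```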

When training on 4 classes with batch_size=4, training reaches acc=0.97 after 11 epochs, while a single GPU reaches acc=1 within 2 epochs.

Any ideas?

burgalon commented 7 years ago

also posted here https://github.com/fchollet/keras/issues/8200