XifengGuo / CapsNet-Keras

A Keras implementation of CapsNet in NIPS2017 paper "Dynamic Routing Between Capsules". Now test error = 0.34%.
MIT License

Data representation [x,y],[y,x] for training #78

Closed: vandana-rajan closed this issue 6 years ago

vandana-rajan commented 6 years ago

In the 'train' function, why is the data passed as [x,y],[y,x]? For example:

  1. In the 'train_generator' function:

    yield ([x_batch, y_batch], [y_batch, x_batch])

  2. In the 'fit_generator' call:

    validation_data=[[x_test, y_test], [y_test, x_test]]

  3. In the 'model.fit' call:

    model.fit([x_train, y_train], [y_train, x_train], .....

I am new to the Keras framework. Please help me understand why the data is passed like this.

XifengGuo commented 6 years ago

@vandana-rajan We use model.fit(X, Y) to train a one-in-one-out model and model.fit([X1, X2], [Y1, Y2]) to train a two-in-two-out model. The CapsNet model is two-in-two-out, so the inputs are [X1, X2] = [x_train, y_train] and the outputs (also the targets) are [Y1, Y2] = [y_train, x_train].

Please refer to https://keras.io/getting-started/functional-api-guide/#multi-input-and-multi-output-models for more details.
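To make the layout concrete, here is a minimal sketch of a batch generator that yields data in this two-in-two-out form. It is a simplified stand-in, not the generator from this repo: the toy data `x_demo` and `y_demo` and the plain slicing loop are assumptions made for illustration. The first element of each yielded tuple is the list of model inputs, the second is the list of targets, in the same order as the model's outputs.

```python
def train_generator(x, y, batch_size):
    """Yield (inputs, targets) pairs for a two-in-two-out model.

    inputs  = [x_batch, y_batch]  -> image and label are both fed in
    targets = [y_batch, x_batch]  -> label for the classifier output,
                                     image for the reconstruction output
    """
    for start in range(0, len(x), batch_size):
        x_batch = x[start:start + batch_size]
        y_batch = y[start:start + batch_size]
        yield ([x_batch, y_batch], [y_batch, x_batch])


# Hypothetical toy data: five "images" and five labels.
x_demo = [[0.0], [0.1], [0.2], [0.3], [0.4]]
y_demo = [0, 1, 0, 1, 0]

batches = list(train_generator(x_demo, y_demo, batch_size=2))
inputs, targets = batches[0]
```

The slicing works the same way on NumPy arrays, which is what Keras would normally receive.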

vandana-rajan commented 6 years ago

@XifengGuo

Okay. I went through the link and more or less understood multi-input-multi-output models in Keras. But where exactly in capsnet does this 2-input stage occur? As I understand it, the architecture of capsnet is like this:

Input->conv2d layer->primary caps->digitcaps->FCN1->FCN2->FCN3

Can you tell me where in the code I can observe this 2-input/2-output stage? (In the code at the link you gave, keras.layers.concatenate marks the point where the 2 inputs meet.)

Apologies if these are silly questions.

JoyJulianGomes commented 6 years ago

@vandana-rajan I was stuck on this issue too. As far as I understand, the capsnet model predicts a label and a reconstructed image: two outputs, each with its own loss function. If you look at the image of the model in result/model.png, you can see there are two inputs, input_1 and input_2, and two outputs, decoder_sequential and capsnet, hence a two-in-two-out model.
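As for why the label is an input at all: as I understand the paper, during training the true label is used to mask out every digit capsule except the correct one before the decoder reconstructs the image, while the class prediction comes from the lengths of the capsule vectors. Below is a minimal NumPy sketch of that idea only. It is not the code from this repo; the function names `mask_by_label` and `capsule_lengths` and the toy shapes are assumptions made for illustration.

```python
import numpy as np


def mask_by_label(digitcaps, y_onehot):
    """Keep only the capsule vector of the labelled class, then flatten.

    digitcaps: (batch, n_class, dim_capsule) capsule outputs
    y_onehot:  (batch, n_class) one-hot labels (the second model input)
    """
    masked = digitcaps * y_onehot[:, :, np.newaxis]
    return masked.reshape(len(digitcaps), -1)


def capsule_lengths(digitcaps):
    """Class scores: the L2 norm of each capsule vector."""
    return np.linalg.norm(digitcaps, axis=-1)


# Toy values (hypothetical): 2 samples, 3 classes, 4-dimensional capsules.
digitcaps = np.arange(24, dtype=float).reshape(2, 3, 4)
y_onehot = np.array([[0.0, 1.0, 0.0],
                     [1.0, 0.0, 0.0]])

decoder_input = mask_by_label(digitcaps, y_onehot)  # fed to the decoder
class_scores = capsule_lengths(digitcaps)           # compared against y
```

So the image flows through the capsule layers to both outputs, and the label joins just before the decoder, which would explain why both x and y appear on the input side and on the target side.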

vandana-rajan commented 6 years ago

Okay, I understand it now. Thanks @XifengGuo and @JoyJulianGomes