tkipf / relational-gcn

Keras-based implementation of Relational Graph Convolutional Networks
MIT License

A question about model.fit in keras #7

Open zhixiaochuan12 opened 5 years ago

zhixiaochuan12 commented 5 years ago

Hi, I want to extend your code to a binary link prediction task. train_X holds the indices of nodes in the graph and has shape (train_size, 2), where each row is a pair [node1_idx, node2_idx]; train_label has shape (train_size,).

After the GraphConvolution layer (whose output is gc_output), I added a DistMult layer to calculate the triple score e1 * R * e2, implemented as gc_output[train_idx[:,0]] * R * gc_output[train_idx[:,1]]. The full model (named model) consists of the GraphConvolution layer and the DistMult layer.
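For illustration, the scoring step described above can be sketched in plain NumPy. This is only a sketch under assumptions: it treats R as the diagonal of a single relation matrix stored as a vector (as DistMult does), sums over the embedding dimension to get one score per pair, and uses toy values for gc_output, R, and train_idx. The helper name distmult_scores is hypothetical and not part of the repository.

```python
import numpy as np

def distmult_scores(node_emb, rel_diag, pairs):
    """Score each (node1, node2) index pair as sum_k e1[k] * r[k] * e2[k].

    node_emb: (num_nodes, dim) node embeddings (stand-in for gc_output)
    rel_diag: (dim,) diagonal of the relation matrix (assumed diagonal)
    pairs:    (num_pairs, 2) integer node indices
    """
    e1 = node_emb[pairs[:, 0]]  # (num_pairs, dim)
    e2 = node_emb[pairs[:, 1]]  # (num_pairs, dim)
    return np.sum(e1 * rel_diag * e2, axis=1)  # (num_pairs,)

# Toy data: 4 nodes with 3-dimensional embeddings and one relation.
gc_output = np.arange(12, dtype=float).reshape(4, 3)
R = np.array([1.0, 0.5, 2.0])
train_idx = np.array([[0, 1], [2, 3]])

scores = distmult_scores(gc_output, R, train_idx)
```

Note that the result has one entry per pair (train_size), while the inputs to the graph convolution have one row per node. That mismatch in the first dimension is what the error below complains about.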

The problem is that when I fit the model with model.fit([X]+A, y_train), Keras throws the error ValueError: Input arrays should have the same number of samples as target arrays. Found 74940 input samples and 11519 target samples. The sample_weight approach used in your code does not seem to help here. My friends suggested that PyTorch can do this easily, but rewriting would cost more time.

I am new to Keras, and I would really appreciate any suggestions on this problem.

tkipf commented 5 years ago

Yes, this is unfortunately not possible in Keras without significantly rewriting core parts of the framework, due to some internal limitations in how Keras handles tensor shapes. I would recommend going with this recent framework in PyTorch (they have an R-GCN link prediction example): https://github.com/dmlc/dgl


zhixiaochuan12 commented 5 years ago

Thanks for replying.

In the end I repeated y_train so that x_train and y_train have the same number of samples, and wrote a new loss function to go with it.
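The exact code for this workaround was not posted, so the following is only a rough NumPy sketch of one way such an approach could look. It is a variation on the idea rather than the method actually used: instead of repeating y_train, it pads the labels with a sentinel value up to the number of input samples and has the loss ignore the padded entries. The names pad_labels and masked_binary_crossentropy, the sentinel -1.0, and the toy sizes are all assumptions for illustration. In a real Keras model the same logic would have to be written with backend tensor operations.

```python
import numpy as np

def pad_labels(y, num_samples, pad_value=-1.0):
    """Pad y to length num_samples; pad_value marks entries to ignore.

    Assumes len(y) <= num_samples, as in the issue (11519 vs. 74940).
    """
    padded = np.full(num_samples, pad_value, dtype=float)
    padded[: len(y)] = y
    return padded

def masked_binary_crossentropy(y_padded, scores_padded, pad_value=-1.0, eps=1e-7):
    """Mean binary cross-entropy over the non-padding entries only."""
    mask = y_padded != pad_value
    p = 1.0 / (1.0 + np.exp(-scores_padded[mask]))  # sigmoid of raw scores
    p = np.clip(p, eps, 1.0 - eps)                   # avoid log(0)
    y = y_padded[mask]
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))

# Toy sizes: 6 input samples (nodes) but only 3 labelled pairs.
num_nodes = 6
y_train = np.array([1.0, 0.0, 1.0])
scores = np.array([2.0, -1.0, 0.5, 9.9, 9.9, 9.9])  # last three are padding

y_padded = pad_labels(y_train, num_nodes)
loss = masked_binary_crossentropy(y_padded, scores)
```

With this layout the targets have the same first dimension as the inputs, which satisfies the sample-count check, while the padded positions contribute nothing to the loss.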