kimhc6028 / relational-networks

Pytorch implementation of "A simple neural network module for relational reasoning" (Relational Networks)
https://arxiv.org/pdf/1706.01427.pdf
BSD 3-Clause "New" or "Revised" License

Thanks. I have reproduced your result, but I wonder about the result in the paper #6

Closed robotzheng closed 7 years ago

robotzheng commented 7 years ago

Train Epoch: 18 [193280/196000 (99%)] Relations accuracy: 95% | Non-relations accuracy: 100%
Train Epoch: 18 [194560/196000 (99%)] Relations accuracy: 86% | Non-relations accuracy: 100%
Train Epoch: 18 [195840/196000 (100%)] Relations accuracy: 89% | Non-relations accuracy: 100%

Test set: Relation accuracy: 89% | Non-relation accuracy: 100%

Train Epoch: 19 [192000/196000 (98%)] Relations accuracy: 94% | Non-relations accuracy: 100%
Train Epoch: 19 [193280/196000 (99%)] Relations accuracy: 80% | Non-relations accuracy: 100%
Train Epoch: 19 [194560/196000 (99%)] Relations accuracy: 89% | Non-relations accuracy: 100%
Train Epoch: 19 [195840/196000 (100%)] Relations accuracy: 91% | Non-relations accuracy: 100%

Test set: Relation accuracy: 90% | Non-relation accuracy: 99%

Train Epoch: 20 [193280/196000 (99%)] Relations accuracy: 91% | Non-relations accuracy: 100%
Train Epoch: 20 [194560/196000 (99%)] Relations accuracy: 97% | Non-relations accuracy: 100%
Train Epoch: 20 [195840/196000 (100%)] Relations accuracy: 95% | Non-relations accuracy: 100%

Test set: Relation accuracy: 89% | Non-relation accuracy: 99%

robotzheng commented 7 years ago

I have reproduced your result, but I wonder why the result in the paper is "above 94% for both relational and non-relational questions". Is it because of different CNNs?

kimhc6028 commented 7 years ago

There are several differences from the original paper.

  1. Model architecture. Because the model used in the original paper for the Sort-of-CLEVR task was too big, I used the much lighter model that the paper uses for the CLEVR task. This change may affect the result, but it lets more people test RN faster.

  2. The dataset is not actually identical either. The Sort-of-CLEVR dataset has not been published, so I had to design it myself. I found no clue about the actual size of the answer vector (if you find any, please let me know). Unless the original dataset is published, I suspect it is quite hard to reproduce the result in the paper.

robotzheng commented 7 years ago

Good job! Thanks for your reply. I'll run it on CLEVR_v1.0 and CLEVR-CoGenT_v1.0. Any suggestions for the model architecture details?

kimhc6028 commented 7 years ago

Wow, good luck with the CLEVR task!

  1. The image size is bigger for the CLEVR task. The features after the CNN would be bigger, so the RN needs to be modified.

  2. coord_lst also assumes 5 * 5 objects, so it needs to be modified as well.

  3. The first input layer of the RN assumes the question is a vector of size 11. I forgot the size of the question vector in the CLEVR task, but you also need to modify it.

  4. The learning rate should be changed to 2.5e-4?
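
The size changes in points 1 to 3 can be sketched in plain Python. This is only an illustration, not the repository's code: the helpers `make_coords`, `rn_input_size`, and `num_pairs` are hypothetical names, the channel count of 24 is an assumed example value, and the sketch assumes every cell of a d * d feature map is treated as one object and that all ordered pairs (including an object paired with itself) are fed to the RN.

```python
from typing import List, Tuple


def make_coords(d: int) -> List[Tuple[float, float]]:
    """Return d * d (row, col) coordinates, each scaled to [-1, 1].

    Hypothetical helper: generalizes a hard-coded 5 * 5 grid to d * d.
    """
    if d < 2:
        raise ValueError("d must be at least 2")
    half = (d - 1) / 2.0
    return [((i // d - half) / half, (i % d - half) / half) for i in range(d * d)]


def rn_input_size(channels: int, question_size: int, coord_dims: int = 2) -> int:
    """Width of one pair vector fed to the first RN layer.

    Each of the two objects contributes its CNN channels plus its coordinates,
    and the question vector is appended once.
    """
    return 2 * (channels + coord_dims) + question_size


def num_pairs(d: int) -> int:
    """Number of ordered object pairs for a d * d feature map (assumption: all pairs)."""
    return (d * d) ** 2


# The 5 * 5 case with a question vector of size 11; 24 channels is an assumed value.
coords = make_coords(5)
print(len(coords), coords[0], coords[12], coords[24])
print(rn_input_size(channels=24, question_size=11), num_pairs(5))
```

For a larger CLEVR feature map, only `d`, the channel count, and the question size would change. The number of pairs grows with the fourth power of `d`, which is worth keeping in mind for memory use.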

Also, adopting an LSTM for the question input should make it possible to run the CLEVR task. That is all that comes to mind.

I would appreciate it if you shared your CLEVR results! Thank you.

saharudra commented 6 years ago

Hi @robotzheng, were you able to replicate the results for the CLEVR task? I am currently implementing my own version of the paper in PyTorch for CLEVR.