Closed: ThierryDeruyttere closed this issue 5 years ago
Thanks a lot! :)
Since I made the visualization you mention from the paper, the code has gone through refactoring and clean-up, so it's not strictly the exact same code I used then, but the current version should work just as well. The command is:
python main.py --expName "experiment-interpretable" --train --testedNum 10000 --epochs 25 --netLength 4 @configs/args1.txt
args1.txt is the exact configuration described in the paper, and 3-6 cells (netLength) should produce the most interpretable attentions (whereas 8-16 cells give slightly higher overall accuracy but are naturally less interpretable, since there are more steps).
Hope it helps!
Hi @dorarad , thanks for the quick answer
I had already tested it with those exact arguments, but just wanted to be sure. In my first test the validation accuracy was quite low, so I took a fresh clone of the repo and retrained the network with the command you provided. Currently, at epoch 24, I have the following values:
Training Loss: 0.8928986946372279, Training accuracy: 0.5416270827112998
Training EMA Loss: 0.8324600319729415, Training EMA accuracy: 0.5857
Validation Loss: 1.0158769098195162, Validation accuracy: 0.4699
Are these values ok? The validation accuracy seems quite low compared to what was reported in the paper. Could I maybe ask you to check this too? Thanks in advance!
PS: Could you maybe also reopen this issue? :) thanks!
Edit: Here are the values for epoch 25:
Training Loss: 0.8776571237122207, Training accuracy: 0.5537972739571622
Training EMA Loss: 0.8108495558482388, Training EMA accuracy: 0.6003
Validation Loss: 1.0294084859616828, Validation accuracy: 0.4673
After further thinking, maybe reopening this issue was not the best idea. Do you prefer that I create a new issue for this?
Hi, sorry for closing! The values are definitely not what they are supposed to be; it looks like a bug with one of the settings in args1, so I will look into that! In the meantime, I think it may be worth trying the standard args.txt. I'm pretty sure the attention maps should be good for those settings as well (although I will have to verify).
I will give the standard args a try and report back :)
awesome thanks a lot! :)
So the network has already trained for 6 epochs, and these are the current scores:
Training Loss: 0.9706893796575977, Training accuracy: 0.4714688373674443
Training EMA Loss: 0.9560210005251947, Training EMA accuracy: 0.485
Validation Loss: 0.962439235051473, Validation accuracy: 0.4775
So I think some issue might have crept in during the refactoring. Could you please check this on your end?
Alright, thanks a lot for pointing that out! Clearly something got messed up; I will check it and get back to you as soon as possible! (It worked for sure with the standard configuration args.txt a few months ago, after the refactoring, and I haven't made many changes since, so I believe it shouldn't be too hard to find the problem. Will check.)
I know 😄 I had a working copy, but then I had to move servers and lost it. By the way, I've noticed that in the previous version (the one that worked) the lr always stayed at 0.0001, whereas now it went down to 1.25e-5 at epoch 25, so maybe the error is there.
Thanks for the info! The lr gets reduced if val scores don't improve, so that's not the reason, but I'm sure I'll find the problem in no time! :)
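The reduce-on-plateau behavior described above can be sketched as follows. The halving factor and per-epoch check are illustrative assumptions, not necessarily the repo's exact settings, but halving three times is consistent with the observed drop from 0.0001 to 1.25e-5:

```python
# Minimal sketch of a reduce-on-plateau learning-rate schedule.
# The decay factor (0.5) is an illustrative assumption, not
# necessarily the repo's exact setting.
def update_lr(lr, best_val, val_acc, factor=0.5):
    """Halve the lr when validation accuracy fails to improve."""
    if val_acc > best_val:
        return lr, val_acc          # improvement: keep lr, update best
    return lr * factor, best_val    # plateau: decay lr

lr, best = 1e-4, 0.0
for val_acc in [0.46, 0.47, 0.47, 0.465, 0.468]:
    lr, best = update_lr(lr, best, val_acc)

# Three non-improving epochs halve the lr three times: 1e-4 -> 1.25e-5
```

So a run where the lr ends up at 1.25e-5 simply means validation accuracy plateaued for several epochs, which matches what you're seeing.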
Any update on this? :P
I really apologize for the delay! Things are a little busy right now, so I couldn't yet find time to look into this, but I will update as soon as I resolve it!
Hi, as it's been a long time now, I was wondering if you had any update on this?
Hey, sorry, not yet :/ I know it's really been a long time; I definitely plan to look into it but haven't had the time yet. I do apologize! It may take a bit of time, but I will definitely update when I have new info! If you need it in the meantime, a good solution may be to simply pull a prior version from the time when it worked for you, e.g. https://github.com/stanfordnlp/mac-network/tree/0085972777113170563f6c247dbdc82f16277799 ?
I went over all the commits since then, and there are only very slight typo fixes in the code on the master branch, but if an earlier version worked for you then, it has to work again - both the data and the model have stayed the same since. Sorry that I don't have a better solution currently!
Hi!
I'll clone the version you just proposed and try it again. I'll let you know in a couple of minutes! The reason I ask is that we're creating a new dataset and would like to use your model, together with other models, as a baseline, so I just want to make sure that the implementation is correct.
Alright, sounds good! Let me know how it goes! :) I don't see a reason why it would stop working; maybe something got messed up in the feature files for CLEVR? Also, make sure you train over the full dataset (there's a flag for that).
I haven't used CLEVR for a while, but if you're interested in using the model for another dataset anyway, then there's even less of a problem - I'm currently using this model for other datasets such as VQA/GQA, and it works fine. In particular, there's the GQA branch with a more up-to-date (though a bit less clean) version of the code, which I use for these datasets.
Looking forward to the release of the new dataset!
I will :) Yeah, I'm using the command you provided earlier, so I suppose that should work, right? And the remark about the feature files might actually be a very good one. I'll let you know soon. I'm almost at the end of epoch 1 and I have 43% accuracy.
How much time does each epoch take you? Hmm, maybe I know the source of your error - can you run a new experiment with the following flags: --generatedPrefix "newfeatures" --expName "newexp"? It will make sure you generate new feature files for the questions and don't work with temporary files that were generated for previous experiments.
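Combining those two flags with the training command from earlier in the thread, the full invocation would look roughly like this (a sketch only - adjust the config file and flags to whichever setup you are currently testing):

```shell
python main.py --expName "newexp" --generatedPrefix "newfeatures" \
    --train --testedNum 10000 --epochs 25 --netLength 4 @configs/args1.txt
```

The key point is that the fresh --generatedPrefix and --expName values prevent the run from reusing question files cached by earlier experiments.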
The first epoch took 2347.42 seconds. I will run a new experiment with the proposed flags.
I'm at epoch 5 and these are my results:
eb 5,10945 (699989 / 699989), t = 0.17 (0.00+0.17), lr 0.0001, l = 0.6771, a = 0.6250, avL = 0.9714, avA = 0.4727, g = 1.2093, emL = 0.9602, emA = 0.4678; newexp
So the average accuracy is still only 47%. I'm going to try to remake the features and report back.
Alright, let me know how it goes - if the run of the validation feature extraction was incomplete, the model may then get zeros instead of real features for some or all val images, so it's worth trying to make new ones.
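A quick way to check for that failure mode is to count feature rows that are entirely zero. This is a hypothetical diagnostic sketch, not part of the repo: the helper name is made up, and in practice you would first load the extracted val features (e.g. with h5py) and pass them in as `rows`.

```python
# Hypothetical diagnostic: count feature rows that are all zeros,
# which would indicate an incomplete feature-extraction run.
# In practice, load the extracted val features (e.g. via h5py)
# and pass the feature matrix in as `rows`.
def count_zero_rows(rows):
    """Return how many rows consist solely of zeros."""
    return sum(1 for row in rows if all(v == 0 for v in row))

# Toy example: two of these four "feature vectors" are all zeros.
features = [
    [0.0, 0.0, 0.0],
    [0.1, 0.0, 0.3],
    [0.0, 0.0, 0.0],
    [0.2, 0.5, 0.0],
]
print(count_zero_rows(features))  # -> 2; any nonzero count is suspicious
```

If a freshly extracted feature file reports zero all-zero rows while the old one does not, the old extraction run was the culprit.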
Hi! I got very good news.
took 1722.15 seconds
Training Loss: 0.25686724214361023, Training accuracy: 0.8955540729925756
Training EMA Loss: 0.15401203917713385, Training EMA accuracy: 0.9398
Validation Loss: 0.17199736024297427, Validation accuracy: 0.9282
Training epoch 5...
eb 5,3000 (191928 / 699989), t = 0.22 (0.00+0.18), lr 0.0001, l = 0.0526, a = 0.9844, avL = 0.2241, avA = 0.9094, g = 3.2902, emL = 0.2145, emA = 0.9161; newexpp
saving weights
eb 5,6000 (383709 / 699989), t = 0.22 (0.00+0.19), lr 0.0001, l = 0.4795, a = 0.7969, avL = 0.2223, avA = 0.9101, g = 4.0710, emL = 0.2386, emA = 0.9063; newexpp
saving weights
eb 5,9000 (575549 / 699989), t = 0.19 (0.00+0.16), lr 0.0001, l = 0.2768, a = 0.9375, avL = 0.2216, avA = 0.9104, g = 5.6670, emL = 0.2320, emA = 0.9051; newexpp
saving weights
eb 5,10945 (699989 / 699989), t = 0.14 (0.00+0.14), lr 0.0001, l = 0.1774, a = 0.9531, avL = 0.2199, avA = 0.9111, g = 3.0387, emL = 0.2147, emA = 0.9142; newexpp
Restoring EMA weights
eb 5,164 (10000 / 10000), t = 0.06 (0.00+0.06), lr 0.0001, l = 0.1906, a = 0.9219, avL = 0.1316, avA = 0.9481, g = -1.0000, emL = 0.1325, emA = 0.9472; newexp
eb 5,165 (10000 / 10000), t = 0.05 (0.00+0.05), lr 0.0001, l = 0.0054, a = 1.0000, avL = 0.1587, avA = 0.9390, g = -1.0000, emL = 0.1421, emA = 0.9481; newexp
Restoring standard weights
It finally got fixed! The issue was indeed with the features themselves. I'm glad we sorted this out.
awesome!! Glad it got fixed! :)
Hi there!
First of all, thanks for publishing the code! I really enjoy this work :-). Could you maybe share the exact parameters you used to train the network that produces this visualization: https://camo.githubusercontent.com/e9e9464bfc10736d86b150ada2d8f68e74d3afae/68747470733a2f2f63732e7374616e666f72642e6564752f70656f706c652f646f72617261642f6d61632f696d67732f76697375616c2e706e67
Thanks in advance!