In this PR, I am adding a Reproducible Research section to the documentation. For now, this section only contains a note on how to reproduce the experiments described in the ViGIL paper, which can serve as a template for future notes in this section.
Main points:
Added the S-MAC model to miprometheus.models. It reuses as many units as possible from the MAC model implementation, so that it only implements the units whose equations differ. Its documentation reflects the published paper (the BibTeX entry is indicated).
Added all the grid config files to run the experiments:
The initial training on CLEVR & CoGenT-A,
The finetuning on CoGenT-B of the CLEVR- & CoGenT-A-trained models,
The finetuning on CoGenT-A of the CLEVR-trained models,
Default configuration files are available for CLEVR / CoGenT / MAC on CLEVR / MAC on CoGenT / S-MAC on CLEVR / S-MAC on CoGenT. The grid config files reuse these. The grid config files are fairly long and complex because the configuration for each experiment is different, but everything should be commented.
All test experiments are indicated in the grid config files with the multi_tests key (this became possible after the merge of #98 and #100). Adding support for multi-tests in the Tester took a while to implement, but it is very useful here: we can run "cross-tests" (e.g. train on CoGenT-A and test on CoGenT-B) with the same command as "regular tests" (e.g. train on CoGenT-A and test on CoGenT-A).
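To illustrate the idea, here is a minimal sketch of how a multi_tests entry could drive both a regular test and a cross-test from one trained model. The key names and nesting are illustrative assumptions, not the actual miprometheus config schema:

```python
# Hypothetical grid-config fragment: one training run, several test sets.
# All key names below are illustrative only.
grid_config = {
    "training": {"problem": {"name": "CLEVR", "set": "CoGenT-A"}},
    "multi_tests": [
        {"problem": {"name": "CLEVR", "set": "CoGenT-A"}},  # regular test
        {"problem": {"name": "CLEVR", "set": "CoGenT-B"}},  # cross-test
    ],
}

def run_tests(config):
    """Mimics a tester looping over every entry of `multi_tests`."""
    tested_sets = []
    for test in config["multi_tests"]:
        # Stand-in for an actual evaluation pass on that test set.
        tested_sets.append(test["problem"]["set"])
    return tested_sets

print(run_tests(grid_config))  # one pass per configured test set
```

The point is that the same tester command covers both evaluations; only the list of test sets differs between a "regular" and a "cross" setup.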
To properly test a CLEVR-trained model on CoGenT samples, we need to make sure that the {'words': index} & {'answer': index} dicts are the same for both CoGenT & CLEVR, and also that the random embedding weights are identical. I formalized this as an additional parameter for the CLEVR class (embedding_source; the doc has been updated, and the CLEVR class throws a warning).
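A minimal, stdlib-only sketch of why sharing matters: the word/answer dicts and the randomly initialised embedding weights are saved at train time and reloaded at test time instead of being rebuilt. The file format and function names are my assumptions for illustration, not the actual embedding_source implementation:

```python
import json
import os
import random
import tempfile

def build_embeddings(vocab, dim=4, seed=None):
    """Randomly initialised word embeddings; same seed -> same weights."""
    rng = random.Random(seed)
    return {word: [rng.uniform(-1, 1) for _ in range(dim)] for word in vocab}

# Train time: CLEVR builds its {'word': index} dict and random embeddings,
# then saves both so a later CoGenT test run can reuse them.
clevr_vocab = {"cube": 0, "sphere": 1, "red": 2}
clevr_emb = build_embeddings(clevr_vocab, seed=42)

with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "clevr_embeddings.json")
    with open(source, "w") as f:
        json.dump({"vocab": clevr_vocab, "weights": clevr_emb}, f)

    # Test time: instead of rebuilding its own dicts (which would shuffle
    # indices and reinitialise weights), the CoGenT run loads the CLEVR
    # ones -- this is the role the `embedding_source` parameter plays.
    with open(source) as f:
        shared = json.load(f)

    assert shared["vocab"] == clevr_vocab      # identical word -> index map
    assert shared["weights"] == clevr_emb      # identical embedding weights
```

Without this, the same word could map to a different index (and a different random vector) in the CoGenT run, silently invalidating the cross-test.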
Overall, the pipeline consists of running mip-grid-trainer-gpu, mip-grid-tester-gpu and mip-grid-analyzer 3 times.
This documentation section should be self-contained, in that all config files (and indices files for the Sampler) are linked (I provide ours for transparency). All commands are indicated.
Note:
I am not using Exponential Moving Average, as it is not implemented yet (it was done on the internal repo). I am also not sure it makes a major difference in performance. We may want to implement it in the future.
I have tested the overall pipeline and it should be working. I cannot 100% guarantee that I squashed every possible bug here, but most should be fixed :slightly_smiling_face:
:warning: A few things do need to be fixed to ensure that this can be reproduced anywhere:
Ensure that when indicating default_configs, the path to these files can be properly constructed from the path of the initial config file specified with --c. I gave this a quick shot, and it may not be as straightforward as initially thought: we need to handle the case where several config files are specified (separated by commas), and also the case in the GridWorkers where Named Temporary Files are created and then superseded by the config params. Cf #16,
Automate the downloads of the dataset files (cf #101 ),
Resolve #104, i.e. ensure that all loggers (Problem, Model, SamplerFactory...) can properly log to both the console and the log file (although this is not critical),
Ensure that when starting a grid experiment, each individual experiment is assigned to its own GPU (see #37). I have set the sleep time to 60s in this PR.
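For the first point above (resolving default_configs relative to the config file passed with --c), the intended behaviour could be sketched as follows. The function name and exact semantics are my assumptions, not the actual miprometheus code, and this sketch ignores the NamedTemporaryFile complication in the GridWorkers:

```python
import os

def resolve_default_configs(cli_config_arg, default_configs):
    """Resolve `default_configs` entries relative to the directory of the
    first config file given with --c (which may be a comma-separated list).

    Illustrative only: the real fix must also survive the step where
    GridWorkers write Named Temporary Files that supersede the config params.
    """
    # Several config files may be passed, separated by commas; anchor
    # relative paths on the first one.
    first_config = cli_config_arg.split(",")[0]
    base_dir = os.path.dirname(os.path.abspath(first_config))
    resolved = []
    for path in default_configs.split(","):
        if os.path.isabs(path):
            resolved.append(path)  # absolute paths are kept as-is
        else:
            resolved.append(os.path.normpath(os.path.join(base_dir, path)))
    return resolved

# Hypothetical file names, for illustration only:
paths = resolve_default_configs(
    "/experiments/configs/smac_cogent_grid.yaml",
    "../defaults/clevr_cogent.yaml,../defaults/smac.yaml",
)
# -> both entries now rooted next to the grid config's parent directory
```

This way a grid config can refer to the shared default configs with relative paths and still be launched from any working directory.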