probcomp / GenExperimental.jl

Featherweight embedded probabilistic programming language and compositional inference programming library
MIT License

Scalable deep amortized inference on GPU #42

Open marcoct opened 7 years ago

marcoct commented 7 years ago

The current solution makes it possible to do amortized inference using neural networks, but some performance optimization and scaling will probably be necessary to train deep networks. Currently a Gen.jl AD system is used for neural network backprop, and basic parallelization across CPUs is done for model simulation and neural network backprop (i.e. each CPU runs one model simulation followed by one backprop pass in the neural network at a time). There are probably plenty of opportunities for performance optimization.
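The per-CPU simulate-then-backprop pattern described above can be illustrated with a minimal data-parallel sketch. This is not Gen.jl code (Gen.jl distributes across Julia processes); it is a NumPy stand-in where a linear model's gradient takes the place of the neural network's backprop pass, and `simulate`, `worker_step`, and the worker/chunk sizes are all hypothetical names chosen for the example:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

N_FEATURES = 20
W_TRUE = np.random.default_rng(1).normal(size=N_FEATURES)

def simulate(n):
    """Stand-in for forward-simulating the generative model: returns
    (features, target) pairs. A per-call RNG avoids sharing generator
    state across worker threads."""
    rng = np.random.default_rng()
    x = rng.normal(size=(n, N_FEATURES))
    return x, x @ W_TRUE

def worker_step(w, chunk_size):
    """One worker: simulate a chunk of data, then run backprop on it
    (here the mean-squared-error gradient of a linear model stands in
    for the neural network's backward pass)."""
    x, y = simulate(chunk_size)
    err = x @ w - y
    return x.T @ err / chunk_size

w = np.zeros(N_FEATURES)
n_workers, chunk = 4, 8  # 4 workers x 8 samples = minibatch of 32
with ThreadPoolExecutor(max_workers=n_workers) as pool:
    for step in range(200):
        # Each worker simulates and backprops independently;
        # the resulting gradients are averaged into one SGD update.
        grads = list(pool.map(lambda _: worker_step(w, chunk), range(n_workers)))
        w -= 0.05 * np.mean(grads, axis=0)
```

The key design point is that simulation and backprop are embarrassingly parallel per minibatch chunk; only the gradient average synchronizes the workers.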

More generally, this may involve a combination of profiling, integrating with existing deep learning systems, performance engineering, and research into programming and training workflows for amortized inference.

Currently, training a single-layer neural network with 50 hidden units and 20 input features to predict the waypoint in the goal inference task, with 32 cores used to distribute minibatches of size 32, takes a few hours (requiring 10,000-20,000 iterations of ADAM SGD) to produce noticeably useful inferences.
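For reference, the training loop described above (single hidden layer, 50 hidden units, 20 input features, minibatches of 32, ADAM) can be sketched as follows. This is an illustrative NumPy version, not the actual Gen.jl implementation: the simulator, targets, and hyperparameters are placeholder assumptions, and the real task trains for 10,000-20,000 iterations rather than the few hundred used here:

```python
import numpy as np

rng = np.random.default_rng(0)
N_FEATURES, N_HIDDEN, N_OUT = 20, 50, 2
W_TRUE = rng.normal(size=(N_FEATURES, N_OUT))

def simulate_minibatch(batch_size):
    """Stand-in for forward-simulating the model to get
    (features, waypoint) training pairs."""
    x = rng.normal(size=(batch_size, N_FEATURES))
    return x, np.tanh(x @ W_TRUE)

# Single-hidden-layer network parameters
params = {
    "W1": rng.normal(scale=0.1, size=(N_FEATURES, N_HIDDEN)),
    "b1": np.zeros(N_HIDDEN),
    "W2": rng.normal(scale=0.1, size=(N_HIDDEN, N_OUT)),
    "b2": np.zeros(N_OUT),
}

def loss_and_grads(params, x, y):
    """Mean-squared-error loss and its gradients via manual backprop."""
    h = np.tanh(x @ params["W1"] + params["b1"])
    y_hat = h @ params["W2"] + params["b2"]
    err = y_hat - y
    loss = np.mean(err ** 2)
    d_yhat = 2 * err / err.size
    d_h = d_yhat @ params["W2"].T
    d_pre = d_h * (1 - h ** 2)  # tanh derivative
    grads = {
        "W2": h.T @ d_yhat, "b2": d_yhat.sum(axis=0),
        "W1": x.T @ d_pre,  "b1": d_pre.sum(axis=0),
    }
    return loss, grads

# ADAM optimizer state and hyperparameters
m = {k: np.zeros_like(p) for k, p in params.items()}
v = {k: np.zeros_like(p) for k, p in params.items()}
beta1, beta2, lr, eps = 0.9, 0.999, 1e-2, 1e-8

losses = []
for t in range(1, 501):
    x, y = simulate_minibatch(32)
    loss, grads = loss_and_grads(params, x, y)
    losses.append(loss)
    for k in params:
        # Standard ADAM update with bias correction
        m[k] = beta1 * m[k] + (1 - beta1) * grads[k]
        v[k] = beta2 * v[k] + (1 - beta2) * grads[k] ** 2
        m_hat = m[k] / (1 - beta1 ** t)
        v_hat = v[k] / (1 - beta2 ** t)
        params[k] -= lr * m_hat / (np.sqrt(v_hat) + eps)
```

In the distributed setting from the issue, each of the 32 cores would handle part of the minibatch simulation and backprop before the ADAM update is applied centrally.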

marcoct commented 7 years ago

Relevant: https://github.com/JuliaGPU

marcoct commented 7 years ago

Relevant: https://github.com/malmaud/TensorFlow.jl