Unify observation normalization code

Is it correct that the MPI-parallelized algorithms (e.g. DDPG, TRPO_MPI) are the ones that call update inside the algorithm itself, while the subprocess-parallelized algorithms use VecNormalize? (I guess PPO2's normalization was problematic because it supports both types of parallelization, see #695.) Which of the two is the preferred way to unify on?
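To make the two styles concrete, here is a minimal sketch of the running mean/variance statistic that both approaches rely on, written in plain Python for a scalar observation stream. It is not the baselines implementation: the class name mirrors the idea, but `normalize`, its `clip` default, and the two helper functions `wrapper_style_step` and `algorithm_style_rollout` are hypothetical names made up for illustration. The batch merge uses the standard parallel-variance formula.

```python
import math


class RunningMeanStd:
    """Running mean and variance of a scalar stream, updated batch by batch."""

    def __init__(self, epsilon=1e-4):
        self.mean = 0.0
        self.var = 1.0
        self.count = epsilon  # small prior count avoids division by zero

    def update(self, batch):
        n = len(batch)
        if n == 0:
            return
        batch_mean = sum(batch) / n
        batch_var = sum((x - batch_mean) ** 2 for x in batch) / n
        # Merge the batch moments into the running moments.
        delta = batch_mean - self.mean
        total = self.count + n
        m2 = (self.var * self.count + batch_var * n
              + delta ** 2 * self.count * n / total)
        self.mean += delta * n / total
        self.var = m2 / total
        self.count = total

    def normalize(self, x, clip=5.0):
        z = (x - self.mean) / math.sqrt(self.var + 1e-8)
        return max(-clip, min(clip, z))


def wrapper_style_step(rms, raw_obs_batch):
    """Wrapper style (assumed): the environment wrapper updates and
    normalizes on every step, so the algorithm only sees normalized data."""
    rms.update(raw_obs_batch)
    return [rms.normalize(o) for o in raw_obs_batch]


def algorithm_style_rollout(rms, raw_obs_batch):
    """In-algorithm style (assumed): the algorithm normalizes with the
    current statistics and calls update itself once the batch is collected."""
    normalized = [rms.normalize(o) for o in raw_obs_batch]
    rms.update(raw_obs_batch)
    return normalized
```

The practical difference is who owns the statistics and when they change. With the wrapper the update happens per step inside the environment stack, while in the second style the algorithm decides when to update, which is also the natural place to combine statistics across MPI workers.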

Possibly related: Are there plans to support MPI parallelization for the other algorithms that currently only support subprocess parallelization (e.g. ACKTR, ACER, A2C)? I was happy to see this added to PPO2, as I didn't realize an algorithm could support both. MPI is crucial when the cost of step varies a lot between environments, for example with variable-length episodic environments where reset is computationally expensive (in that case every subprocess ends up waiting for the one that just got reset, which is a killer).
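A toy cost model of the waiting problem, as a rough sketch: `costs[w][t]` is a made-up cost of step `t` for worker `w`, and both function names are hypothetical. It idealizes the MPI case by assuming workers run independently between synchronization points (in practice they still sync when gradients are combined).

```python
def lockstep_time(costs):
    """Subprocess-style stepping: every step takes as long as the slowest worker."""
    return sum(max(step_costs) for step_costs in zip(*costs))


def independent_time(costs):
    """Independent workers: only the slowest worker's total matters."""
    return max(sum(worker_costs) for worker_costs in costs)


# Three workers, four steps each; a cost of 10 stands for an expensive reset.
costs = [
    [1, 1, 10, 1],
    [1, 10, 1, 1],
    [10, 1, 1, 1],
]
print(lockstep_time(costs))     # 31
print(independent_time(costs))  # 13
```

When the resets land on different steps for different workers, lockstep stepping pays for each of them in turn, while independent workers overlap them.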

openai / baselines

Unify observation normalization code #698