Introduce RAM-efficient seq buffer

This PR introduces a RAM-efficient sequence buffer in buffers/seq_replay_buffer_efficient.py. It stores each observation only once, whereas the vanilla sequence buffer in buffers/seq_replay_buffer_vanilla.py stores each one twice. When observations dominate the buffer's memory, this reduces RAM usage by roughly 2x, which is especially useful for large observation spaces such as pixel inputs.
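To illustrate the idea, here is a minimal sketch of the two storage layouts. It is not the code in this PR: the class names, method names, and list-based storage are all hypothetical, and it assumes the "twice" layout keeps separate observation and next-observation entries.

```python
# Hypothetical sketch of the storage idea; not the implementation in
# buffers/seq_replay_buffer_efficient.py.


class TwiceStoredBuffer:
    """Keeps o_t and o_{t+1} as separate entries for every transition."""

    def __init__(self):
        self.obs, self.next_obs, self.actions = [], [], []

    def add_episode(self, observations, actions):
        # An episode with T actions has T + 1 observations: o_0 ... o_T.
        assert len(observations) == len(actions) + 1
        self.obs.extend(observations[:-1])
        self.next_obs.extend(observations[1:])
        self.actions.extend(actions)

    def transition(self, i):
        return self.obs[i], self.actions[i], self.next_obs[i]

    def stored_observations(self):
        return len(self.obs) + len(self.next_obs)


class OnceStoredBuffer:
    """Keeps each observation once and finds o_{t+1} by index offset."""

    def __init__(self):
        self.obs = []      # every observation, including each episode's last
        self.actions = []
        self.starts = []   # for transition i, the index of o_t in self.obs

    def add_episode(self, observations, actions):
        assert len(observations) == len(actions) + 1
        base = len(self.obs)
        self.obs.extend(observations)
        self.actions.extend(actions)
        self.starts.extend(base + t for t in range(len(actions)))

    def transition(self, i):
        j = self.starts[i]
        # The next observation is the neighbouring entry in the same episode.
        return self.obs[j], self.actions[i], self.obs[j + 1]

    def stored_observations(self):
        return len(self.obs)
```

In this sketch an episode with T actions costs 2T stored observations in the first layout and T + 1 in the second, so the saving approaches 2x as episodes get longer.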

I tested the correctness of this buffer in its __main__ block. To double-check, run

python buffers/seq_replay_buffer_efficient.py

To use it, set train: buffer_type: "seq_efficient" in the config file. The default buffer type is still the vanilla one, for consistency with the paper's results. However, we recommend using the efficient one, especially for Atari games.
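As a rough illustration of how the nested setting reads once a config file is loaded, the sketch below looks up buffer_type under train and falls back to a default. The helper name and the default value "seq_vanilla" are assumptions made for this example, not names confirmed by this PR.

```python
# Hypothetical helper; only the keys "train" and "buffer_type" and the value
# "seq_efficient" come from the PR text.
DEFAULT_BUFFER_TYPE = "seq_vanilla"  # assumed name for the vanilla default


def get_buffer_type(config):
    """Return config["train"]["buffer_type"], or the default if it is unset."""
    return config.get("train", {}).get("buffer_type", DEFAULT_BUFFER_TYPE)


# Dictionary form of a config file that sets train: buffer_type: "seq_efficient"
config = {"train": {"buffer_type": "seq_efficient"}}
print(get_buffer_type(config))         # the efficient buffer is selected
print(get_buffer_type({"train": {}}))  # nothing set, so the default applies
```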
