MushroomRL / mushroom-rl

Python library for Reinforcement Learning.
MIT License

compress frames #117

Open davidenitti opened 1 year ago

davidenitti commented 1 year ago

Is your feature request related to a problem? Please describe.

I would like to reduce the memory used by RL experiments on Atari, so that I can run many experiments at the same time.

Describe the solution you'd like

Compress the frames; for example, RLlib uses LZ4 compression. This is different from LazyFrames.

If I want to implement this myself, where should I make the change?
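
As a rough illustration of why lossless compression helps here, the sketch below compresses one synthetic 84x84 grayscale frame. It uses `zlib` from the Python standard library purely as a stand-in for LZ4 (an assumption, to keep the example dependency-free), and the frame content is made up, so the resulting ratio is not representative of real Atari frames.

```python
import zlib

WIDTH, HEIGHT = 84, 84  # assumed size of a preprocessed Atari frame

# Synthetic frame: black background with one bright 10x10 block.
frame = bytearray(WIDTH * HEIGHT)
for row in range(20, 30):
    for col in range(40, 50):
        frame[row * WIDTH + col] = 255

raw = bytes(frame)
compressed = zlib.compress(raw)         # stand-in for an LZ4 codec
restored = zlib.decompress(compressed)  # lossless: identical to raw

print(f"raw: {len(raw)} bytes, compressed: {len(compressed)} bytes")
```

The saving comes at the cost of a decompression step every time a frame is read, so it trades CPU time for memory.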

boris-il-forte commented 1 year ago

Basically, you have to change the Atari class. In particular these two lines:

https://github.com/MushroomRL/mushroom-rl/blob/035951210f63b0ccd0d5fe63b4bc264aca2b51e5/mushroom_rl/environments/atari.py#L111

https://github.com/MushroomRL/mushroom-rl/blob/035951210f63b0ccd0d5fe63b4bc264aca2b51e5/mushroom_rl/environments/atari.py#L136

Then substitute LazyFrames with your compressed-frame class. This should be enough. Of course, you need to implement a few important methods:

1) a way to feed the images to the NN in the appropriate format
2) a copy method (needed by the mushroom core)
3) a shape method

In short, you need to write an alternative class to LazyFrames:

https://github.com/MushroomRL/mushroom-rl/blob/035951210f63b0ccd0d5fe63b4bc264aca2b51e5/mushroom_rl/utils/frames.py#L7-L36
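
The three methods listed above can be sketched as follows. This is only a minimal outline under stated assumptions: the class name `CompressedFrames` is hypothetical, `zlib` from the standard library stands in for whatever compressor you choose, and `__array__` is assumed to be the hook used to hand the frames to the network, as it is in LazyFrames.

```python
import zlib

import numpy as np


class CompressedFrames:
    """Hypothetical alternative to LazyFrames that stores compressed frames."""

    def __init__(self, frames, history_length):
        assert len(frames) == history_length
        # Remember shape and dtype, which the raw bytes do not carry.
        self._frame_shape = frames[0].shape
        self._dtype = frames[0].dtype
        self._data = [zlib.compress(np.ascontiguousarray(f).tobytes())
                      for f in frames]

    def __array__(self, dtype=None, copy=None):
        # 1) Invoked by np.array(...) / np.asarray(...) on this object.
        #    `copy` is accepted for NumPy 2 compatibility and ignored,
        #    since a fresh array is always built.
        frames = [np.frombuffer(zlib.decompress(d), dtype=self._dtype)
                  .reshape(self._frame_shape) for d in self._data]
        out = np.array(frames)
        return out if dtype is None else out.astype(dtype)

    def copy(self):
        # 2) The stored bytes are never mutated, so sharing them is safe.
        return self

    @property
    def shape(self):
        # 3) Shape of the stacked observation, without decompressing.
        return (len(self._data),) + self._frame_shape


# Usage: four constant 84x84 frames stacked into one observation.
frames = [np.full((84, 84), i, dtype=np.uint8) for i in range(4)]
obs = CompressedFrames(frames, history_length=4)
stacked = np.array(obs)
```

Storing the frame shape and dtype separately means `shape` can be answered without decompressing anything.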

davidenitti commented 1 year ago

I did it. In case you are interested, here is my modified LazyFrames:

import numpy as np
import cv2
import blosc2
cv2.ocl.setUseOpenCL(False)

class LazyFrames(object):
    """
    From OpenAI Baseline.
    https://github.com/openai/baselines/blob/master/baselines/common/atari_wrappers.py

    This class provides a solution to optimize the use of memory when
    concatenating different frames, e.g. Atari frames in DQN. The frames are
    individually stored in a list and, when numpy arrays containing them are
    created, the reference to each frame is used instead of a copy.

    """
    def __init__(self, frames, history_length, compress=True):
        assert len(frames) == history_length
        # Work on a copy so the caller's list is not modified in place.
        self._frames = list(frames)
        self._compress = compress
        if self._compress:
            for i, frame in enumerate(self._frames):
                if isinstance(frame, np.ndarray):
                    # Keep shape and dtype: the compressed bytes do not carry them.
                    self._frames[i] = (blosc2.compress(frame), frame.shape, frame.dtype)

    def __array__(self, dtype=None):
        if isinstance(self._frames[0], tuple):
            assert self._compress
            # Each compressed entry is (compressed_bytes, shape, dtype).
            frames = []
            for fi in self._frames:
                assert len(fi) == 3
                data, frame_shape, frame_dtype = fi
                frame = np.frombuffer(blosc2.decompress(data), dtype=frame_dtype)
                frames.append(frame.reshape(frame_shape))
        else:
            frames = self._frames
        out = np.array(frames)
        if dtype is not None:
            out = out.astype(dtype)

        return out

    def copy(self):
        return self

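    @property
    def shape(self):
        first = self._frames[0]
        # Compressed entries are (compressed_bytes, shape, dtype) tuples.
        frame_shape = first[1] if isinstance(first, tuple) else first.shape
        return (len(self._frames),) + frame_shape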