@francisco64 Hello, thanks for the issue. Currently, there is no way to deal with huge datasets. I'm focusing on benchmarking algorithms for the v1.0.0 release, but support for huge datasets will be the first issue I tackle after v1.0.0 is out. I hope you stay tuned until then.
Hello @takuseno, first of all, this offline reinforcement learning library is a great piece of work. We have run some tests in a robotics project we are developing and have obtained good results.
However, we face the same problem as @francisco64: when we want to expand the size of the dataset, we run out of memory.
In our case we cannot build a simulated environment of our robotic system because of the complexity of synchronizing the observations and actions.
We have a couple of questions. Does the d3rlpy library plan to incorporate partial/incremental batch learning, i.e. accumulating knowledge in the same model across several datasets that collect similar cases?
On the other hand, is there any way with the current code to carry out this form of training, even if it involves making a small modification to the library?
Thank you in advance.
Hi @Consultaing.
Please let me confirm your questions.
For the first question, are you trying to mix different datasets to make up a single large dataset?
For the second question, are you trying to implement your own training loop? (Sorry, I didn't get the meaning of "possibility with the current code of being able to carry out this form of training".)
Thank you for using my library by the way 😄
Sorry if I don't explain it well.
Let me present our use case. We cut meat robotically and use a routine to teach a few basic cuts. With this routine we collect 500 cutting episodes with image observations and their respective actions, and afterwards we assign the task reward at the end of each cut. This dataset barely fits in memory, and we trained a model that begins to show signs of good behavior. If we now collect another 500 episodes, we cannot work with a 1000-episode dataset because memory overflows. Our goal is to continue training the model with these 500 new episodes, but if we load the model into memory and retrain it on the new data alone, it forgets the previous information.
We are looking for an option similar to scikit-learn's MLP partial_fit(); see below:
https://scikit-learn.org/0.15/modules/scaling_strategies.html
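For reference, this is roughly the incremental pattern we mean (a minimal scikit-learn sketch, not d3rlpy code; `load_chunk()` is a hypothetical helper for loading one chunk of data at a time):

```python
from sklearn.neural_network import MLPRegressor

model = MLPRegressor(hidden_layer_sizes=(64, 64))

for chunk_id in range(10):
    # load_chunk() is hypothetical: it returns one (X, y) chunk that fits in memory
    X_chunk, y_chunk = load_chunk(chunk_id)
    # partial_fit updates the existing weights instead of refitting from scratch
    model.partial_fit(X_chunk, y_chunk)
```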
My second question is whether there is currently anything in the d3rlpy library that can be used like scikit-learn's partial_fit().
If you need any more information, tell me.
Thank you.
Thank you for the explanation. Now I just wonder what the size of the dataset is. In the Atari 2600 case, d3rlpy only needs 8GiB of memory for 1M 84x84 gray-scale images. In robotics tasks, I cannot imagine collecting more data than that because of the hardware limitations.
Our case is more complex. First, we use a (120, 68) image with four channels: a grayscale image of the robotic end effector, a depth image, and, in the other two channels, binarized information from various sensors.
Each episode has about 50 steps on average.
In total: (n, 50, 120, 68, 4).
We have run tests vectorizing this information into a size of (n, 325) and were able to solve the memory problem, but we were not able to transfer all the information to the model (worse performance: by reducing features, the information we need when we explore other situations is not carried over into the vector, which is why we want to keep using images).
If you store all the data in uint8, the memory issue won't be a problem:
50 x 120 x 68 x 4 x 500 / 2^30 ≈ 0.76 GiB
The caveat is that d3rlpy does not currently support tuple observations, where each observation could consist of an image part and a vector part. Is that a problem in your work?
I guess you're aware of this, but the observations should have a channel-first shape like (n * 50, 4, 120, 68) when you create the MDPDataset.
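For illustration, something like this (a minimal sketch following the d3rlpy 1.x MDPDataset constructor; the action size and the zero-filled placeholder arrays are just assumptions for the example):

```python
import numpy as np
import d3rlpy

n_transitions = 500 * 50  # 500 episodes x ~50 steps, as in the thread

# uint8, channel-first observations: ~0.76 GiB instead of several GiB as float32
observations = np.zeros((n_transitions, 4, 120, 68), dtype=np.uint8)
actions = np.zeros((n_transitions, 6), dtype=np.float32)  # hypothetical action size
rewards = np.zeros(n_transitions, dtype=np.float32)
terminals = np.zeros(n_transitions, dtype=np.float32)

# d3rlpy 1.x-style constructor; newer releases use a different dataset API.
dataset = d3rlpy.dataset.MDPDataset(
    observations=observations,
    actions=actions,
    rewards=rewards,
    terminals=terminals,
)
```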
Well, in addition to the fact that there are several groups of 500 episodes, our machine has some resource limits per project.
Thank you very much for the explanations; we will try to reduce the observation data. In any case, we will closely follow the development of d3rlpy, and a partial-training option would be highly desirable.
As I said, great job on this library!
Hi @Consultaing, just use the fitter interface for training instead of fit.
You can create as many fitters as you want. Each fitter can have its own dataset input, so not only can you train on one dataset after another following an initial training run, you can also train them in parallel.
The fitter interface is just an iterator.
See this [notebook](https://gist.github.com/jamartinh/f4ca4fa53e8bd256f91012aa7bedf625).
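Roughly, the pattern looks like this (a sketch from memory of the d3rlpy 1.x API; the exact fitter() signature may differ between versions and the dataset file names are hypothetical, so treat the gist above as the authoritative version):

```python
import d3rlpy

algo = d3rlpy.algos.CQL()  # any algorithm works the same way

# Train incrementally: one fitter per dataset chunk, all updating the same model.
for path in ["cuts_batch_0.h5", "cuts_batch_1.h5"]:   # hypothetical files
    dataset = d3rlpy.dataset.MDPDataset.load(path)    # d3rlpy 1.x-style loader
    fitter = algo.fitter(dataset, n_epochs=10)        # iterator yielding (epoch, metrics)
    for epoch, metrics in fitter:
        print(path, epoch, metrics)
```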
Hi @jamartinh
Thanks for the advice.
In the end we changed our input data: we used the latent layers of autoencoders for the image channels and concatenated them into a single vector together with our sensor observations.
@takuseno I wonder if this issue has been resolved. I am also running into a problem where the dataset does not fit into CPU memory. Is there a way to stream it from disk? Btw, thank you for a nice offline RL package!
I didn't really test this, but you should be able to use numpy.memmap to deal with datasets larger than RAM.
https://numpy.org/doc/stable/reference/generated/numpy.memmap.html
I believe this is the easiest way to achieve what you want to do.
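Something along these lines might work (a minimal sketch; the file names, shapes, and the assumption that the memmapped array can be handed to the dataset constructor like a regular ndarray are not verified in this thread):

```python
import numpy as np

# Observations previously dumped to disk in uint8, channel-first layout.
n_transitions = 500 * 50
observations = np.memmap(
    "observations.dat",                 # hypothetical file written by the collector
    dtype=np.uint8,
    mode="r",                           # read-only view; nothing is loaded up front
    shape=(n_transitions, 4, 120, 68),
)

# The smaller arrays still fit comfortably in RAM.
actions = np.load("actions.npy")        # hypothetical files
rewards = np.load("rewards.npy")
terminals = np.load("terminals.npy")

# The memmap behaves like an ndarray, so it can be passed to the dataset
# constructor; pages are pulled from disk only when slices are accessed.
```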
Thanks a lot @takuseno , I will test it out.
One question: the algorithms query the dataset at randomized indices, right? I'm asking because random access from disk is quite slow. However, accessing a chunk of contiguous data (i.e., x[i:i+chunk_size]) at once and then iterating through it should work well.
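For example, something like this chunked-read pattern (illustrative only, with a made-up file name and shape):

```python
import numpy as np

# Read one contiguous slice of the memmapped array into RAM per step,
# so the disk sees sequential reads instead of random access.
x = np.memmap("observations.dat", dtype=np.uint8, mode="r",
              shape=(25_000, 4, 120, 68))   # hypothetical file and shape

chunk_size = 1_000
for start in range(0, x.shape[0], chunk_size):
    chunk = np.array(x[start:start + chunk_size])  # one sequential read, now in memory
    # ... shuffle / iterate over `chunk` here ...
```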
That's right. One thing you can do is to call fit multiple times while copying sampled episodes to memory:
```python
# ReplayBuffer filled with memmap arrays
dataset = ...

for epoch in range(100):
    # randomly sample episodes
    episodes = ...

    in_memory_episodes = []
    for episode in episodes:
        in_memory_episode = d3rlpy.dataset.Episode(
            observations=episode.observations.copy(),
            actions=episode.actions.copy(),
            rewards=episode.rewards.copy(),
            terminated=bool(episode.terminated),
        )
        in_memory_episodes.append(in_memory_episode)

    in_memory_dataset = d3rlpy.dataset.create_infinite_buffer(in_memory_episodes)
    algo.fit(in_memory_dataset, n_steps=xxx, n_steps_per_epoch=yyy)
```
That's a nice solution, thanks!
For now, I believe the solution I described above is the best. Let me close this issue. Feel free to reopen this if there is any further discussion.
Dear Takuseno, thanks very much for your great work on d3rlpy. I'm currently trying to feed a 6x6 observation (state) matrix into a customized encoder factory based on an attention mechanism, but it reports an error about a wrong size.
Can you help me?
@CastleImitation Hi, thanks for the issue, but could you make a new issue and continue this discussion there, since this is an unrelated topic for this thread? Note that when you make a new issue, please share the error message and a minimal example that I can reproduce.
Noted sir!
Hi! Is there a way to train from an iterator-style dataset, or something else that does not require loading the entire dataset into memory? My dataset is too large to fit in memory.
Thanks for your help!