Open daniellawson9999 opened 1 year ago
Daniel, just wanted to say this is an excellent issue. Thank you for doing this!
Currently working on approach 2 in this issue https://github.com/ManifoldRG/NEKO/issues/15
@helenlu66 I have moved this to "in progress" to reflect my understanding of your PPP
Currently working on converting the GoToLocal expert trajectories (.pkl) to Minari in this issue.
Currently resolving dependency issues between h5py and Minari.
Currently pursuing approach 1 here: https://github.com/helenlu66/RLMinariDatasets/blob/master/babyai_bot_expert_data_generation.py. Approach 2 only turned up one dataset, and porting that dataset to Minari is blocked by an UnpicklingError.
Hi @helenlu66, were you able to solve this? I recently started looking into it.
Background
BabyAI is a "gridworld environment whose levels consist of instruction-following tasks that are described by a synthetic language". Gato generates its dataset using the built-in BabyAI bot; more details can be found in the paper.
The original repo is now maintained under Farama, as is MiniGrid. The 2023 update to the BabyAI repo discusses this change and also says:
"This repository still contains scripts which, if adapted to the Minigrid library, could be used to:
More info on Minigrid can be found here: https://minigrid.farama.org/. Both the original BabyAI environments and the MiniGrid environments are provided there.
Tasks
As in issue https://github.com/ManifoldRG/NEKO/issues/13, requirement (1) is that the environments meet the Gymnasium API. This is already accomplished, as the Minigrid repo follows the new API.
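For anyone unsure what "meets the Gymnasium API" means in practice, here is a minimal sketch of the shape check: `reset()` returns `(obs, info)` and `step()` returns `(obs, reward, terminated, truncated, info)`. `StandInEnv` and `follows_new_api` are hypothetical names made up for illustration, not part of Minigrid or Gymnasium; with the real library installed, you would run the check against an actual Minigrid env instead.

```python
from typing import Any, Optional


class StandInEnv:
    """Hypothetical stand-in for a Minigrid/BabyAI env (NOT the real class)."""

    def __init__(self, horizon: int = 3) -> None:
        self.horizon = horizon
        self.t = 0

    def reset(self, *, seed: Optional[int] = None, options: Optional[dict] = None):
        # New-style reset: returns (observation, info).
        self.t = 0
        return {"mission": "go to the red ball"}, {}

    def step(self, action: Any):
        # New-style step: returns a 5-tuple with separate
        # terminated / truncated flags instead of a single "done".
        self.t += 1
        terminated = False
        truncated = self.t >= self.horizon
        return {"mission": "go to the red ball"}, 0.0, terminated, truncated, {}


def follows_new_api(env: Any) -> bool:
    """Return True if reset() and step() return Gymnasium-shaped tuples."""
    reset_out = env.reset(seed=0)
    if not (isinstance(reset_out, tuple) and len(reset_out) == 2):
        return False
    if not isinstance(reset_out[1], dict):
        return False
    step_out = env.step(0)
    if not (isinstance(step_out, tuple) and len(step_out) == 5):
        return False
    return isinstance(step_out[4], dict)


print(follows_new_api(StandInEnv()))
```

This only checks tuple shapes; Gymnasium also ships a more thorough checker (`gymnasium.utils.env_checker.check_env`) that is probably the better tool once the dependencies are installed.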
The remaining task, requirement (2), is sourcing a dataset and porting it to Minari. There are several paths to sourcing a dataset:
1) Collect the dataset manually using the BabyAI bot, which may have to be adapted to work with the new Minigrid repo: https://github.com/mila-iqia/babyai/blob/master/babyai/bot.py.
2) See if papers using Minigrid/BabyAI provide datasets. Some papers can be found here: https://minigrid.farama.org/content/publications/. In this case, a dataset just needs to be converted to Minari.
3) Collaborate with Minari on sourcing the dataset. This repo says that more datasets are coming to Minari: https://github.com/rodrigodelazcano/d4rl-minari-dataset-generation. The end of its README lists Minigrid. Potentially reach out to https://github.com/rodrigodelazcano or others at Minari; the Discord can be found here: https://farama.org/
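To make path 1 a bit more concrete, here is a rough sketch of the collection loop: roll out an expert policy and record one buffer per episode. Everything named here is an assumption for illustration. `CorridorEnv` is a toy stand-in for a Minigrid/BabyAI env, `always_right` stands in for the BabyAI bot, and the buffer keys are my own choice. The layout (one more observation than actions, separate termination and truncation flags) follows the Gymnasium step API and, to my understanding, matches how Minari stores episodes.

```python
from typing import Any, Callable, Dict, List, Optional


class CorridorEnv:
    """Toy stand-in env (hypothetical): walk right along a corridor to a goal."""

    def __init__(self, goal: int = 3, max_steps: int = 10) -> None:
        self.goal = goal
        self.max_steps = max_steps
        self.pos = 0
        self.steps = 0

    def _obs(self) -> Dict[str, Any]:
        return {"position": self.pos, "mission": "go to the goal"}

    def reset(self, *, seed: Optional[int] = None, options: Optional[dict] = None):
        self.pos = 0
        self.steps = 0
        return self._obs(), {}

    def step(self, action: int):
        # Action 1 moves right, anything else moves left (clamped at 0).
        self.pos = self.pos + 1 if action == 1 else max(0, self.pos - 1)
        self.steps += 1
        terminated = self.pos == self.goal
        truncated = (not terminated) and self.steps >= self.max_steps
        reward = 1.0 if terminated else 0.0
        return self._obs(), reward, terminated, truncated, {}


def always_right(obs: Dict[str, Any]) -> int:
    """Stand-in for the expert bot: always step toward the goal."""
    return 1


def collect_episode(
    env: Any, policy: Callable[[Dict[str, Any]], int], seed: int
) -> Dict[str, List[Any]]:
    """Roll out one episode and return per-step buffers."""
    obs, _ = env.reset(seed=seed)
    episode: Dict[str, List[Any]] = {
        "observations": [obs],  # includes the initial observation
        "actions": [],
        "rewards": [],
        "terminations": [],
        "truncations": [],
    }
    done = False
    while not done:
        action = policy(obs)
        obs, reward, terminated, truncated, _ = env.step(action)
        episode["observations"].append(obs)
        episode["actions"].append(action)
        episode["rewards"].append(reward)
        episode["terminations"].append(terminated)
        episode["truncations"].append(truncated)
        done = terminated or truncated
    return episode


episodes = [collect_episode(CorridorEnv(), always_right, seed=s) for s in range(5)]
print(len(episodes), len(episodes[0]["actions"]))
```

In the real pipeline, the env would come from Minigrid, the policy from the adapted bot, and Minari's own collection utilities would likely replace the hand-rolled buffers; this only shows the data that has to be captured.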
If interested, please add yourself to this issue, and discuss which path you are pursuing.
Output
The output should be a link to a GitHub repo that provides a process for acquiring the dataset, as in https://github.com/daniellawson9999/data-tests.