Closed Jugg1er closed 11 months ago
Hi,
Lamorel expects at least two processes: one for the LLM (the server) and one for the RL loop (the client). It uses torch.distributed to make these two processes communicate.
In a Colab notebook there is a single process running, which explains the error you get. Right now I do not see how to make it work in Colab, as lamorel is really designed for a distributed setup (even though you can run the two processes locally, as in the examples). I am sorry :/
Thank you for your reply! I'm not familiar with torch.distributed, and I will try to test it locally.
The examples in the README use our launcher to create the distributed setup from a config file, so you shouldn't need to be comfortable with torch.distributed to use lamorel :)
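For a rough idea of what the launcher consumes, the config declares how many client and server processes to spawn. The keys below are written from memory of the lamorel examples and may not match your version exactly; treat this as a hypothetical excerpt and check the repository's `examples/` configs for the authoritative names.

```yaml
# hypothetical excerpt of a lamorel config file (key names may differ)
lamorel_args:
  distributed_setup_args:
    n_rl_processes: 1    # the RL client(s)
    n_llm_processes: 1   # the LLM server(s)
```

The launcher then starts all declared processes locally and wires up torch.distributed between them for you.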
Do not hesitate if you have further questions!
I'm considering using Lamorel in my project, and I first tested it in Google Colab.
I followed the demo in README.md:
And I got this error:

```
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ in <cell line: 18>:22                                                                            │
│ in main:13                                                                                       │
│                                                                                                  │
│ /content/lamorel/lamorel/src/lamorel/caller.py:19 in __init__                                    │
│                                                                                                  │
│   16 │   If the current process belongs to the LLM's processes, it will launch the LLM and wa    │
│   17 │   '''                                                                                     │
│   18 │   def __init__(self, config, custom_updater=None, custom_module_functions={}, custom_m    │
│ ❱ 19 │   │   assert dist.is_initialized(), "torch distributed must be used!"                     │
│   20 │   │   self.config = config                                                                │
│   21 │   │   self.grad_fn_model = None                                                           │
│   22 │                                                                                           │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
AssertionError: torch distributed must be used!
```
I wonder if this is due to my environment or a config issue. Could you give me some advice?