flowersteam / lamorel

Lamorel is a Python library designed for RL practitioners eager to use Large Language Models (LLMs).

AssertionError: torch distributed must be used! #22

Closed Jugg1er closed 11 months ago

Jugg1er commented 11 months ago

I'm considering using Lamorel in my project, and I first tested it in Google Colab.

I followed the demo in README.md:

import hydra
from lamorel import Caller, lamorel_init
lamorel_init()

@hydra.main(config_path='../config', config_name='config')
def main(config_args):
    lm_server = Caller(config_args.lamorel_args)
    # Do whatever you want with your LLM
    lm_server.close()

if __name__ == '__main__':
    main()

And I got this error:

Traceback (most recent call last)
in <cell line: 18>:22
in main:13

/content/lamorel/lamorel/src/lamorel/caller.py:19 in __init__

   16 │     If the current process belongs to the LLM's processes, it will launch the LLM and wa
   17 │     '''
   18 │     def __init__(self, config, custom_updater=None, custom_module_functions={}, custom_m
❱  19 │         assert dist.is_initialized(), "torch distributed must be used!"
   20 │         self.config = config
   21 │         self.grad_fn_model = None
   22 │

AssertionError: torch distributed must be used!

I wonder whether this is due to my environment or a configuration issue. Could you give me some advice?

ClementRomac commented 11 months ago

Hi,

Lamorel expects at least two processes: one for the LLM (the server) and one for the RL loop (the client). It uses torch.distributed to make these two processes communicate.
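
The split between the two sides is declared in the Hydra config file. As a rough sketch (key names follow the lamorel example configs; the values and the accelerate config path are illustrative assumptions, not a definitive template):

lamorel_args:
  log_level: info
  distributed_setup_args:
    n_rl_processes: 1   # client processes running your RL loop
    n_llm_processes: 1  # server processes hosting the LLM
  accelerate_args:
    config_file: accelerate/default_config.yaml  # assumed path, as in the examples
    machine_rank: 0
    num_machines: 1
rl_script_args:
  path: ???  # filled at launch time with the absolute path of your script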

In a Colab notebook there is only a single process running, which explains the error you get. Right now I do not see how to make it work in Colab, as Lamorel is really designed for a distributed setup (even though you can run the two processes locally, as in the examples). I am sorry :/

Jugg1er commented 11 months ago

Thank you for your reply! I'm not familiar with torch.distributed, so I will try testing it locally.

ClementRomac commented 11 months ago

The examples in the README use our launcher to create the distributed setup from a config file, so you shouldn't need to be comfortable with torch.distributed to use lamorel :)
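
Concretely, a local run would look something like this (the launcher module is the one from the README; the paths are placeholders to adapt to your setup):

python -m lamorel_launcher.launch \
    --config-path /absolute/path/to/config/dir \
    --config-name config \
    rl_script_args.path=/absolute/path/to/your_script.py

The launcher then spawns the RL and LLM processes and initializes torch.distributed for you, which is why the assertion you hit does not fire when the script is started this way.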

Do not hesitate if you have further questions!