meta-llama / llama-stack-apps

Agentic components of the Llama Stack APIs
MIT License

RuntimeError: Distributed package doesn't have NCCL built in (on mac pro M1) #17

Closed · Chrecci closed 3 months ago

Chrecci commented 3 months ago

```
The version_base parameter is not specified.
Please specify a compatability version level, or None.
Will assume defaults for version 1.1
  initialize(config_path=relative_path)
Loading config from : /x/x/.llama/configs/inference.yaml
Yaml config:
```

```yaml
inference_config:
  impl_config:
    impl_type: inline
    checkpoint_config:
      checkpoint:
        checkpoint_type: pytorch
        checkpoint_dir: /.llama/checkpoints/Meta-Llama-3.1-8B-Instruct/
        tokenizer_path: /.llama/checkpoints/Meta-Llama-3.1-8B-Instruct/tokenizer.model
        model_parallel_size: 1
        quantization_format: bf16
    quantization: null
    torch_seed: null
    max_seq_len: 16384
    max_batch_size: 1
```

```
Listening on :::5000
INFO:     Started server process [74351]
INFO:     Waiting for application startup.
W0725 17:29:07.226000 7904910400 torch/distributed/elastic/multiprocessing/redirects.py:28] NOTE: Redirects are currently not supported in Windows or MacOs.
...
  File "/llama-agentic-system/venv_3_10/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

worker_process_entrypoint FAILED

Failures:

Root Cause (first observed failure):
[0]:
  time     : 2024-07-25_17:29:07
  host     : -mbp.attlocal.net
  rank     : 0 (local_rank: 0)
  exitcode : 1 (pid: 74376)
...
  File "/llama-agentic-system/venv_3_10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py", line 1573, in _new_process_group_helper
    raise RuntimeError("Distributed package doesn't have NCCL built in")
RuntimeError: Distributed package doesn't have NCCL built in
```
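
For context on the error: `torch.distributed` is defaulting to the NCCL backend, and NCCL is only compiled into CUDA-enabled Linux builds of PyTorch, so macOS wheels raise exactly this `RuntimeError`. A minimal standalone sketch (not code from this repo) of probing backend availability and falling back to the portable `gloo` backend:

```python
# Minimal sketch (not part of llama-stack-apps): choose a torch.distributed
# backend that actually exists in the local PyTorch build. macOS wheels ship
# without NCCL, which is what triggers the RuntimeError in the log above.
import os

import torch
import torch.distributed as dist


def pick_backend() -> str:
    # NCCL is compiled only into CUDA-enabled Linux builds of PyTorch.
    if dist.is_nccl_available() and torch.cuda.is_available():
        return "nccl"
    # gloo is the CPU backend available on macOS (and Windows).
    return "gloo"


if __name__ == "__main__":
    # Single-process group for illustration; the env vars stand in for
    # what a launcher like torchrun would normally set.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group(backend=pick_backend(), rank=0, world_size=1)
    print("initialized with backend:", dist.get_backend())
    dist.destroy_process_group()
```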
dltn commented 3 months ago

Thanks for the report, @Chrecci! Unfortunately we only support CUDA right now – I'll add a note to the README and add Apple Silicon support to our team's backlog.

In the interim, you can set up the inference server on a remote machine with CUDA, and then run agentic systems on the Mac.
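
For that interim setup, the idea is to point the Mac's `inference_config` at the CUDA machine instead of a local checkpoint. A hypothetical sketch of `~/.llama/configs/inference.yaml`, assuming the schema accepts a remote `impl_type` with a `url` field (both names are illustrative, not confirmed against the repo):

```yaml
# Hypothetical inference.yaml for running inference on a remote CUDA box.
# impl_type: remote and url are assumed field names, not verified here.
inference_config:
  impl_config:
    impl_type: remote
    url: http://<cuda-host>:5000   # the machine running the inference server
```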