Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, GraphRAG, DeepSpeed, Axolotl, etc.
Apache License 2.0
Users need to pass different extra_params for local mode and distributed mode to init_orca_context #3894
Problem description
For some complicated Ray parameters, users cannot use the same `extra_params` for both local mode and distributed mode: they may need to pass one form of `extra_params` to work in local mode and a different form for distributed mode.
Internally, in local mode we convert '-' to '_' and pass the parameters to `ray.init()`; in distributed mode we apply the params directly to `ray start`. However, the parameters are not always the same for `ray.init()` and `ray start`. For example, `--metrics-export-port` in `ray start` corresponds to `_metrics_export_port` in `ray.init()`, and `worker-port-list` exists only for `ray start`, not for `ray.init()`.
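To make the mismatch concrete, here is a minimal illustrative sketch. The `extra_params` values and the `convert_for_local` helper are hypothetical (the issue's actual example dicts are not reproduced here); only `--metrics-export-port`, `_metrics_export_port`, and `worker-port-list` come from the report above.

```python
# Hypothetical illustration only: convert_for_local() is not an Orca API,
# and these parameter values are made up for the example.

# What a user might pass for distributed mode (forwarded to `ray start`):
extra_params_distributed = {
    "metrics-export-port": "8090",
    "worker-port-list": "10002,10003,10004",
}

def convert_for_local(extra_params):
    """Naive local-mode conversion: replace '-' with '_' so the keys can be
    passed as keyword arguments to ray.init()."""
    return {k.replace("-", "_"): v for k, v in extra_params.items()}

kwargs = convert_for_local(extra_params_distributed)
print(kwargs)
# {'metrics_export_port': '8090', 'worker_port_list': '10002,10003,10004'}
#
# Problems with passing these to ray.init():
#   * ray.init() expects the private name `_metrics_export_port`, not
#     `metrics_export_port`, so the converted key is not recognized.
#   * `worker-port-list` has no ray.init() counterpart at all; it only
#     exists as a `ray start` option.
# import ray; ray.init(**kwargs)   # would reject the unknown keyword arguments
```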
Possible Solution
Solution 1: Use `ray start` for both local mode and distributed mode.
Solution 2 (a rough sketch follows this list):
- Identify the parameters that take an additional leading "_" in `ray.init()` and map them accordingly.
- Raise a warning instead of an error for parameters that only apply to `ray start` when running in local mode.
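A rough sketch of what Solution 2 could look like is below. The lookup tables and the `prepare_local_params` helper are assumptions for illustration, not existing Orca or Ray code; a real implementation would derive the mappings from the `ray.init()` signature and the `ray start` CLI options.

```python
import warnings

# Hypothetical lookup tables for the sketch.
RAY_INIT_PRIVATE_PARAMS = {
    # `ray start` flag (with '-' -> '_')  ->  ray.init() keyword
    "metrics_export_port": "_metrics_export_port",
}
RAY_START_ONLY_PARAMS = {"worker_port_list"}

def prepare_local_params(extra_params):
    """Translate `ray start`-style extra_params into ray.init() kwargs,
    warning (not erroring) on parameters that only exist for `ray start`."""
    kwargs = {}
    for key, value in extra_params.items():
        key = key.lstrip("-").replace("-", "_")
        if key in RAY_START_ONLY_PARAMS:
            warnings.warn(
                f"`{key}` is only supported by `ray start`; "
                "it is ignored in local mode."
            )
            continue
        # Map to the private ray.init() name if one is required.
        kwargs[RAY_INIT_PRIVATE_PARAMS.get(key, key)] = value
    return kwargs

# Example: the same extra_params can now be used in local mode as well.
print(prepare_local_params({
    "metrics-export-port": "8090",
    "worker-port-list": "10002,10003",
}))
# -> {'_metrics_export_port': '8090'}, plus a UserWarning for worker-port-list
```

With this approach, a `ray start`-only parameter no longer breaks local mode; it is simply dropped with a warning, so users can keep a single `extra_params` for both modes.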
Related issues: #3867, #3891