AutoTS on yarn: n_sampling = 200

yarn-client mode:
`python lstm.py --cores 28 --num_workers 1 --cluster_mode yarn-client`
takes 81.61s, while local mode
`python lstm.py --cores 28 --num_workers 1`
takes 29.58s, so yarn-client takes about 2.75x as long as local mode. Digging into the executions, I found that the HDFS operations (`ls`, `mkdir`, `put`) consume a lot of CPU time: each operation launches a new Java process just to perform a single HDFS operation.

Originally posted by @qiuxin2012 in https://github.com/intel-analytics/BigDL/issues/6371#issuecomment-1322870974
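To make the diagnosis concrete, below is a minimal sketch (not the actual BigDL code) contrasting the pattern described above, shelling out to `hdfs dfs` once per operation and paying JVM startup cost each time, with reusing a single long-lived client. The `pyarrow.fs.HadoopFileSystem` alternative, the function names, and the paths are illustrative assumptions, just one way to avoid spawning a Java process per call.

```python
# Hypothetical sketch; function names and paths are illustrative, not from BigDL.
import os
import subprocess
import pyarrow.fs as pafs


def put_via_cli(local_path, hdfs_dir):
    # Pattern matching the observation above: every call shells out to `hdfs dfs`,
    # so each ls/mkdir/put starts a fresh JVM and does a single HDFS operation.
    subprocess.run(["hdfs", "dfs", "-mkdir", "-p", hdfs_dir], check=True)
    subprocess.run(["hdfs", "dfs", "-put", "-f", local_path, hdfs_dir], check=True)
    subprocess.run(["hdfs", "dfs", "-ls", hdfs_dir], check=True)


def put_via_client(local_path, hdfs_dir):
    # Cheaper pattern: open one HDFS connection and reuse it for all operations
    # (assumes libhdfs is available, e.g. ARROW_LIBHDFS_DIR is set).
    hdfs = pafs.HadoopFileSystem("default", 0)  # use fs.defaultFS from core-site.xml
    hdfs.create_dir(hdfs_dir)
    target = f"{hdfs_dir}/{os.path.basename(local_path)}"
    with open(local_path, "rb") as src, hdfs.open_output_stream(target) as dst:
        dst.write(src.read())
    return hdfs.get_file_info(pafs.FileSelector(hdfs_dir))
```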