intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, MiniCPM, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, GraphRAG, DeepSpeed, vLLM, FastChat, Axolotl, etc.
Apache License 2.0

AutoTS bottleneck on YARN #6707

Open qiuxin2012 opened 1 year ago

qiuxin2012 commented 1 year ago

AutoTS on YARN: n_sampling = 200

Yarn-client mode: `python lstm.py --cores 28 --num_workers 1 --cluster_mode yarn-client` takes 81.61 s, while local mode `python lstm.py --cores 28 --num_workers 1` takes 29.58 s, so yarn-client is about 2.75x slower than local mode. Digging into the execution, I found that HDFS operations (ls, mkdir, put) consume a lot of CPU time: each operation launches a new Java process just to perform a single HDFS command (a sketch of this pattern follows below).


Originally posted by @qiuxin2012 in https://github.com/intel-analytics/BigDL/issues/6371#issuecomment-1322870974
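For reference, the per-operation pattern described above roughly corresponds to shelling out to the `hdfs dfs` CLI once per filesystem call, paying the JVM startup cost every time. A minimal sketch with hypothetical helpers (the actual call sites in BigDL/Orca may differ):

```python
import subprocess

# Hypothetical helpers illustrating the bottleneck: every call spawns a
# fresh JVM via the `hdfs dfs` CLI just to run one filesystem operation.

def hdfs_mkdir(path: str) -> None:
    subprocess.run(["hdfs", "dfs", "-mkdir", "-p", path], check=True)

def hdfs_put(local_path: str, remote_path: str) -> None:
    subprocess.run(["hdfs", "dfs", "-put", "-f", local_path, remote_path], check=True)

def hdfs_ls(path: str) -> list[str]:
    result = subprocess.run(["hdfs", "dfs", "-ls", path],
                            capture_output=True, text=True, check=True)
    return result.stdout.splitlines()
```

With n_sampling = 200 trials each doing several such calls, the repeated JVM startups accumulate, which is consistent with the profiling described above.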

qiuxin2012 commented 1 year ago

Maybe we can support HttpFS (https://hadoop.apache.org/docs/stable/hadoop-hdfs-httpfs/index.html) or the WebHDFS REST API (https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/WebHDFS.html), so that HDFS operations go over HTTP from the existing Python process instead of spawning a new JVM for each call; see the sketch below.
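A minimal sketch of how the same ls/mkdir/put operations could go through WebHDFS over HTTP. The NameNode address, port, and user below are placeholders (HttpFS exposes the same API on its own port), not values from this cluster:

```python
import requests

# Assumed endpoint and user for illustration; adjust to the actual cluster
# (the NameNode HTTP port is typically 9870 on Hadoop 3.x, 50070 on 2.x).
WEBHDFS = "http://namenode:9870/webhdfs/v1"
USER = "hadoop"

def webhdfs_ls(path: str) -> list[dict]:
    r = requests.get(f"{WEBHDFS}{path}", params={"op": "LISTSTATUS", "user.name": USER})
    r.raise_for_status()
    return r.json()["FileStatuses"]["FileStatus"]

def webhdfs_mkdir(path: str) -> bool:
    r = requests.put(f"{WEBHDFS}{path}", params={"op": "MKDIRS", "user.name": USER})
    r.raise_for_status()
    return r.json()["boolean"]

def webhdfs_put(local_path: str, remote_path: str) -> None:
    # CREATE is a two-step operation: the NameNode replies with a redirect
    # to a DataNode, and the file content is then uploaded there.
    r = requests.put(f"{WEBHDFS}{remote_path}",
                     params={"op": "CREATE", "overwrite": "true", "user.name": USER},
                     allow_redirects=False)
    r.raise_for_status()
    with open(local_path, "rb") as f:
        requests.put(r.headers["Location"], data=f).raise_for_status()
```

This keeps every HDFS call inside the existing Python process, so no new Java process is launched per operation.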