intel-analytics / ipex-llm


unpersist method of DistributedDataSet lacks unpersist of the RDD #2785

Open Litchilitchy opened 5 years ago

Litchilitchy commented 5 years ago

https://github.com/intel-analytics/BigDL/blob/4a1126e1479528552e9c2c77a13d4b4414f36652/spark/dl/src/main/scala/com/intel/analytics/bigdl/dataset/DataSet.scala#L195

Only the cached transformer is unpersisted here. Once transform has been called, unpersist on the resulting data set calls this method, which leaves the original RDD in the cache.

It should also call originRDD().unpersist().

wzhongyuan commented 5 years ago

The transformer is derived from the original RDD, right? I think the original will be unpersisted once the transformer is unpersisted.

Litchilitchy commented 5 years ago

transform returns a new anonymous DistributedDataSet that overrides unpersist(), and that override does not actually unpersist the original RDD.
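
To make the point concrete, here is a minimal sketch of the pattern in plain Scala, with no Spark dependency. It is only a model under stated assumptions: FakeRDD, MiniDataSet, CachedDataSet and the alsoUnpersistOrigin flag are hypothetical names invented for illustration, not BigDL or Spark APIs, and the real transform method does more than this. It shows that unpersisting the cached transformer alone leaves the origin cached, and that calling originRDD().unpersist() as well releases it.

```scala
// Hypothetical stand-in for a Spark RDD: it only tracks whether it is cached.
final class FakeRDD(val name: String) {
  private var cached = false
  def persist(): FakeRDD = { cached = true; this }
  def unpersist(): FakeRDD = { cached = false; this }
  def isCached: Boolean = cached
}

// Hypothetical, heavily simplified model of a distributed data set.
trait MiniDataSet {
  def originRDD(): FakeRDD
  def unpersist(): Unit

  // Models the pattern described in the issue: transform caches a
  // transformer and returns an anonymous data set that overrides unpersist().
  // The flag switches between the reported behavior and the proposed fix.
  def transform(alsoUnpersistOrigin: Boolean): MiniDataSet = {
    val pre = this
    val cachedTransformer = new FakeRDD("Cached Transformer").persist()
    new MiniDataSet {
      override def originRDD(): FakeRDD = pre.originRDD()
      override def unpersist(): Unit = {
        cachedTransformer.unpersist()
        // Proposed fix from the issue: release the origin as well.
        if (alsoUnpersistOrigin) originRDD().unpersist()
      }
    }
  }
}

// A data set backed directly by a cached origin.
final class CachedDataSet(rdd: FakeRDD) extends MiniDataSet {
  override def originRDD(): FakeRDD = rdd
  override def unpersist(): Unit = rdd.unpersist()
}

object UnpersistDemo {
  // Reports whether the origin is still cached after the transformed
  // data set has been unpersisted.
  def originStillCached(alsoUnpersistOrigin: Boolean): Boolean = {
    val origin = new FakeRDD("origin").persist()
    val transformed = new CachedDataSet(origin).transform(alsoUnpersistOrigin)
    transformed.unpersist()
    origin.isCached
  }

  def main(args: Array[String]): Unit = {
    println(s"reported behavior, origin still cached: ${originStillCached(false)}")
    println(s"with proposed fix, origin still cached: ${originStillCached(true)}")
  }
}
```

In real Spark, unpersisting an RDD only drops that RDD's own cached blocks and does not cascade to the RDDs it was derived from, which is why the explicit call on the origin would be needed.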

jason-dai commented 4 years ago

Do we still need to fix this? @Litchilitchy