intel-analytics / ipex-llm


Umbrella issue for Chronos refactor to enable customized installation #3170

Open TheaperDeng opened 3 years ago

TheaperDeng commented 3 years ago

This issue is a detailed plan to realize issue intel-analytics/analytics-zoo-internal#107 and to decouple the major functions of Chronos from other BigDL components.

@shane-huang @yushan111

Overall design strategy [edited after discussion with Jason]

Here is a full functionality table (will be updated)

installation status installation cmd Chronos nano orca TSDataset XShardsTSDataset F/D/S*(distributed=False) F/D/S*(distributed=True) ONNX Auto Model AutoTSEstimator TSPipeline
chronos with nano** pip install bigdl-chronos; pip install bigdl-nano ✔(Pseudo distributed)
full chronos pip install bigdl-chronos[all]

* F/D/S means Forecaster/Detector/Simulators

** These two versions can be used as a lightweight install strategy for inference.

To complete these, we mainly need the following steps (they can be done simultaneously)

jason-dai commented 3 years ago

  1. chronos should always install nano as its dependency (and use it for single node acceleration when appropriate)
  2. Installation/deployment of chronos should have 4 options: with or without orca X with or without ray
  3. chronos should use nano API instead of pytorch-lightning; this functionality is however orthogonal to the installation/deployment requirement here

TheaperDeng commented 3 years ago
If a user installs only ray but not orca, nothing changes: chronos can theoretically do nothing with it, since AutoTS relies on orca.automl, XShardsTSDataset relies on orca.data, and distributed training relies on orca.learn.
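
To make the dependency argument concrete, here is a minimal sketch of it as a lookup table. The feature names and the orca submodules come from the comment above; the table itself and the `usable_features` helper are hypothetical and are not part of Chronos.

```python
# Hypothetical mapping for illustration: feature -> packages it needs.
# The orca submodules named in the comments come from this thread.
FEATURE_REQUIRES = {
    "AutoTS": {"orca"},                # relies on orca.automl
    "XShardsTSDataset": {"orca"},      # relies on orca.data
    "distributed training": {"orca"},  # relies on orca.learn
}


def usable_features(installed):
    """Return the features whose requirements are all installed."""
    installed = set(installed)
    return sorted(
        name for name, needs in FEATURE_REQUIRES.items() if needs <= installed
    )


print(usable_features({"nano", "ray"}))   # ray without orca unlocks nothing
print(usable_features({"nano", "orca"}))  # orca unlocks all three
```

Under these assumptions, adding ray to a nano-only install leaves the list empty, which is why "ray without orca" is not a meaningful option.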
jason-dai commented 3 years ago

So you have three options: 1) nano only 2) nano & orca 3) nano & orca & ray

TheaperDeng commented 3 years ago

yep, that looks good.
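
A rough sketch of how code could tell at runtime which of the three agreed options it is running under. The module names `bigdl.orca` and `ray` are assumptions about how the packages are importable, and both helpers are hypothetical.

```python
import importlib.util


def has_module(name):
    """True if the module can be located (parent packages get imported)."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # A missing parent package (e.g. no `bigdl` at all) raises here.
        return False


def install_tier(has_orca, has_ray):
    """Map detected packages onto the three options from the thread."""
    if has_orca and has_ray:
        return "nano & orca & ray"
    if has_orca:
        return "nano & orca"
    # ray without orca adds nothing, so it collapses to the nano-only tier
    return "nano only"


# Assumed import names; adjust to the real package layout.
print(install_tier(has_module("bigdl.orca"), has_module("ray")))
```

Keeping `install_tier` free of import logic makes the three-way decision easy to unit-test without installing anything.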

shanyu-sys commented 3 years ago

Note that nano hasn't been fully supported yet.

For the first release of BigDL-2.0, we will include only one extra install option, [all], which installs all Chronos dependencies (PR intel-analytics/analytics-zoo#5033).

We will support other install options after the corresponding code is ready.

shanyu-sys commented 3 years ago

Shall we also consider dependencies like TensorFlow and PyTorch? Chronos contains both TensorFlow models and PyTorch models, and we might add more TensorFlow models in the future.

I extended the table above to detail all Chronos components and their dependencies on Orca, Ray, TensorFlow, and PyTorch.

Component Orca Ray TensorFlow PyTorch Notes
TSDataset
TSDataset (distributed)
Forecaster (LSTM, S2S, TCN, TCMF) Issue intel-analytics/analytics-zoo#5006
Forecaster (distributed, backend=ray) Supports PyTorch only
Forecaster (distributed, backend!=ray) Supports PyTorch only
Forecaster (MTNet) Issue intel-analytics/analytics-zoo#5037
Forecaster (Arima, Prophet) Issue intel-analytics/analytics-zoo#5037
Detector
Simulator
AutoTSEstimator Supports PyTorch only
TSPipeline Issue intel-analytics/analytics-zoo#5006 intel-analytics/analytics-zoo#5007

Therefore, with TensorFlow and PyTorch considered, the dependency options might be

In the future, we may support distributed training or tuning with TensorFlow-based models; then we may add:

TheaperDeng commented 3 years ago

Small update: detectors also rely on some pytorch models.

shane-huang commented 3 years ago

Separating pytorch and tensorflow installation seems a valid request only for inference with size concerns. And it seems to me the option should live in nano instead of chronos (e.g. separate library installations such as pytorch-lightning, IPEX, intel-tensorflow, etc.). So the install options can be simplified to train/nano[pytorch]/nano[tensorflow]. Further, as our pytorch support is much better and the tensorflow layer is thin (correct me if I'm wrong), we can simplify it to nano (both py+tf) & [nano+tensorflow]. So the options could be changed as below:

shane-huang commented 3 years ago


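We should expect new customers to use our upcoming release. What is the target usage for installing with and without the [all] option? We may have to explain it to our customers. Does installing without [all] equal nano only? If so, the functionality of that installation is quite limited:

  • no training or inference for DL-based forecasters (models are inherited from orca.automl.BaseModel)
  • no preprocessing (tsdataset depends on orca.data)
  • only a few ML-based models and detectors can be used.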

We might need to consider our target usage when defining the options and which modules to include.

jason-dai commented 3 years ago

How about:

TheaperDeng commented 3 years ago

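I prefer to simplify them to 3 options:

  • chronos: nano+pytorch
  • chronos[distributed]: nano+orca+pytorch
  • chronos[all]: nano+orca+ray+pytorch

Once our tf support is enhanced, we may change the options. For now, users who want tf can install it themselves.

The other reason is that tf1 conflicts with pytorch-lightning on some dependencies, which may cause "dependency hell".

nano+orca+pytorch supports only 2 more features than nano+pytorch:

  • XShardsTSDataset (experimental)
  • Distributed training (w/o the ray backend, although the ray backend is our recommended and default backend)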

jason-dai commented 3 years ago

I prefer to simplify them to 3 options

  • chronos: nano+pytorch
  • chronos[distributed]: nano+orca+pytorch
  • chronos[all]: nano+orca+ray+pytorch

Once our tf support is enhanced, we may change the options. For now, users who want tf can install it themselves.

The other reason is that tf1 conflicts with pytorch-lightning on some dependencies, which may cause "dependency hell".

nano+orca+pytorch supports only 2 more features than nano+pytorch:

  • XShardsTSDataset (experimental)
  • Distributed training (w/o the ray backend, although the ray backend is our recommended and default backend)

The default should be distributed; a light version can be single node only.

TheaperDeng commented 3 years ago

  • bigdl-chronos: nano+pytorch+orca
  • bigdl-chronos[lite]: nano+pytorch
  • bigdl-chronos[all]: nano+orca+ray+pytorch

Once our tf support is enhanced, we may change the options. For now, users who want tf can install it themselves.
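
For illustration only, the default and [all] options could be expressed through setuptools-style requirement lists roughly as below. The distribution names `bigdl-nano` and `bigdl-orca` appear in this thread; `torch` and `ray` are assumed to be the PyPI names for pytorch and ray, and `resolved_requirements` is a hypothetical helper. Version pins are deliberately left out.

```python
# Assumed distribution names; see the note above.
INSTALL_REQUIRES = ["bigdl-nano", "torch", "bigdl-orca"]  # plain `bigdl-chronos`
EXTRAS_REQUIRE = {
    "all": ["ray"],  # `bigdl-chronos[all]` adds the ray backend
}


def resolved_requirements(extra=None):
    """Return the direct requirements selected for a given extra."""
    reqs = set(INSTALL_REQUIRES)
    if extra is not None:
        reqs |= set(EXTRAS_REQUIRE[extra])
    return reqs


print(sorted(resolved_requirements()))
print(sorted(resolved_requirements("all")))

# In a setup.py these would be passed as:
#   setup(..., install_requires=INSTALL_REQUIRES, extras_require=EXTRAS_REQUIRE)
```

The [lite] option is left out of this sketch: an extra can only add requirements to the base set, so it cannot by itself make an install lighter than the default.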

TheaperDeng commented 3 years ago


We should expect new customers to use our upcoming release. What is the target usage for installing with and without the [all] option? We may have to explain it to our customers. Does installing without [all] equal nano only? If so, the functionality of that installation is quite limited:

  • no training or inference for DL-based forecasters (models are inherited from orca.automl.BaseModel)
  • no preprocessing (tsdataset depends on orca.data)
  • only a few ML-based models and detectors can be used.

We might need to consider our target usage when defining the options and which modules to include.

Preprocessing is available, since TSDataset is based on single-node pandas; only XShardsTSDataset is based on orca.data.

Still, we should not ship this nearly useless option in this release. So we may simply make the [all] option our default for this release early next week.

I am not sure whether some of our customers require a lighter version for this release. And of course we always have the nightly build later.

shanyu-sys commented 3 years ago

For this release, the default chronos dependency (pip install bigdl-chronos) includes bigdl-orca. We haven't yet resolved the issues mentioned earlier that would let the default chronos run without orca, including the Forecasters' dependencies on orca.automl.BaseModel and orca.automl.metrics, and TSPipeline evaluation with orca.automl.metrics.

So with the default chronos, we can support TSDataset, Forecasters, Simulators, Detectors, and TSPipeline. Note that users may need to manually install pytorch or tensorflow for the component they want to use. I could also add pytorch as a default dependency.

With the extra [all] option, users can enable the distributed functions, including distributed tuning with AutoTS, distributed forecasting, and the distributed dataset.

TheaperDeng commented 3 years ago


confirmed, thx. This will also be reflected in our user guide.
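
Since the distributed functions are only available with the [all] option, a guarded import is one common way to give users of the default install an actionable error. This is a generic sketch, not Chronos code: the `require` helper is hypothetical, and only the `pip install bigdl-chronos[all]` command comes from this thread.

```python
import importlib


def require(module_name, feature, install_hint="pip install bigdl-chronos[all]"):
    """Import an optional module or fail with an actionable message."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{feature} needs '{module_name}', which is not installed. "
            f"Try: {install_hint}"
        ) from exc


# Hypothetical usage inside a distributed code path:
#   ray = require("ray", "Distributed tuning with AutoTS")
```

Deferring the import to the point of use keeps the default install importable while still pointing users at the right option.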