Open TheaperDeng opened 3 years ago
1) `chronos` should always install `nano` as its dependency (and use it for single node acceleration when appropriate)
2) Installation/deployment of `chronos` should have 4 options: with or without `orca` X with or without `ray`
3) `chronos` should use `nano` API instead of `pytorch-lightning`; this functionality is however orthogonal to the installation/deployment requirement here
- ✔
- Nothing will happen when a user installs only `ray` but not `orca`; `chronos` can theoretically do nothing, since AutoTS relies on `orca.automl`, XShardsTSDataset relies on `orca.data`, and distributed training relies on `orca.learn`.
- ✔
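The point that `ray` without `orca` enables nothing can be sketched as a small availability check. This is only an illustration: `is_installed` and `usable_modes` are hypothetical helpers, not part of Chronos, and the import paths in the usage note are assumptions.

```python
from importlib.util import find_spec


def is_installed(module_name: str) -> bool:
    """Return True if the module can be located, without importing it fully."""
    try:
        return find_spec(module_name) is not None
    except (ImportError, ValueError):
        # A missing parent package (e.g. "a" in "a.b") raises instead of returning None.
        return False


def usable_modes(has_orca: bool, has_ray: bool) -> list:
    """Map installed optional packages to the modes discussed in this thread."""
    modes = ["single-node"]
    if has_orca:
        modes.append("distributed (non-ray backend)")
        # ray only matters once orca is present, since every
        # distributed feature goes through orca.
        if has_ray:
            modes.append("distributed (ray backend)")
    return modes
```

A caller might write `usable_modes(is_installed("bigdl.orca"), is_installed("ray"))`, assuming those are the import names in the target environment.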
So you have three options:
1) `nano` only
2) `nano` & `orca`
3) `nano` & `orca` & `ray`
yep, that looks good.
Note that `nano` hasn't been fully supported yet.
For the first release of `BigDL-2.0`, we will only include one extra install option, `all`, which will install all Chronos dependencies (PR intel-analytics/analytics-zoo#5033).
We will support other install options after the corresponding code is ready.
Shall we also consider dependencies like TensorFlow and PyTorch? Since Chronos contains both TensorFlow models and PyTorch models, and we might add more Tensorflow models in the future.
I extended the table above, detailing all Chronos components and the corresponding dependencies regarding Orca, Ray, TF, Pytorch.

| Component | Orca | Ray | Tensorflow | Pytorch | Notes |
|---|---|---|---|---|---|
| TSDataset | ❌ | ❌ | ❌ | ❌ | |
| TSDataset (distributed) | ✔ | ❌ | ❌ | ❌ | |
| Forecaster (LSTM, S2S, TCN, TCMF) | ❌ | ❌ | ❌ | ✔ | Issue intel-analytics/analytics-zoo#5006 |
| Forecaster (distributed, backend=ray) | ✔ | ✔ | ❌ | ✔ | Supported pytorch only |
| Forecaster (distributed, backend!=ray) | ✔ | ❌ | ❌ | ✔ | Supported pytorch only |
| Forecaster (MTNet) | ❌ | ❌ | ✔ | ❌ | Issue intel-analytics/analytics-zoo#5037 |
| Forecaster (Arima, Prophet) | ❌ | ❌ | ❌ | ❌ | Issue intel-analytics/analytics-zoo#5037 |
| Detector | ❌ | ❌ | ✔ | ✔ | |
| Simulator | ❌ | ❌ | ❌ | ✔ | |
| AutoTSEstimator | ✔ | ✔ | ❌ | ✔ | Supported pytorch only |
| TSPipeline | ❌ | ❌ | ❌ | ✔ | Issue intel-analytics/analytics-zoo#5006 intel-analytics/analytics-zoo#5007 |
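To make the table easier to reason about, here is a minimal sketch that encodes each row as a set of required packages and lists which components a given installation would cover. The names `REQUIREMENTS` and `available_components` are hypothetical, and treating `Detector` as needing both frameworks is a simplification of the table (in practice individual detectors may need only one of them).

```python
# Each component maps to the optional packages marked ✔ in the table.
REQUIREMENTS = {
    "TSDataset": set(),
    "TSDataset (distributed)": {"orca"},
    "Forecaster (LSTM, S2S, TCN, TCMF)": {"pytorch"},
    "Forecaster (distributed, backend=ray)": {"orca", "ray", "pytorch"},
    "Forecaster (distributed, backend!=ray)": {"orca", "pytorch"},
    "Forecaster (MTNet)": {"tensorflow"},
    "Forecaster (Arima, Prophet)": set(),
    "Detector": {"tensorflow", "pytorch"},  # simplification: both columns are ✔
    "Simulator": {"pytorch"},
    "AutoTSEstimator": {"orca", "ray", "pytorch"},
    "TSPipeline": {"pytorch"},
}


def available_components(installed):
    """Return the components whose requirements are all in `installed`."""
    installed = set(installed)
    return [name for name, needs in REQUIREMENTS.items() if needs <= installed]


if __name__ == "__main__":
    for combo in ([], ["pytorch"], ["orca", "pytorch"], ["orca", "ray", "pytorch"]):
        print(combo, "->", available_components(combo))
```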
Therefore, with Tensorflow and Pytorch considered, the dependency options might be
- `nano` & `pytorch`: Single node PyTorch-based `Forecaster` and `Simulator`, `AutoTS` inference
- `nano` & `tensorflow`: Single node Tensorflow-based `Forecaster` and `Detector`
- `nano` & `orca` & `pytorch`: Distributed PyTorch-based `Forecaster` without Ray backend, distributed `TSDataset`
- `nano` & `orca` & `ray` & `pytorch`: Distributed PyTorch-based `Forecaster` with Ray backend, `AutoTS` training
In the future, we may support distributed training or tuning with Tensorflow-based models; then we may add:
- `nano` & `orca` & `tensorflow`: Distributed TF-based `Forecaster` without Ray backend
- `nano` & `orca` & `ray` & `tensorflow`: Distributed TF-based `Forecaster` with Ray backend, `AutoTS` training

small update: detector also relies on some pytorch models
Separating pytorch and tensorflow installation seems a valid request only for inference with size concerns. And it seems to me the option should belong to nano instead of chronos (e.g. separate installation of libs such as pytorch-lightning, IPEX, intel-tensorflow, etc.). So the install options can be simplified to train/nano[pytorch]/nano[tensorflow]. Further, as our pytorch support is much better and the tensorflow layer is thin (correct me if I'm wrong), we can simplify it to `nano` (both py+tf) & [nano+tensorflow]. So the options could be changed as below:
- nano
- nano (tensorflow-only)
- nano + orca
- nano + orca + ray
How about:
I prefer to simplify them to 3 options
- chronos: nano+pytorch
- chronos[distributed]: nano+orca+pytorch
- chronos[all]: nano+orca+ray+pytorch
And once our tf support is enhanced, we may change the options. Currently, if users want to use tf, we can let them install tf themselves.
The other reason is that tf1 conflicts with pytorch-lightning on some dependencies, which may cause "dependency hell".
nano+orca+pytorch will only support 2 more features than nano+pytorch:
- XShardsTSDataset (experimental)
- Distributed Training (w/o ray backend, while ray backend is our recommended & default backend)
The default should be distributed; a light version can be single node only.
- `bigdl-chronos`: nano+pytorch+orca
- `bigdl-chronos[lite]`: nano+pytorch
- `bigdl-chronos[all]`: nano+orca+ray+pytorch
And once our tf support is enhanced, we may change the options. Currently, if users want to use tf, we can let them install tf themselves now.
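The three proposed install targets can be sketched as a lookup from the target string to the components it would pull in. This is only a model of the proposal for discussion: `PROPOSED` and `components_for` are hypothetical names, and the component labels are shorthand rather than real PyPI package names.

```python
import re

# Proposed install targets and the components each would bring in.
PROPOSED = {
    "": {"nano", "pytorch", "orca"},             # bigdl-chronos (default)
    "lite": {"nano", "pytorch"},                 # bigdl-chronos[lite]
    "all": {"nano", "orca", "ray", "pytorch"},   # bigdl-chronos[all]
}


def components_for(target: str) -> set:
    """Resolve e.g. 'bigdl-chronos[all]' to its proposed component set."""
    match = re.fullmatch(r"bigdl-chronos(?:\[(\w+)\])?", target.strip())
    if match is None:
        raise ValueError(f"not a bigdl-chronos install target: {target!r}")
    option = match.group(1) or ""
    if option not in PROPOSED:
        raise ValueError(f"unknown option: {option!r}")
    return set(PROPOSED[option])
```

For example, `components_for("bigdl-chronos[all]") - components_for("bigdl-chronos")` yields `{"ray"}`. One thing to keep in mind is that pip extras only add requirements on top of the base set, so a `[lite]` target that is smaller than the default would likely need a different packaging mechanism than a plain extra.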
We should expect new customers to use our upcoming release. What's the target usage for the w and w/o `all` options? We may have to explain it to our customers. Does the w/o `all` option equal `nano` only? If that is correct, the functions in this installation are quite limited:
- no training or inference for DL-based forecasters (models are inherited from orca.automl.BaseModel)
- no preprocessing (tsdataset depends on orca.data)
- only a few ML-based models and detectors can be used.
We might need to consider our target usage when defining the options and which modules to include.
Preprocessing is available, since TSDataset is based on single node pandas; only XShardsTSDataset is based on orca.data.
Still, we should not have this nearly-useless option for this release. So we may simply make the `[all]` option our default option for this release early next week.
I am not sure if some of our customers require a lighter version for this release? And of course we always have the nightly built version later.
For this release, the `chronos` default dependency (`pip install bigdl-chronos`) includes `bigdl-orca`. We haven't resolved the issues mentioned before that would enable the default `chronos` to run without `orca`, including `Forecasters` dependencies on `orca.automl.BaseModel` and `orca.automl.metrics`, and `TSPipeline` evaluation with `orca.automl.metrics`.
So with the default `chronos`, we could support `TSDataset`, `Forecasters`, `Simulators`, `Detectors`, `TSPipeline`. Note that users may need to manually install `pytorch` or `tensorflow` for the corresponding component they want to use. I could also add `pytorch` as the default dependency.
With the extra [all] option, users could enable the distributed functions, including distributed tuning with AutoTS, distributed forecasting, and distributed dataset.
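Since users may have to install `pytorch` or `tensorflow` themselves, one way a component could fail helpfully is a lazy import that names the missing package. This is a minimal sketch, not Chronos code: `require` is a hypothetical helper, and the module name and install hint in the usage note are illustrative assumptions.

```python
import importlib


def require(module_name: str, feature: str, hint: str):
    """Import `module_name` on demand, or raise an ImportError that says
    which feature needs it and how it might be installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{feature} needs '{module_name}', which is not installed; try: {hint}"
        ) from exc
```

A PyTorch-based forecaster could then call something like `require("torch", "PyTorch-based Forecaster", "pip install torch")` at construction time, so that users who only need other components are not forced to install the framework.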
confirmed, thx. This will also be reflected in our user guide.
This issue is a detailed plan to realize issue intel-analytics/analytics-zoo-internal#107 and to decouple the major functions of Chronos from other BigDL components.
@shane-huang @yushan111
Overall design strategy [edited after discussion with Jason]
- `Chronos` should stick to `nano` for single node acceleration when appropriate, and be self-contained and able to complete most of its functionalities without any other dependencies (`ray`/`orca`).
- `Chronos` will rely on `orca`/`ray` for distributed functionalities (will be reflected in the following table).
- `Chronos` will contain a lightweight inference installation strategy. (maybe not a new whl)
- `tensorflow` is not my first priority since tf2 is intel's AI strategy while `Chronos` has no tf2 model right now.
Here is a full functionality table (will be updated)
Chronos
nano
orca
pip install bigdl-chronos
;pip install bigdl-nano
pip install bigdl-chronos[all]
* F/D/S means Forecaster/Detector/Simulator
** These two versions can be used as a lightweight inference install strategy.
To complete these, we mainly need these steps (can be done simultaneously)