oneapi-src / oneDAL

oneAPI Data Analytics Library (oneDAL)
https://software.intel.com/en-us/oneapi/onedal
Apache License 2.0
605 stars 210 forks source link

Enable generic enums with regression while using set/get functions #993

Closed Mightrider closed 3 years ago

Mightrider commented 3 years ago

I try to generalize your API and tried to use daal::algorithms::regression::training::data for each regression algorithm. Unfortunately this does not work as expected. So I need to specify it for each algorithm with the correct namespace as you can see in my example.

daal::algorithms::linear_regression::training::interface1::Batch<DaalAlgorithmType> linReg;
linReg.input.set(daal::algorithms::linear_regression::training::data, aFeatures);

daal::algorithms::gbt::regression::training::interface1::Batch<DaalAlgorithmType> gbtReg;
gbtReg.input.set(daal::algorithms::gbt::regression::training::data, aFeatures);

The reason why I expected it to work with the generic enum are:

Is there a way to use the more generic enums in the daal::algorithms::regression::training/prediction namespaces?

And for the gbt regression and decision forest there seems to be a typo as well.

https://github.com/oneapi-src/oneDAL/blob/412adbdc376a9a8cce9c72117bf69afe4ce27c64/cpp/daal/include/algorithms/gradient_boosted_trees/gbt_regression_training_types.h#L85

https://github.com/oneapi-src/oneDAL/blob/412adbdc376a9a8cce9c72117bf69afe4ce27c64/cpp/daal/include/algorithms/decision_forest/decision_forest_regression_training_types.h#L74

Thx in advance!

KalyanovD commented 3 years ago

Hello @Mightrider,

Thank you for a good question! During designing API of our algorithms we established the possibility to create its own Input class for each algorithm and as each algorithm may have specific input parameters we are creating its own enum with parameters. For regression algorithms we strictly follow this rule, but for classifiers we found a way to use generic Input class and enums as most of them have exactly same input parameters. For example, to extend range of supported input parameters for Classification Decision Tree we enabled its own InputId enum as it has specific input parameters, which don't supported by other algorithms.

If answering your question in short, we don’t provide APIs for using generic enums because it may cause security issues with inappropriate input, for example. But for the cases where we create equal inputs we decided to use common enums to reduce size of library and simplify usage of these algorithms.

Moreover, with creating new API for our algorithms we plan to use alternative way with creating inputs, so this problem might be solved in future with transition to new and more user-friendly API.

Concerning question with typos for InputId enums in GBT and Decision Forest to be honest I didn’t catch what exactly do you mean. Could you please concretize your question?

emmenlau commented 3 years ago

Dear @KalyanovD thanks a lot for the good and helpful support!

Concerning question with typos for InputId enums in GBT and Decision Forest to be honest I didn’t catch what exactly do you mean. Could you please concretize your question?

I hope I can help with this: The name dependentVariables is usually spelled with a plural s at the end, for most regression methods that we found. For example in: https://github.com/oneapi-src/oneDAL/blob/3310c7a9a41a2f0eb72082115c0db98484e4d58d/cpp/daal/include/algorithms/linear_model/linear_model_training_types.h#L56

Its very helpful for us if the names are very consistent because we do not need to code each regression / classifier individually. The only two exceptions we found are: https://github.com/oneapi-src/oneDAL/blob/412adbdc376a9a8cce9c72117bf69afe4ce27c64/cpp/daal/include/algorithms/gradient_boosted_trees/gbt_regression_training_types.h#L85 and https://github.com/oneapi-src/oneDAL/blob/412adbdc376a9a8cce9c72117bf69afe4ce27c64/cpp/daal/include/algorithms/decision_forest/decision_forest_regression_training_types.h#L74

Are these intentional, or could they be renamed to dependentVariables?

KalyanovD commented 3 years ago

Hello @emmenlau, @Mightrider,

Thank you for the great catch! You are absolutely right, this is a classic typo which made during creating interfaces for these two algorithms. How do these issues critical for you? Is it a kind of stopper or is it acceptable for some time for you?

Unfortunately, such kind of changes guarantee breaking of backward compatibility and lead to creating a new interface for these algorithms according to our requirement to keep backward compatibility. Moreover, we are currently focused on developing new interfaces for our algorithms which uses different way of managing setting for using algorithms. They might be more suitable for you in case if there is some inconsistency in old interfaces.

emmenlau commented 3 years ago

Thanks again @KalyanovD ! Its not a blocker issue for us right now, we will just avoid using these interfaces for the moment. I will look into the new C++ interfaces when they are ready, thanks a lot!

emmenlau commented 3 years ago

Feel free to close this issue, also in the name of @Mightrider who is currently on holidays :-)

KalyanovD commented 3 years ago

Thank you @emmenlau for a good feedback on our work! =) You can re-open this issue at any moment in case it will start blocking you. Also feel free to ask questions if any!