nebuly-ai / optimate

A collection of libraries to optimise AI model performance
https://www.nebuly.com/
Apache License 2.0
8.37k stars · 643 forks

APIs for non-Python programming languages #7

Open emanef13 opened 2 years ago

emanef13 commented 2 years ago

Is there a C++ API for the library?

diegofiori commented 2 years ago

We currently only support Python DL frameworks (TensorFlow and PyTorch).

We are considering extending the library to other programming languages such as Julia, Swift, and C++; however, it will take some time to realize nebullvm's full vision of a programming-language- and hardware-agnostic inference accelerator.

I have renamed this issue to "APIs for non-Python programming languages" so that other community members can state their preferences. That way, we will have a basis for prioritizing which language bindings to develop first.

isgursoy commented 1 year ago

My target environment is C++, and I don't think optimizing the model in C++ would add any value to my development cycle. Portability matters. Normally I export to ONNX when I can, and to TorchScript otherwise.

bzisl commented 1 year ago

I have the same issue as isgursoy. Is it possible to export the optimised network to ONNX?

bzisl commented 1 year ago

Basically, the idea is to be able to import the optimised model into C++ (onnxruntime).

valeriosofi commented 1 year ago

Hi @bzisl, the optimised models are compiled and cannot be converted back to ONNX. However, you can exclude all compilers except onnxruntime during optimization (using the ignore_compilers parameter), so that the optimised model is in fact still an ONNX model. Keep in mind that Speedster will then only use onnxruntime, and possibly quantization, to speed up your model, so the results may not be as good as when all compilers are enabled. After optimizing, save the optimized_model with the save_model() function and you will get an ONNX model.
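The workflow described above might look like this in Python. This is a hedged sketch, not a verified recipe: the exact compiler identifiers passed to ignore_compilers, and the dummy input shape, are assumptions that should be checked against the Speedster documentation.

```python
def export_optimized_onnx(onnx_path, save_dir):
    """Optimize an ONNX model with onnxruntime only, then save it back as ONNX.

    Because every other backend is skipped via ignore_compilers, the
    optimized artifact stays a plain ONNX file that can later be loaded
    from C++ with onnxruntime.
    """
    # Deferred imports so this sketch can be inspected without
    # Speedster or NumPy installed.
    import numpy as np
    from speedster import optimize_model, save_model

    # Dummy calibration data; the (1, 3, 224, 224) image-like shape is
    # an assumed example, not something mandated by the library.
    input_data = [
        ((np.random.rand(1, 3, 224, 224).astype(np.float32),), None)
        for _ in range(100)
    ]

    optimized_model = optimize_model(
        onnx_path,
        input_data=input_data,
        optimization_time="unconstrained",
        # Skip every compiler except onnxruntime. These names are
        # assumptions; check Speedster's docs for the exact identifiers.
        ignore_compilers=["tensor_rt", "openvino", "tvm", "deepsparse"],
    )
    save_model(optimized_model, save_dir)
    return optimized_model
```

The saved directory then contains an ONNX model that onnxruntime's C++ API can load directly, which addresses the portability concern raised earlier in the thread.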

bzisl commented 1 year ago

Thanks!

bzisl commented 1 year ago

One more question, please. I am calling:

optimized_model = speedster.optimize_model(
    onnx_path,
    input_data=input_data,
    optimization_time="unconstrained"
)

How do we force optimisation for CPU?

Best regards!

valeriosofi commented 1 year ago

You can use:

optimized_model = speedster.optimize_model(
    onnx_path, 
    input_data=input_data, 
    optimization_time="unconstrained", 
    device="cpu"
)

I can see that you are optimizing an ONNX model; I would also suggest enabling quantization by setting metric_drop_ths=0.1 in the function call.
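Putting both suggestions together, a minimal sketch could look like the following (same caveats as above: it assumes Speedster is installed, and metric_drop_ths=0.1 tolerates up to a 0.1 drop in the evaluation metric so that quantized variants may be tried):

```python
def optimize_for_cpu(onnx_path, input_data):
    """Optimize an ONNX model on CPU, allowing quantization."""
    # Deferred import so the sketch can be read without Speedster installed.
    from speedster import optimize_model

    return optimize_model(
        onnx_path,
        input_data=input_data,
        optimization_time="unconstrained",
        device="cpu",          # force CPU even if a GPU is visible
        metric_drop_ths=0.1,   # small accuracy budget to enable quantization
    )
```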

bzisl commented 1 year ago

Thanks a lot!