microsoft / CNTK

Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit
https://docs.microsoft.com/cognitive-toolkit/

What's the CNTK successor? #3807

Open ivanfarkas2 opened 4 years ago

ivanfarkas2 commented 4 years ago

What's the CNTK successor?

CNTK has gone stale; it has not been updated for almost a year now.

fwaris commented 4 years ago

Microsoft is building up ONNX (https://onnx.ai/) as part of a consortium of companies.

Previously ONNX only supported inference but now it also supports training.

Currently, models have to be constructed in PyTorch, but I can imagine a new API (maybe .Net-based) that could be used to directly construct and train ONNX models.

ML.Net has ONNX support, but from what I have seen, the API is not very elegant (at least at this time).

It's hard to predict the future, but ONNX has a reasonable chance of long-term success. Google is not on board, but Facebook and AWS are. (You can export TensorFlow models to ONNX, though.)

ONNX also reduces reliance on NVidia GPUs which can be costly. Google has developed TPUs to reduce dependency on NVidia. ONNX is an open binary standard with support from multiple hardware vendors. Open standards tend to win out over time as they level the playing field and create a 'market' where many can participate.

ONNX is definitely an area to watch and investigate, but it's still too early to tell how it will all play out.

prasanthpul commented 3 years ago

ONNX Runtime (https://onnxruntime.ai) can run inference on models from PyTorch, TensorFlow, and other frameworks that support ONNX. It's highly optimized to be fast and small, and it works across operating systems and hardware. It's used extensively at Microsoft and by others as well. ONNX Runtime now also supports training acceleration: you can integrate it into PyTorch, for example, and speed up training of large models quite significantly (see the website for details).

fwaris commented 3 years ago

@prasanthpul ONNX does speed up existing models. I converted a PyTorch, BERT-based transformer model to ONNX and saw a significant speed-up, at least for bulk scoring.

However, models are getting more complex, and not all of them can be ported to ONNX. A case in point is TGN (temporal graph networks), which requires Python code to sample graph nodes in addition to the core model invocation. I don't think such a model can be exported to ONNX (although I have not tried to do so).