Open fwaris opened 7 years ago
The Python API definitely has lots of advantages over BS. The most important one for me is that I can deploy models on the server side more easily. Although BS is an ideal choice for research, in most cases there is an equivalent way to implement ideas in Python. So you must not worry about the future of BS, unless you're a Python hater!
Cross-referencing https://github.com/Microsoft/CNTK/issues/960#issuecomment-306207578 and https://github.com/Microsoft/CNTK/issues/960#issuecomment-307045047 to make F#/BrainScript consideration elsewhere more visible.
Nothing against Python; however, if you compare a BrainScript model definition with an equivalent one in Python, the BS version is much more succinct. It's easier to grasp the core model because the math is much clearer.
The problem as I see it with BS is that it will get increasingly isolated as most would want an integrated system where data wrangling, model development and deployment can be easily automated. I can see Python providing that.
However, I would still miss the expressibility provided by BS. I believe that a level of expressibility close to BS can be achieved by using F# as the base modeling language (CNTK needs a better .Net story anyway given that it's from Microsoft).
One key aspect of F# is its metaprogramming capability; essentially, you can get an Abstract Syntax Tree (AST) for any function instead of compiled code, if that function is tagged as such. The AST can be processed at runtime, as need be, to build the CNTK model structure and so forth. The Alea framework is an example of low-level GPU programming that leverages F# metaprogramming (there are many others).
The long and short of it is that it's easy to embed Domain Specific Languages (DSLs) in F#. There is broad flexibility, even to the point of embedding raw (but syntax-checked) BS code in F# code, if need be.
F# could be the gateway of integrating CNTK to the larger .Net ecosystem
I'll note @mathias-brandewinder specifically as he's the master of machine learning (and AI) and F#. :)
@fwaris we should chat :)
I am also worried about the possible deprecation of BrainScript. Here's a sign I saw in a recent issue: https://github.com/Microsoft/CNTK/issues/1711#issuecomment-314951000
One of the reasons CNTK is our team's favourite is that it allows defining models and running the whole model development process without any need for Python. BrainScript is a bit awkward, but it's much better in terms of model description, and it's great that it does not impose any restrictions on the development process (like requiring Python).
So we invested a lot of time in studying BS and now we are worried if it was in vain.
@mikhail-barg , we are publishing a C#/.NET API for CNTK very soon. Would that be sufficient for your needs?
BS was planned to be deprecated, although we are hearing people wanting to keep it. We are still debating inside the team...
A .Net API would be very welcome. However, I am somewhat concerned that a problem which is best expressed in a declarative manner will become needlessly complicated with an imperative API.
In BS, the core model is so much easier to grasp and reason about than in say Python, primarily because BS is a declarative, functional language.
I would encourage the CNTK team to involve the F# team as it would be possible to create a more expressive layer (perhaps on top of the .Net API) that brings the model expressibility close to BS.
F# is a functional language with declarative capabilities. I know that Don Syme, the designer of F# at Microsoft, would be interested in working with the CNTK team. And some of the community members (including myself) have talked about supporting such an effort.
I would suggest involving the F# team earlier rather than later so that the .Net API is compatible with a declarative mode of usage.
We are designing a high-level API for C#, which will hopefully make it as easy to use as the Python high-level API. We will share an API design with the community very soon, and we will collect feedback to further improve it.
I generally agree with @fwaris that BS model is easy to grasp because it's declarative. Though I don't think that switching to F# would do any good (and I don't see BS being functional).
I generally like the way things are right now: I can describe a model in BS and do train/test/modify iterations as much as I like without any need for a specific production framework or language (except for the data preparation, that's true). And when I'm happy with the model, I can take it to the production environment and use it from C#.
It would be nice to have a built-in way for data preparation (right now we generally write ad hoc programs to convert data into CTF), but I pretty much like how the preparation code is separated from the model itself.
I'm not sure what a C# high-level training API would look like, so I'm looking forward to seeing it. It would probably be nice if you have model update/reinforcement built into the production process. So I feel like having a C# API is great, but I'd like to keep BS as is.
Thanks @cha-zhang. Looking forward to the .Net API.
Let me illustrate with a quick example of how F# - being functional and declarative - will help.
Let's say we want to specify a custom loss function (composed of primitives available in CNTK), e.g.:
f_loss (y_pred, y_true) =
Ideally we would like to plug this function directly into the model for minimization. In Keras, for example, you can use model.compile, as in:
model.compile(loss=f_loss), ...
I am not sure how the .Net API will be implemented, but hopefully it's not one where one has to use a set of classes that need to be instantiated and linked together into an object graph. Anyone reading such code will not easily understand what the loss function is doing.
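To make that contrast concrete, here is a minimal Python sketch. It is not CNTK or Keras code. The body of `f_loss` is not given above, so squared error is used purely as an assumed placeholder, and `Var`, `Minus` and `Square` are hypothetical node classes standing in for an object-graph style API.

```python
from dataclasses import dataclass

# Style 1: the loss as a plain function. The math is visible at a glance.
# Squared error is an assumed placeholder; the thread does not define f_loss.
def f_loss(y_pred, y_true):
    return (y_pred - y_true) ** 2

# Style 2: the same loss as an explicitly wired object graph.
# Var, Minus and Square are hypothetical classes, not part of any real API.
@dataclass
class Var:
    name: str

    def eval(self, env):
        return env[self.name]

@dataclass
class Minus:
    left: object
    right: object

    def eval(self, env):
        return self.left.eval(env) - self.right.eval(env)

@dataclass
class Square:
    arg: object

    def eval(self, env):
        return self.arg.eval(env) ** 2

# The reader has to unpick the nesting to recover the formula.
loss_graph = Square(Minus(Var("y_pred"), Var("y_true")))

functional = f_loss(3.0, 1.0)
object_style = loss_graph.eval({"y_pred": 3.0, "y_true": 1.0})
print(functional, object_style)
```

Both styles compute the same value; the difference is only in how much of the formula a reader can see directly.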
As with Python, F# can be configured to return an AST of the section of the code. The loss code can be normal F# code that is easy to understand but the AST version of it can be processed - by underlying tooling - into a computational graph for CNTK.
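As a rough illustration of the idea in Python rather than F#, the sketch below parses an ordinary loss function with the standard `ast` module and turns its return expression into a nested-tuple graph. The squared-error body, the `to_graph` helper and the op names in `OP_NAMES` are assumptions made for this example; a real tool would emit CNTK nodes instead of tuples.

```python
import ast

# Loss written as ordinary source text. Squared error is an assumed example.
LOSS_SOURCE = """
def f_loss(y_pred, y_true):
    return (y_pred - y_true) ** 2
"""

# Map Python operator node types to hypothetical graph op names.
OP_NAMES = {ast.Sub: "minus", ast.Add: "plus", ast.Mult: "times", ast.Pow: "pow"}

def to_graph(node):
    """Convert an expression AST into nested tuples of the form (op, left, right)."""
    if isinstance(node, ast.BinOp):
        return (OP_NAMES[type(node.op)], to_graph(node.left), to_graph(node.right))
    if isinstance(node, ast.Name):
        return ("input", node.id)
    if isinstance(node, ast.Constant):
        return ("const", node.value)
    raise NotImplementedError(type(node).__name__)

# Module -> FunctionDef -> Return -> expression
func_def = ast.parse(LOSS_SOURCE).body[0]
return_stmt = func_def.body[0]
graph = to_graph(return_stmt.value)
print(graph)
```

The loss stays readable as normal code, while `graph` holds a structure that tooling could walk to build a computational graph.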
@mikhail-barg if there is an F# API then you should be able to use it in multiple ways:
a) As a standalone script from the command line, much like using BS today (there is no explicit compile required).
b) Via a REPL. Just like in Python, you can interactively evaluate F# code by selecting and sending it for evaluation. This is the primary mode of using F# for data science type work.
c) Inside a compiled program like you would in C#. This would be useful for production type deployments where periodic model re-training is required.
ML folks collaborating with PL (programming language) folks would be a great meeting of the minds from which we all can benefit.
Thanks fwaris. Please consider including three more pieces in the ML ecosystem: a TensorBoard F# API, already started by Michael Guero (open .Net author); OpenCV tools for image enhancement and codecs; and quantum computing reintegration. With those, F# is halfway in, and its succinct DSL code will be a joy compared with Python, while prevalent Python libraries can still be used through F# Type Providers for Python.
We have invested a great deal of time in BrainScript, and given the severe lack of professional textbook documentation with detailed examples, deprecating it would alienate many of those devoted to Microsoft's Computational Network via BS.
I have been using BrainScript to develop models. As a functional programmer, BrainScript makes perfect sense to me, but I feel that it is being deprecated in favor of the Python API.
Will BrainScript continue to be supported?
Also, Microsoft has its own functional language, F#, with built-in metaprogramming capability that could work well for expressing DL models. Any thoughts on using F# as the primary API for CNTK?