cha-zhang closed this issue 7 years ago
Looking forward to new examples for NLP. Thanks.
@meijiesky could you help to list what new examples of NLP that can be done in tensorflow but still with no examples in CNTK? Thank you
@JimSEOW I am interested in seeing examples on using reinforcement learning to generate dialogue responses and GAN to train the language model.
There are two reinforcement learning examples, CartPole and FlappingBird, but NLP poses somewhat different challenges.
Besides, I have seen the tutorial on using a GAN to generate images. That model is continuous, because pixel values can be produced directly by the generator's output layer. In NLP, however, the model is discrete: we need to apply a softmax and sample one token at each time step, and that sampling step blocks gradient flow.
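The continuous-vs-discrete distinction can be illustrated in a few lines of plain Python (the logits below are made up for illustration; this is not CNTK code):

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Image GANs: the generator's output is continuous, so gradients can
# flow from the discriminator straight back into the generator.
pixel = math.tanh(0.3)  # a continuous, differentiable output value

# Text GANs: at each time step the decoder must *sample* a discrete
# token id, and the sampling operation itself has no gradient.
logits = [2.0, 0.5, -1.0]          # hypothetical decoder outputs, 3-word vocab
probs = softmax(logits)
token_id = random.choices(range(3), weights=probs)[0]
print(probs, token_id)
```

This is why discrete-output GANs for text typically resort to workarounds such as policy-gradient (REINFORCE-style) training or Gumbel-softmax relaxations.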
I really enjoy CNTK and I would truly appreciate if you can provide examples using RL or GAN to deal with NLP. Thank you.
PS: Beam search decoder is still not available and hope we can have beam search in Python someday.
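Until a built-in decoder lands, beam search itself is framework-agnostic and easy to prototype. A minimal sketch in plain Python, where the scoring function is a hypothetical stand-in for a trained decoder's per-step log-probabilities:

```python
import math

def beam_search(step_log_probs, beam_width=3, max_len=4):
    """Generic beam search over a per-step scoring function.

    step_log_probs(prefix) -> {token: log_prob} is a stand-in for a
    trained decoder's softmax output at each time step.
    """
    beams = [([], 0.0)]  # (token sequence, cumulative log-prob)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for tok, lp in step_log_probs(seq).items():
                candidates.append((seq + [tok], score + lp))
        # keep only the highest-scoring prefixes
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams

# Toy scorer: uniform over a 3-token vocabulary, with a small bonus
# for repeating the previous token (purely illustrative).
def toy_scorer(prefix):
    scores = {t: math.log(1.0 / 3.0) for t in ("a", "b", "c")}
    if prefix:
        scores[prefix[-1]] += 0.1
    return scores

best = beam_search(toy_scorer, beam_width=2, max_len=3)
print(best[0])  # highest-scoring 3-token sequence and its score
```

In practice you would also track end-of-sequence tokens and apply length normalization, but the core expand-score-prune loop is the same.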
Does point 2 under performance (Intel MKL update to improve inference speed on CPU by around 2x on AlexNet) refer to the MKL-DNN library integration or is the speed-up caused by an update of the MKL library to a newer version? I'm most eager to try the MKL-DNN integration. :)
@meijiesky FYI, I do not work for the CNTK team. A few months back, some CNTK users decided to gather and consolidate our feedback on the need for .NET modeling of CNTK networks, beyond simple evaluation. This is FINALLY happening, step by step.
Another important milestone is bringing CNTK to UWP: x64 is done; ARM64 (for both Windows 10 on ARM64 and IoT Core Pro for ARM64) is WIP.
Most important now for CNTK development is to identify where the gaps are.
Please help the community lobby and contribute TO NARROW the GAPs in use cases between TensorFlow and CNTK.
What the CNTK team has shown is that THEY NOW RESPOND in an AGILE way!
@JimSEOW Thanks for your explanation. Personally, I think CNTK processes RNN networks much faster than other DL frameworks, and that's why I chose CNTK for NLP. However, CNTK is a relatively new tool compared to TensorFlow, and there are not as many resources for it, so CNTK users sometimes have trouble resolving their problems or cannot find material to refer to. With better documentation, more tutorials and examples, and stronger performance, sooner or later TensorFlow users will switch to CNTK.
@e-thereal
Does point 2 under performance (Intel MKL update to improve inference speed on CPU by around 2x on AlexNet) refer to the MKL-DNN library integration or is the speed-up caused by an update of the MKL library to a newer version? I'm most eager to try the MKL-DNN integration. :)
A link shared at https://gitter.im/Microsoft/CNTK?at=5985a8d81c8697534a8e23d6 (https://www.nextplatform.com/2017/03/21/can-fpgas-beat-gpus-accelerating-next-generation-deep-learning/) also discusses MKL-DNN and might be of interest; the other links there may interest you as well.
@e-thereal Point 2 under performance is an update of the MKL library. This is our collaboration with Intel and they are not eager to move to MKL-DNN yet.
@cha-zhang Do you know if the MKL update plans on using the NN "primitives" that are available in MKL 2017? I imagine this could bring performance improvements for inference on CNNs even without a full transition to MKL-DNN. Or is the main performance improvement coming from running GEMM with AVX-512 on Skylake-SP? (i.e., no performance improvements expected on older Intel processors.)
@bencherian This will be a refresh to MKL 2017.
=> Whenever possible, perhaps we could list how recent user requests relate to the upcoming iteration plan. This increases agile feedback and user engagement.
@JimSEOW
If I can add a few (Mexican) cents :)
Lately I have been very interested in the field of reinforcement learning. If you try to find a reproduction of a recent paper, or even the paper's actual code (e.g., curiosity-driven exploration, UNREAL, A3C, GA3C, PAAC, Predictron, etc.), you will most likely find a TensorFlow implementation. Moreover, OpenAI has started to provide RL algorithm implementations in TensorFlow.
The community and Microsoft would need to step it up and implement these models in CNTK. On top of that, blogs would need to be more active and show how the object-oriented approach of CNTK can make your life easier when implementing, for example, the UNREAL methodology (sharing weights between three or four networks).
Happy to help
@pedronahum A Mexican with a banking day job who pushes the data-science envelope will always have some interesting views.
=> First, CNTK has to take on challenges on multiple fronts. We need to prioritize them while continuing to collect feedback and suggestions, and communicating that these suggestions are being handled in an "agile" way, not just by the CNTK team but as part of consensus building with the community.
=> THIS IS the state CNTK is in now. THIS is the combination that will keep drawing talented users to come, give feedback, and drive CNTK development in the right direction.
=> NOW to the main discussion: how could we communicate MORE effectively [diagram, short presentation] to new users how the object-oriented approach of CNTK has unique advantages over competitors?
We must NOT limit the discussion to just the technical aspects. This is what the CNTK team is very good at.
We need to mobilize users, not only those like yourself with a finance background, but also people from diverse backgrounds like marketing, sales, and communications, to interpret the unique advantages of CNTK.
=> In other words, CNTK is on good footing; we need to figure out different strategies for "re-branding" CNTK, in an individualized way, for users of different backgrounds.
I can provide an NLP example on answer selection using the dataset from Microsoft Beauty of Programming 2017. The model achieves 58% MRR (Mean Reciprocal Rank) on the validation set. Would you like it? @cha-zhang
@Alan-Lee123 of course we would love to have your contribution! :)
@Alan-Lee123 what would be a good way to get your example into our repository? We can chat offline if you prefer. LMK.
@sayanpa I would love to share ideas with you.
@e-thereal @veikkoeeva @bencherian Apologies for some inaccurate information earlier. For this September iteration, it is a refresh of MKL; that part is accurate. However, we are aiming for MKL-DNN integration in the very near future, maybe in the next iteration or two.
Thanks for the update. Good to know, and we're looking forward to the MKL-DNN integration.
Hi
Please prepare some content on how to set up the learning parameters of a network. I already read this one, but there are some confusing parameters that need more description. I recommend preparing a cookbook for parameter mapping. For example, using eta = 0.001
and momentum = 0.9
is common in other toolkits. A cookbook table mapping such parameters to their CNTK equivalents would be very helpful.
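A sketch of the kind of cookbook table being requested, as a plain-Python mapping. The CNTK identifiers on the right are recalled from the 2.x Python API and are not verified here, so they should be checked against the official docs:

```python
# Generic hyper-parameter conventions from other toolkits mapped to the
# CNTK 2.x Python API calls believed to correspond to them (assumed names,
# shown as strings so this table is framework-independent and runnable).
parameter_cookbook = {
    "eta = 0.001":    "lr = C.learning_rate_schedule(0.001, C.UnitType.minibatch)",
    "momentum = 0.9": "mm = C.momentum_schedule(0.9)",
    "SGD + momentum": "learner = C.momentum_sgd(z.parameters, lr, mm)",
}

for generic, cntk_equiv in parameter_cookbook.items():
    print(f"{generic:15} -> {cntk_equiv}")
```

An official table along these lines, covering the per-sample vs. per-minibatch interpretation of the learning rate, would address exactly the confusion described above.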
thanks
Hi, is there any plan for CNTK library evaluation with GPU in UWP? Thanks.
Regarding _"New example for natural language processing (NLP)"_: examples showing how to use Conditional Random Fields (CRF) would be great. I don't know whether CNTK has plans to build a CRF layer.
@nono1981 Take your pick: https://github.com/Microsoft/CNTK/search?q=uwp&type=Issues&utf8=%E2%9C%93 :)
@nono1981 - please see issue #2243
@wolfma61 @veikkoeeva thank you all. :)
An update: a few tasks for this iteration are blocked:
Apologies for these issues. The team recently went through a reorg, which impacted its execution efficiency.
Is it too late to get a hello-world Logistic Regression example that trains and evaluates a model using the new C# API into the next release? I did notice the other examples.
@whitmark : @liqunfu will give it a try.
Will a NuGet package with the new C# API and its dependencies be included in the release (CPU/GPU)?
Yes, there will be a NuGet package for the C# API.
ETA for v.2.2?
In July iteration plan you mentioned: "CNTK object-oriented C API." Is there any update on this?
We have debated and decided to use SWIG for the C# API, so we are not creating a C API at this moment.
@cha-zhang I'm curious what will be used for the R bindings? I hope it will be native support (NOT through python/reticulate).
Sorry to disappoint you, but the R-binding will be through reticulate.
This is very frustrating (and this can't even be called "bindings"). From Reasons to Switch from TensorFlow to CNTK:
This not only makes it extremely fast, but also allows it to be used as a C++ API ready to be integrated with any applications. It also makes it very easy to add additional bindings to CNTK, such as Python, R, Java, etc.
R lacks good integration with a comprehensive deep-learning framework (the only natively supported one is MXNet, and it is hard to call its R interface mature). Adding native support could be really useful.
So what will be the reason to use the CNTK R interface if a similar one exists for TensorFlow (and even for CNTK via Keras)?
This work is done by a team outside CNTK, and the choice of reticulate is to ensure we can have an R binding as soon as possible. Reticulate seems to cause a ~5% perf drop relative to the CNTK Python API, and should still be much faster than TensorFlow's R binding.
Hi @cha-zhang,
By the same token, have you run a similar performance test for the C# API? Thanks.
If you use the minibatch source, the speed of C# and Python is the same. If you feed data yourself, we are seeing a ~30% slowdown for C#. We are investigating the issue.
I'm closing this issue since v2.2 was shipped on Sep. 15. We will post a new iteration plan soon.
This plan captures our work from early August to mid-September. We will ship around September 15th. Major work items of this iteration include Volta 16-bit support and the C#/.NET API. There are also numerous other improvements, as detailed below.
Endgame
Planned items
We plan to ship these items at the end of this iteration.
Legend of annotations:
Documentation
System
✋ 16bit support for training on Volta GPU (limited functionality)
Examples
Operations
Performance
Keras and Tensorboard
Others