
Contributing to CNTK through community effort #3595

Open · delzac opened 5 years ago

delzac commented 5 years ago

Hi all,

With the change in cadence in the mainline, I thought it made sense to start a repo where the community can contribute easily, as I continue to see great potential in CNTK despite its lack of popularity.

To get the ball rolling, I have open-sourced all the work I have done on CNTK as a library, so that it's easily reusable. It contains many convenient and popular functions, layers, and models that are not currently available in the mainline. Hopefully, with more people contributing, we can have something like what fastai is to PyTorch.

Everything in the library is written in the pure CNTK Python API, so there should be no compatibility issues. Hopefully this library will make your life a bit less painful when building models. :)
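To give a feel for how it slots in, here is a minimal sketch (assuming cntkx is installed, imported as `Cx`, and that ops such as `swish` from the list below are exposed at the top level):

```python
import cntk as C
import cntkx as Cx  # assumed import alias

x = C.input_variable(10)
h = C.layers.Dense(32)(x)
y = Cx.swish(h)  # a cntkx op slotted in alongside the mainline API
```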

Below is the laundry list of components available in cntkx.

| ops | Description |
| --- | --- |
| scalar | Cast tensor to scalar of shape (1,) |
| cumsum | Cumulative summation along axis |
| upsample | Upsample by 2x (for images) |
| centre_crop | Crop centre of image |
| swish | Activation |
| mish | Activation |
| hardmax | Activation |
| erf | Error function |
| gelu | Gaussian Error Linear Unit function |
| gelu_fast | Fast approximation of the Gaussian Error Linear Unit function |
| sequence.pad | Pad at start or end of sequence axis |
| sequence.length | Length of sequence |
| sequence.position | Position of every sequence element |
| sequence.stride | Strides across the sequential axis |
| sequence.join | Joins two sequences along their sequential axis |
| sequence.window | Creates non-overlapping windows along the sequence axis |
| sequence.reverse | Reverses the items along the dynamic sequence axis |
| sequence.reduce_mean | Calculates the mean along the dynamic sequence axis |
| random.sample | Samples from an unnormalised log probability distribution |
| random.sample_top_k | Samples from the top_k of an unnormalised log probability distribution |
| batchmatmul | Batch matrix multiplication on a static batch axis, similar to tf.matmul |
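A quick sketch of the sequence ops above in use (the `Cx.sequence.*` module layout is an assumption from the names in the table):

```python
import cntk as C
import cntkx as Cx  # assumed import alias

s = C.sequence.input_variable(8)   # variable-length sequences of 8-dim vectors
n = Cx.sequence.length(s)          # per-sequence length, as listed above
m = Cx.sequence.reduce_mean(s)     # mean along the dynamic sequence axis
r = Cx.sequence.reverse(s)         # items reversed along the sequence axis
```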
| Layers | Description |
| --- | --- |
| QRNN | Quasi-Recurrent Neural Network |
| Recurrence | With option to apply VariationalDropout |
| PyramidalBiRecurrence | Pyramidal bi-directional recurrence |
| VariationalDropout | Single binary dropout mask for entire sequence |
| SinusoidalPositionalEmbedding | Non-learnable positional embedding (no max sequence length) |
| PositionalEmbedding | Learnable positional embedding (used in BERT) |
| BertEmbeddings | BERT embeddings (word + token_type + positional) |
| BertPooler | Pooler used in BERT |
| SpatialPyramidPooling | Fixed pooled representation regardless of image input size |
| GatedLinearUnit | Gated convolutional neural network |
| ScaledDotProductAttention | Attention used in BERT and the Transformer (aka "Attention Is All You Need") |
| MultiHeadAttention | Attention used in BERT and the Transformer (aka "Attention Is All You Need") |
| GaussianWindowAttention | Windowed attention, instead of conventional attention where everything is attended to at the same time |
| SequentialMaxPooling | Max pool across the sequential axis and static axes |
| SequentialAveragePooling | Average pool across the sequential axis and static axes |
| vFSMN | Vectorised Feedforward Sequential Memory Network |
| cFSMN | Compact Feedforward Sequential Memory Network |
| BiRecurrence | Bi-directional recurrent layer with a weight-tying option to halve the parameter count |
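These layers should compose like mainline `C.layers` blocks; a sketch, assuming they live under `cntkx.layers` and that `QRNN` takes a `hidden_dim` argument:

```python
import cntk as C
import cntkx as Cx  # assumed import alias

tokens = C.sequence.input_variable(100)  # e.g. one-hot tokens over a 100-word vocab
model = C.layers.Sequential([
    C.layers.Embedding(300),
    Cx.layers.QRNN(hidden_dim=128),      # assumed constructor signature
    C.layers.Dense(5),
])
z = model(tokens)
```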
| Blocks | Description |
| --- | --- |
| WeightDroppedLSTM | A form of regularised LSTM |
| IndyLSTM | A parameter-efficient form of LSTM |
| IndRNN | An RNN with long memory that can be stacked deeply |
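Since these are recurrent step functions, presumably they wrap in `C.layers.Recurrence` the same way the mainline `C.layers.LSTM` does; the exposure under `cntkx.layers` and the `IndyLSTM(128)` signature below are assumptions:

```python
import cntk as C
import cntkx as Cx  # assumed import alias

s = C.sequence.input_variable(64)
# IndyLSTM as the step function of a mainline Recurrence
h = C.layers.Recurrence(Cx.layers.IndyLSTM(128))(s)
```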
| Loss | Description |
| --- | --- |
| gaussian_mdn_loss | Loss function for use with a mixture density network |
| focal_loss_with_softmax | A kind of cross entropy that handles extreme class imbalance |
| cross_entropy_with_softmax | Cross entropy with softmax, with added label smoothing regularisation |
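The losses look like drop-in replacements for mainline criterion functions; a sketch, assuming they are exposed at the top level of cntkx:

```python
import cntk as C
import cntkx as Cx  # assumed import alias

z = C.input_variable(10)  # logits from a model
y = C.input_variable(10)  # one-hot labels
# used where you would otherwise call C.cross_entropy_with_softmax(z, y)
loss = Cx.focal_loss_with_softmax(z, y)
```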
| Models | Description |
| --- | --- |
| VGG | Image classification |
| UNET | Semantic segmentation |
| Transformer | Language modelling |
| MDN | Mixture Density Network |
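A sketch of how a packaged model might be instantiated; the `Cx.models.VGG` name and its arguments are hypothetical, the point is only the compose-and-apply pattern:

```python
import cntk as C
import cntkx as Cx  # assumed import alias

img = C.input_variable((3, 224, 224))
net = Cx.models.VGG(num_classes=10)  # hypothetical constructor signature
z = net(img)
```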
| Pre-trained models | Description |
| --- | --- |
| Bert | Bidirectional Encoder Representations from Transformers |
| fwd_wt103.hdf5 | The weight parameters of fastai's PyTorch model. To be used to initialise PretrainedWikitext103LanguageModel |
| fwd_wt103.cntk | The converted CNTK model of fastai's PyTorch model. To be used with C.load_model |
| fwd_wt103.onnx | The converted ONNX model of fastai's PyTorch model |
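Since fwd_wt103.cntk is a regular CNTK model file, loading it needs only the mainline API, as the table says:

```python
import cntk as C

# path is wherever the downloaded fwd_wt103.cntk lives
lm = C.load_model('fwd_wt103.cntk')
print(lm)  # inspect the loaded function's arguments and outputs
```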
| Learners | Description |
| --- | --- |
| CyclicalLearningRate | A method that removes the need to search for the best learning rate value and schedule |
| RAdam | A variant of Adam that doesn't require any warmup |
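The learners should plug into a standard `C.Trainer`; a sketch, assuming they live under `cntkx.learners` and that `RAdam` mirrors the mainline learner signature:

```python
import cntk as C
import cntkx as Cx  # assumed import alias

x = C.input_variable(4)
y = C.input_variable(2)
z = C.layers.Dense(2)(x)
loss = C.cross_entropy_with_softmax(z, y)
metric = C.classification_error(z, y)
learner = Cx.learners.RAdam(z.parameters, lr=1e-3)  # assumed signature
trainer = C.Trainer(z, (loss, metric), [learner])
```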
| Misc | Description |
| --- | --- |
| CTCEncoder | Helper class to convert data into a format acceptable for CNTK's CTC implementation |
arijit17 commented 5 years ago

“continue to see great potential in CNTK despite its lack of popularity.”

Nice 👍 Hopefully the CNTK team continues working on CNTK and brings in new features as well.

xgirones commented 5 years ago

"With the change in cadence in the mainline"

Is CNTK being less actively developed?

AllanYiin commented 5 years ago

Amazing work! I still believe CNTK is a promising framework (and also my favorite one). I also have some extensions and examples in CNTK, and I will try my best to share them with the community.

My existing deep learning showcases for training are on GitHub; all (or at least most) of the showcases have CNTK, PyTorch, and TensorFlow+Keras versions.

https://github.com/AllanYiin/DeepBelief_Course4_Examples?files=1