Contributing to CNTK through community effort #3595

Open delzac opened 5 years ago

delzac commented 5 years ago

Hi all,

With the change in release cadence on the mainline, I thought it made sense to start a repo where the community can make contributions easily, as I continue to see great potential in CNTK despite its lack of popularity.

To start the ball rolling, I have open sourced all the work I have done on CNTK into a library so that it's easily reusable. It contains many convenient and popular functions, layers and models that are not currently available in the mainline. Hopefully, with more people contributing to it, we can have something like what fastai is to pytorch.

Everything in the library is written in the pure CNTK Python API, so there should be no compatibility issues. Hopefully this library will make your life a bit less painful when building models. :)
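As a taste of how it fits together, here is a minimal sketch (assuming cntkx is pip-installed next to cntk; the `Cx` alias and exact op signatures are assumptions to check against the cntkx docs):

```python
import cntk as C
import cntkx as Cx  # assumes cntkx is installed alongside cntk

# an ordinary CNTK input: a sequence of 10-dimensional vectors
a = C.sequence.input_variable(10)

# cntkx ops return plain cntk Functions, so they compose with
# built-in ops like any other node in the graph
b = Cx.swish(a)  # swish activation, from the ops table below
```
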

Below is the laundry list of components available in cntkx.

| ops | Description |
| --- | --- |
| scalar | Cast tensor to scalar of shape (1,) |
| cumsum | Cumulative summation along axis |
| upsample | Upsample by 2x (for images) |
| centre_crop | Crop centre of image |
| swish | Activation |
| mish | Activation |
| hardmax | Activation |
| erf | Error function |
| gelu | Gaussian Error Linear Unit function |
| gelu_fast | Fast approximation of the Gaussian Error Linear Unit function |
| sequence.pad | Pad at start or end of sequence axis |
| sequence.length | Length of sequence |
| sequence.position | Position of every sequence element |
| sequence.stride | Strides across the sequential axis |
| sequence.join | Joins two sequences along their sequential axis |
| sequence.window | Creates non-overlapping windows along the sequence axis |
| sequence.reverse | Reverses the items along the dynamic sequence axis |
| sequence.reduce_mean | Calculates the mean along the dynamic sequence axis |
| random.sample | Samples an unnormalised log probability distribution |
| random.sample_top_k | Samples from the top_k of an unnormalised log probability distribution |
| batchmatmul | Batch matrix multiplication on a static batch axis, similar to tf.matmul |

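For instance, the sequence ops above would be used like this (a sketch; the single-argument signatures are assumptions to verify against the cntkx docs):

```python
import cntk as C
import cntkx as Cx

s = C.sequence.input_variable(5)  # a sequence of 5-dimensional vectors

n = Cx.sequence.length(s)        # number of items on the dynamic sequence axis
m = Cx.sequence.reduce_mean(s)   # per-dimension mean over the sequence axis
r = Cx.sequence.reverse(s)       # same items, reversed along the sequence axis
```
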
| Layers | Description |
| --- | --- |
| QRNN | Quasi-Recurrent Neural Network |
| Recurrence | With option to apply VariationalDropout |
| PyramidalBiRecurrence | Pyramidal bi-directional recurrence |
| VariationalDropout | Single binary dropout mask for an entire sequence |
| SinusoidalPositionalEmbedding | Non-learnable positional embedding (no max sequence length) |
| PositionalEmbedding | Learnable positional embedding (used in BERT) |
| BertEmbeddings | BERT embeddings (word + token_type + positional) |
| BertPooler | Pooler used in BERT |
| SpatialPyramidPooling | Fixed pooled representation regardless of image input size |
| GatedLinearUnit | Gated Convolutional Neural Network |
| ScaledDotProductAttention | Attention used in BERT and Transformer (aka 'Attention Is All You Need') |
| MultiHeadAttention | Attention used in BERT and Transformer (aka 'Attention Is All You Need') |
| GaussianWindowAttention | Windowed attention, instead of conventional attention where everything is attended to at once |
| SequentialMaxPooling | Max pool across the sequential axis and static axes |
| SequentialAveragePooling | Average pool across the sequential axis and static axes |
| vFSMN | Vectorised Feedforward Sequential Memory Network |
| cFSMN | Compact Feedforward Sequential Memory Network |
| BiRecurrence | Bi-directional recurrent layer with a weight-tying option to halve the parameter requirement |

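As an example of the layers, a sketch of a small QRNN-based sequence classifier (the `hidden_dim` argument name is an assumption; check the cntkx docstrings for the real signature):

```python
import cntk as C
from cntkx.layers import QRNN

# token sequence -> embedding -> QRNN -> last hidden state -> class scores
model = C.layers.Sequential([
    C.layers.Embedding(100),
    QRNN(hidden_dim=50),   # hidden_dim is assumed; see cntkx docs
    C.sequence.last,
    C.layers.Dense(10),
])

x = C.sequence.input_variable(1000, is_sparse=True)  # one-hot tokens
z = model(x)
```
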
| Blocks | Description |
| --- | --- |
| WeightDroppedLSTM | A form of regularised LSTM |
| IndyLSTM | A parameter-efficient form of LSTM |
| IndRNN | An RNN with long memory that can be stacked deeply |

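Since these are recurrent step functions, presumably they get wrapped in a Recurrence just like cntk's own LSTM; a sketch, assuming the block takes its output shape as the first argument the way C.layers.LSTM does:

```python
import cntk as C
from cntkx.layers import IndyLSTM  # import path and signature assumed; see cntkx docs

s = C.sequence.input_variable(20)
h = C.layers.Recurrence(IndyLSTM(128))(s)  # unroll the step function over the sequence
```
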
| Loss | Description |
| --- | --- |
| gaussian_mdn_loss | Loss function for use with a mixture density network |
| focal_loss_with_softmax | A kind of cross entropy that handles extreme class imbalance |
| cross_entropy_with_softmax | Cross entropy with softmax, with added label smoothing regularisation |

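Presumably these follow the (prediction, target) convention of cntk's built-in criteria such as C.cross_entropy_with_softmax; a sketch on that assumption:

```python
import cntk as C
import cntkx as Cx

z = C.input_variable(10)  # raw class scores from some model
y = C.input_variable(10)  # one-hot labels

# focal loss down-weights easy, well-classified examples, which is
# what makes it useful under extreme class imbalance
loss = Cx.focal_loss_with_softmax(z, y)  # signature assumed; see cntkx docs
```
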
| Models | Description |
| --- | --- |
| VGG | Image classification |
| UNET | Semantic segmentation |
| Transformer | Language modelling |
| MDN | Mixture Density Network |

| Pre-trained models | Description |
| --- | --- |
| Bert | Bidirectional Encoder Representations from Transformers |
| fwd_wt103.hdf5 | The weight parameters of fastai's pytorch model; used to initialise PretrainedWikitext103LanguageModel |
| fwd_wt103.cntk | The converted CNTK model of fastai's pytorch model; to be used with C.load_model |
| fwd_wt103.onnx | The converted ONNX model of fastai's pytorch model |

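Loading the converted checkpoint is then just the standard cntk call mentioned in the table:

```python
import cntk as C

# fwd_wt103.cntk: the converted fastai Wikitext-103 language model from above
lm = C.load_model('fwd_wt103.cntk')
```
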
| Learners | Description |
| --- | --- |
| CyclicalLearningRate | A method that removes the need to search for the best learning-rate value and schedule |
| RAdam | A variant of Adam that doesn't require any warmup |

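If RAdam follows the calling convention of cntk's built-in learners (an assumption; C.adam takes (parameters, lr, momentum)), hooking it into a Trainer would look roughly like this:

```python
import cntk as C
from cntkx.learners import RAdam  # import path and signature assumed; see cntkx docs

x = C.input_variable(10)
y = C.input_variable(2)
z = C.layers.Dense(2)(x)
loss = C.cross_entropy_with_softmax(z, y)

# assumed to mirror C.adam(parameters, lr, momentum), with no warmup needed
learner = RAdam(z.parameters, lr=0.001, momentum=0.9)
trainer = C.Trainer(z, (loss, None), [learner])
```
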
| Misc | Description |
| --- | --- |
| CTCEncoder | Helper class to convert data into a format accepted by CNTK's CTC implementation |

arijit17 commented 5 years ago

> continue to see great potential in CNTK despite its lack of popularity

Nice 👍 Hopefully the CNTK team continues working on CNTK and brings in new features as well.

xgirones commented 5 years ago

"With the change in cadence in the mainline"

Is CNTK being less actively developed?

AllanYiin commented 5 years ago

Amazing work! I still believe CNTK is a framework with potential (and also my favourite one). I also have some extensions and examples in CNTK, and I will try my best to share them with the community.

My existing deep-learning training showcases are on GitHub; all (or most) of them have CNTK, PyTorch and TensorFlow+Keras versions.

https://github.com/AllanYiin/DeepBelief_Course4_Examples?files=1