AtheMathmo / rusty-machine

Machine Learning library for Rust
https://crates.io/crates/rusty-machine/
MIT License

Feature Request: Add Mish activation #212

Closed digantamisra98 closed 3 years ago

digantamisra98 commented 4 years ago

Mish is a novel activation function proposed in this paper. It has shown promising results so far and has been adopted in several packages including:

TensorFlow-Addons | SpaCy (Tok2Vec Layer) | Thinc - SpaCy's official NLP based ML library
Eclipse's deeplearning4j | Hasktorch | Echo AI
CNTKX - Extension of Microsoft's CNTK | FastAI-Dev | Darknet
Yolov3 | BeeDNN - Library in C++ | Gen-EfficientNet-PyTorch
dnet | ruby-dnn | blackcat-tensors
DL4S | HuggingFace Transformers | PAGI
OpenCV | Odin-AI | Mini DNN
Efficient Segmentation Networks | TF Semantic Segmentation | Dynastes
DLib | Copernicus | AllenNLP
PyWick

All benchmarks, analysis, and links to official package implementations can be found in this repository.

Mish was also recently used in a submission to the Stanford DAWN CIFAR-10 Training Time Benchmark, where it reached 94% accuracy in just 10.7 seconds, currently the best score on 4 GPUs and the second fastest overall. Additionally, Mish has been shown to improve the convergence rate by requiring fewer epochs. Reference:

[image: 0 (2)]

Mish has also shown consistently improved ImageNet scores and is more robust. Reference:

[image: 0]

Additional ImageNet benchmarks, along with network architectures and weights, are available in my repository.

Summary of vision-related results:

[image: Capture]

It would be nice to have Mish as an option within the activation function group.
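For reference, Mish is defined as f(x) = x * tanh(softplus(x)), where softplus(x) = ln(1 + e^x). Below is a minimal Rust sketch of the function and its derivative. The function names (`softplus`, `sigmoid`, `mish`, `mish_grad`) are hypothetical and chosen for illustration; the sketch is not wired into rusty-machine's activation function trait, so adapting it to the library's actual interface is left as an assumption.

```rust
/// Numerically stable softplus: ln(1 + e^x).
/// (Hypothetical helper, not part of rusty-machine.)
fn softplus(x: f64) -> f64 {
    // max(x, 0) + ln(1 + e^{-|x|}) avoids overflow for large positive x.
    x.max(0.0) + (-x.abs()).exp().ln_1p()
}

/// Logistic sigmoid, which is the derivative of softplus.
fn sigmoid(x: f64) -> f64 {
    1.0 / (1.0 + (-x).exp())
}

/// Mish activation: x * tanh(softplus(x)).
pub fn mish(x: f64) -> f64 {
    x * softplus(x).tanh()
}

/// Derivative of Mish with respect to x.
pub fn mish_grad(x: f64) -> f64 {
    let t = softplus(x).tanh();
    // Product rule: tanh(sp(x)) + x * (1 - tanh^2(sp(x))) * sigmoid(x)
    t + x * (1.0 - t * t) * sigmoid(x)
}
```

Unlike the sigmoid, Mish is not monotonic, so its gradient probably cannot be recovered from the output value alone; an implementation that only exposes a gradient-from-output hook would need the pre-activation input as well.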

This is a comparison of Mish with other conventional activation functions in a SEResNet-50 on CIFAR-10:

[image: se50_1]