deepjavalibrary / djl

An Engine-Agnostic Deep Learning Framework in Java
https://djl.ai
Apache License 2.0

Feature Request: Add Mish Activation #16

Closed: digantamisra98 closed this issue 4 years ago

digantamisra98 commented 4 years ago

Mish is a novel activation function proposed in this paper. It has shown promising results so far and has been adopted in several packages including:

- TensorFlow-Addons
- SpaCy (Tok2Vec Layer)
- Thinc - SpaCy's official NLP based ML library
- Eclipse's deeplearning4j
- Hasktorch
- Echo AI
- CNTKX - Extension of Microsoft's CNTK
- FastAI-Dev
- Darknet
- Yolov3
- BeeDNN - Library in C++
- Gen-EfficientNet-PyTorch
- dnet
- ruby-dnn
- blackcat-tensors
- DL4S
- HuggingFace Transformers
- PAGI
- OpenCV
- Odin-AI
- Mini DNN
- Efficient Segmentation Networks

All benchmarks, analysis, and links to official package implementations can be found in this repository.

Mish was also recently used in a submission to the Stanford DAWN CIFAR-10 Training Time Benchmark, where it reached 94% accuracy in just 10.7 seconds. That is currently the best score on 4 GPUs and the second fastest overall. Additionally, Mish has been shown to improve the convergence rate by requiring fewer epochs. Reference:

[image]

Mish has also shown consistently improved ImageNet scores and is more robust. Reference:

[image]

Additional ImageNet benchmarks, along with network architectures and weights, are available in my repository.

Summary of vision-related results:

[image]

It would be nice to have Mish as an option within the activation function group.
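For reference, Mish is defined as mish(x) = x * tanh(softplus(x)), where softplus(x) = ln(1 + e^x). Below is a minimal plain-Java sketch of that formula, assuming scalar and float-array inputs. The class name `MishSketch` and its method names are hypothetical, chosen only for illustration, and this is not DJL's implementation or API.

```java
// Hypothetical standalone sketch of the Mish formula; not part of DJL.
final class MishSketch {

    // Numerically stable softplus: ln(1 + e^x) = max(x, 0) + ln(1 + e^(-|x|)).
    // This form avoids overflow in exp() for large positive x.
    static double softplus(double x) {
        return Math.max(x, 0.0) + Math.log1p(Math.exp(-Math.abs(x)));
    }

    // mish(x) = x * tanh(softplus(x))
    static double mish(double x) {
        return x * Math.tanh(softplus(x));
    }

    // Element-wise Mish over a float array; returns a new array
    // and leaves the input unchanged.
    static float[] mish(float[] input) {
        float[] output = new float[input.length];
        for (int i = 0; i < input.length; i++) {
            output[i] = (float) mish((double) input[i]);
        }
        return output;
    }

    public static void main(String[] args) {
        for (double x : new double[] {-2.0, -1.0, 0.0, 1.0, 2.0}) {
            System.out.printf("mish(%.1f) = %.6f%n", x, mish(x));
        }
    }
}
```

Unlike ReLU, the function is smooth and lets small negative values through, while behaving almost like the identity for large positive inputs.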

This is a comparison of Mish with other conventional activation functions in an SEResNet-50 on CIFAR-10:

[image]

keerthanvasist commented 4 years ago

Thank you for the feature request @digantamisra98. We will implement the Mish activation function soon.

digantamisra98 commented 4 years ago

@keerthanvasist Thank you for considering the request.

digantamisra98 commented 4 years ago

I noticed that Mish was recently committed to DJL: https://github.com/awslabs/djl/commit/b06feb110e5100db2f839c80a8aca03ba0ecbdcf. Closing this issue, then. Thanks for adding Mish.