AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

Why does yolov4-tiny use leaky instead of mish? #6178

Open 1027663760 opened 4 years ago

1027663760 commented 4 years ago

@AlexeyAB

WongKinYiu commented 4 years ago

yolov4-tiny is developed for both CPU and GPU; the exponential and log functions in Mish are not friendly for CPU inference.
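For reference, the two activations under discussion can be sketched in scalar Python; the `exp`/`log` inside softplus is the part that is expensive on CPU (a minimal illustration, not darknet's actual C implementation):

```python
import math

def leaky_relu(x, slope=0.1):
    # darknet's leaky activation uses a fixed 0.1 negative slope
    return x if x > 0 else slope * x

def mish(x):
    # mish(x) = x * tanh(softplus(x)), where softplus(x) = ln(1 + e^x)
    # the exp/log/tanh calls are what make this costly without a GPU
    return x * math.tanh(math.log1p(math.exp(x)))

print(leaky_relu(-1.0))  # -0.1
print(mish(0.0))         # 0.0
```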

sealedtx commented 4 years ago

Can someone share pre-trained weights for yolov4-tiny with mish?

PallHaraldsson commented 4 years ago

Can someone clarify this for me, as I'm new to this? I understand Mish is one of the best activations now, but since it's slow, it would also be slow (even more so) for training on GPU (or CPU).

Weights pre-trained with Mish would not work for inference with Leaky [ReLU, I assume]. My understanding is that you could substitute one activation function for another (to some degree), but training and inference would need to match. Or maybe not...

I'm looking into better activation functions and making my own. Would you be interested in a better Mish, i.e. an approximation (or a similar function)? Could you train with a more accurate version and do inference with an approximation?

I'm not familiar enough with YOLO (v4, v5, or any). Do the non-tiny variants use Mish or something else, with only the tiny variant using Leaky ReLU?

My (unoptimized) Mish implementation here: https://github.com/sylvaticus/BetaML.jl/pull/6#issuecomment-645529594

was 14.8x slower than (regular) ReLU. My PLU was, however, just as fast as ReLU (I only timed on CPU), so maybe it's a candidate?
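The gap is easy to reproduce with a crude microbenchmark (a sketch only; the exact ratio depends on hardware, language, and vectorization, so don't expect the 14.8x figure above):

```python
import timeit
import numpy as np

x = np.random.randn(1_000_000).astype(np.float32)

def relu(v):
    # a single elementwise max
    return np.maximum(v, 0.0)

def mish(v):
    # x * tanh(softplus(x)); exp, log1p, and tanh dominate the cost
    return v * np.tanh(np.log1p(np.exp(v)))

t_relu = timeit.timeit(lambda: relu(x), number=20)
t_mish = timeit.timeit(lambda: mish(x), number=20)
print(f"mish / relu time ratio: {t_mish / t_relu:.1f}x")
```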

hfassold commented 4 years ago

the "hard-mish" function (https://forums.fast.ai/t/hard-mish-activation-function/59238) is a runtime-efficient approximation of mish