Why did you drop the official Mish implementation and replace it with a version that uses a threshold?
Also, the official code mainly uses hardswish as its activation.
Does your thresholded Mish work better than hardswish?
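To make the comparison concrete, here is a minimal sketch of the three activations I am asking about. It assumes the threshold is applied to the softplus term (as with the `threshold` argument of PyTorch's `softplus`, default 20); the function names and the value 20.0 are my own assumptions, not taken from this repository.

```python
import math

def mish(x):
    # Reference definition: x * tanh(softplus(x)).
    # math.exp overflows for very large x, which a threshold avoids.
    return x * math.tanh(math.log1p(math.exp(x)))

def mish_thresholded(x, threshold=20.0):
    # Assumed variant: above the threshold softplus(x) is approximately x,
    # so exp() is skipped entirely.
    sp = x if x > threshold else math.log1p(math.exp(x))
    return x * math.tanh(sp)

def hardswish(x):
    # x * ReLU6(x + 3) / 6, a piecewise approximation of swish.
    return x * min(max(x + 3.0, 0.0), 6.0) / 6.0

if __name__ == "__main__":
    for v in (-3.0, 0.0, 1.0, 3.0):
        print(v, mish(v), mish_thresholded(v), hardswish(v))
```

Below the threshold the two Mish versions return the same value, so if the change is only for numerical stability I would expect no accuracy difference; that is why I am asking whether there is another reason for it.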
As for the concatenation:
In the YOLOv5 code there are three routes, but they are not processed by CSP, unlike in the PyTorch version.