ml-explore / mlx

MLX: An array framework for Apple silicon

https://ml-explore.github.io/mlx/

MIT License

Change winograd dispatch condition #1534

Closed: awni closed this 4 weeks ago

awni commented 1 month ago

The new dispatch condition is much faster for small batch sizes across the board, with a minor speedup for larger batch sizes.

ResNet-18 benchmark on M2 Ultra:

Batch Size	Pre (ms per image)	Post (ms per image)
1	6.979	1.833
2	3.214	0.947
4	1.468	0.765
8	0.897	0.590
16	0.612	0.526
32	0.500	0.475
64	0.463	0.437
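
For anyone reproducing numbers like the ones above, here is a minimal sketch of a milliseconds-per-image timing loop. It is not the script used for this PR. The names `ms_per_image`, `run_batch`, and `fake_forward` are hypothetical, and the workload is a plain-Python stand-in rather than ResNet-18. With MLX, the callable would also need to force evaluation (for example with `mx.eval`) because computation is lazy; otherwise the timing would measure graph construction only.

```python
import time


def ms_per_image(run_batch, batch_size, warmup=3, iters=10):
    """Average wall-clock milliseconds per image for run_batch(batch_size).

    run_batch is a hypothetical callable that runs one forward pass on a
    batch of the given size and blocks until the result is computed.
    """
    # Warm-up runs are excluded from timing (kernel compilation, caches).
    for _ in range(warmup):
        run_batch(batch_size)

    start = time.perf_counter()
    for _ in range(iters):
        run_batch(batch_size)
    elapsed_s = time.perf_counter() - start

    # Total images processed in the timed region is iters * batch_size.
    return elapsed_s * 1000.0 / (iters * batch_size)


def fake_forward(batch_size):
    # Stand-in workload (assumption): cost grows linearly with batch size.
    return sum(range(1000 * batch_size))


for bs in (1, 2, 4, 8, 16, 32, 64):
    print(f"batch {bs:>2}: {ms_per_image(fake_forward, bs):.4f} ms/image")
```

Dividing by `iters * batch_size` rather than by `iters` alone is what makes rows with different batch sizes comparable.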

Same benchmark on M1 Max:

Batch Size	Pre (ms per image)	Post (ms per image)
1	19.664	2.644
2	15.996	1.783
4	6.358	1.578
8	4.592	1.409
16	2.927	1.314
32	2.101	1.247
64	1.699	1.649

Same benchmark on M3 Max:

Batch Size	Pre (ms per image)	Post (ms per image)
1	4.046	1.438
2	2.425	1.003
4	1.441	0.839
8	1.018	0.751
16	0.807	0.716
32	0.734	0.699
64	0.704	0.674
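
To make the "much better for small batch sizes" claim easier to read off the tables, the following sketch computes the pre/post speedup ratio for every row. The numbers are copied from the three tables above. The `results` dictionary and the `speedup` helper are names introduced here for illustration only.

```python
# (pre, post) milliseconds per image, keyed by chip and batch size,
# copied from the benchmark tables in this PR description.
results = {
    "M2 Ultra": {
        1: (6.979, 1.833), 2: (3.214, 0.947), 4: (1.468, 0.765),
        8: (0.897, 0.590), 16: (0.612, 0.526), 32: (0.500, 0.475),
        64: (0.463, 0.437),
    },
    "M1 Max": {
        1: (19.664, 2.644), 2: (15.996, 1.783), 4: (6.358, 1.578),
        8: (4.592, 1.409), 16: (2.927, 1.314), 32: (2.101, 1.247),
        64: (1.699, 1.649),
    },
    "M3 Max": {
        1: (4.046, 1.438), 2: (2.425, 1.003), 4: (1.441, 0.839),
        8: (1.018, 0.751), 16: (0.807, 0.716), 32: (0.734, 0.699),
        64: (0.704, 0.674),
    },
}


def speedup(pre_ms, post_ms):
    """Ratio of old to new time per image; values above 1 mean faster."""
    return pre_ms / post_ms


for chip, rows in results.items():
    print(chip)
    for batch_size, (pre_ms, post_ms) in rows.items():
        print(f"  batch {batch_size:>2}: {speedup(pre_ms, post_ms):.2f}x")
```

On all three chips the ratio is largest at batch size 1 and approaches 1 as the batch size grows to 64.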