This PR increases the spatial dimension of the C32 and D32 ports in GeMMX from 1 to 2.
This adds an extra stride to the spatial address generation for C32 and D32, which can now jump at both 2-element and row granularity. The extra stride allows more data layouts, which helps reduce bank conflicts.
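To illustrate the idea, here is a minimal Python sketch of spatial address generation with one versus two strides. It is not the GeMMX hardware implementation: the function name `spatial_addresses`, the bounds, and the stride values are all hypothetical and chosen only for illustration.

```python
from itertools import product


def spatial_addresses(base, bounds, strides):
    """Return base + sum(index_k * stride_k) for every spatial index tuple.

    bounds[0] / strides[0] describe the innermost (fastest varying) dimension.
    All names and values here are illustrative assumptions, not GeMMX CSRs.
    """
    if len(bounds) != len(strides):
        raise ValueError("bounds and strides must have the same length")
    addrs = []
    # itertools.product varies its last argument fastest, so the dimensions
    # are reversed to keep dimension 0 as the innermost loop.
    for idx in product(*(range(b) for b in reversed(bounds))):
        offset = sum(i * s for i, s in zip(idx, reversed(strides)))
        addrs.append(base + offset)
    return addrs


# One spatial dimension: a single stride, so all addresses are evenly spaced.
one_dim = spatial_addresses(0, bounds=(4,), strides=(8,))

# Two spatial dimensions: a second stride lets the pattern jump between groups,
# so the same four accesses can be spread over a different set of addresses.
two_dim = spatial_addresses(0, bounds=(2, 2), strides=(8, 64))

print(one_dim)  # [0, 8, 16, 24]
print(two_dim)  # [0, 8, 64, 72]
```

With a single stride the four accesses can only be evenly spaced, while the second stride decouples the spacing inside a group from the spacing between groups, which is what opens up additional layouts.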
@jorendumoulin @JosseVanDelm, please have a look. Note that there are now two extra spatial stride CSRs. Let me know if anything is unclear.