amatus closed this 11 years ago
Thanks, pulled.
For this particular small 32-bit transfer, I remember checking the Cell Broadband Engine Programming Handbook, which says the following about alignment:
For transfer sizes less than 16 bytes, the MFC Effective Address Low must be naturally aligned (bits 28 through 31 must provide natural alignment based on the transfer size). For transfer sizes of 16 bytes or greater, the MFC Effective Address Low must be aligned to at least a 16-byte boundary (bits 28 through 31 must be ‘0’).
but unfortunately I did not pay attention to:
If the LSA is unaligned, MFC command queue processing is suspended, and an MFC DMA alignment interrupt is raised to the PPE. To be considered aligned, the four least significant bits of the LS address must match the least-significant four bits of the effective address (MFC Effective Address Low or List Address Channel (see page 462)).
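To make the two quoted rules concrete, here is a minimal sketch in plain C that checks them on ordinary integers. The helper names (ea_is_aligned, ls_matches_ea, dma_is_aligned) are hypothetical and not part of any Cell SDK. Rejecting sizes other than 1, 2, 4, 8 or a multiple of 16 is my own assumption about valid transfer sizes and goes beyond the passages above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Rule 1 (effective address): natural alignment for transfers smaller
   than 16 bytes, 16-byte alignment for everything else.
   Assumption: only sizes 1, 2, 4, 8 or a multiple of 16 are valid. */
bool ea_is_aligned(uint64_t ea, uint32_t size)
{
    if (size == 1 || size == 2 || size == 4 || size == 8)
        return (ea % size) == 0;
    if (size != 0 && size % 16 == 0)
        return (ea % 16) == 0;
    return false;
}

/* Rule 2 (local store address): the four least significant bits of the
   LS address must equal the four least significant bits of the EA. */
bool ls_matches_ea(uint32_t lsa, uint64_t ea)
{
    return (lsa & 0xFu) == (uint32_t)(ea & 0xFu);
}

/* A transfer is considered aligned only if both rules hold. */
bool dma_is_aligned(uint32_t lsa, uint64_t ea, uint32_t size)
{
    return ea_is_aligned(ea, size) && ls_matches_ea(lsa, ea);
}
```

With these helpers, a 4-byte transfer from effective address 0x20004 into a local store address ending in 0x0 passes the first rule but fails the second, which is the case I overlooked.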
Since the "int work_restart" variable still occupies a whole 128-bit vector on the SPU and is therefore 16-byte aligned there, the allocated array must also be 16-byte aligned on the PPU side. The padding to 128 bytes makes sure that each array entry resides in a different cache line.
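On the PPU side, the layout could be sketched roughly as follows. This is only an illustration and not the actual patch: the struct restart_entry type and the alloc_restart_entries helper are hypothetical names, and I am assuming posix_memalign is available on the host.

```c
#define _POSIX_C_SOURCE 200112L
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE 128 /* cache line size assumed for the padding */

/* Hypothetical per-thread entry: the flag is padded so that every
   entry occupies exactly one 128-byte cache line. */
struct restart_entry {
    volatile int restart;
    char pad[CACHE_LINE - sizeof(int)];
};

/* Allocate n entries starting on a 128-byte boundary, so that every
   entry address also has its four low bits clear (16-byte aligned).
   Returns NULL on failure; release the memory with free(). */
struct restart_entry *alloc_restart_entries(size_t n)
{
    void *p = NULL;
    if (posix_memalign(&p, CACHE_LINE, n * sizeof(struct restart_entry)) != 0)
        return NULL;
    return p;
}
```

Because each entry is exactly 128 bytes and the array starts on a 128-byte boundary, every entry address ends in four zero bits, which matches a 16-byte-aligned variable on the SPU.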
This fixes a bug on my system where mfc_get was failing on work_restart_pointer because it was not 128-byte aligned.