Closed maddyscientist closed 13 years ago
OK, I've added support. It is enabled through macros defined at the top of dslash_quda.cu:
DIRECT_ACCESS_WILSON_SPINOR - the neighbouring spinor loads
DIRECT_ACCESS_WILSON_ACCUM - the x load in Xpay kernels
DIRECT_ACCESS_LINK - the gauge field
Defining these macros enables direct reading (through L1 on Fermi); otherwise the texture cache is used.
Unfortunately, this only seems to decrease performance on my 480, so I've left the default as textured reads throughout.
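For anyone reading this later, here is a rough sketch of how a compile-time switch between direct and textured reads can look. It is not the code in dslash_quda.cu: the macro names DIRECT_ACCESS_SPINOR and READ_SPINOR, the kernel scaleKernel and the helper runCheck are all made up for illustration. It also assumes texture objects, which need compute capability 3.0 or newer, because the texture reference API that Fermi-era code relied on was removed in CUDA 12.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical switch for this sketch: uncomment to use ordinary global loads
// (cached in L1 on Fermi) instead of fetching through the texture cache.
// #define DIRECT_ACCESS_SPINOR

#ifdef DIRECT_ACCESS_SPINOR
#define READ_SPINOR(ptr, tex, idx) ((ptr)[idx])
#else
#define READ_SPINOR(ptr, tex, idx) tex1Dfetch<float4>((tex), (idx))
#endif

// Toy kernel: out = a * in, with the input load routed through READ_SPINOR.
__global__ void scaleKernel(float4 *out, const float4 *in,
                            cudaTextureObject_t inTex, float a, int n)
{
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i >= n) return;
  float4 v = READ_SPINOR(in, inTex, i);
  out[i] = make_float4(a * v.x, a * v.y, a * v.z, a * v.w);
}

// Runs the kernel once and returns true if every element was scaled correctly.
bool runCheck()
{
  const int n = 1024;
  const size_t bytes = n * sizeof(float4);
  std::vector<float4> hIn(n), hOut(n);
  for (int i = 0; i < n; i++) hIn[i] = make_float4(i, i + 1, i + 2, i + 3);

  float4 *dIn = nullptr, *dOut = nullptr;
  if (cudaMalloc(&dIn, bytes) != cudaSuccess) return false;
  if (cudaMalloc(&dOut, bytes) != cudaSuccess) { cudaFree(dIn); return false; }
  cudaMemcpy(dIn, hIn.data(), bytes, cudaMemcpyHostToDevice);

  // Bind the same linear buffer to a texture object so both paths read identical data.
  cudaResourceDesc resDesc = {};
  resDesc.resType = cudaResourceTypeLinear;
  resDesc.res.linear.devPtr = dIn;
  resDesc.res.linear.desc = cudaCreateChannelDesc<float4>();
  resDesc.res.linear.sizeInBytes = bytes;
  cudaTextureDesc texDesc = {};
  texDesc.readMode = cudaReadModeElementType;
  cudaTextureObject_t tex = 0;
  bool ok = cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr) == cudaSuccess;

  if (ok) {
    scaleKernel<<<(n + 255) / 256, 256>>>(dOut, dIn, tex, 2.0f, n);
    ok = cudaDeviceSynchronize() == cudaSuccess;
  }
  if (ok) {
    cudaMemcpy(hOut.data(), dOut, bytes, cudaMemcpyDeviceToHost);
    for (int i = 0; i < n && ok; i++)
      ok = hOut[i].x == 2.0f * hIn[i].x && hOut[i].y == 2.0f * hIn[i].y &&
           hOut[i].z == 2.0f * hIn[i].z && hOut[i].w == 2.0f * hIn[i].w;
  }

  if (tex) cudaDestroyTextureObject(tex);
  cudaFree(dIn);
  cudaFree(dOut);
  return ok;
}

int main()
{
  printf("%s\n", runCheck() ? "ok" : "failed");
  return 0;
}
```

Building it once with and once without -DDIRECT_ACCESS_SPINOR and timing the kernel is one way to compare the two read paths on a given card; the result should be the same either way, only the load path differs.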
Similar to what Guochun added for the staggered kernels: the Wilson kernels should have the option to perform non-texture reads (i.e., through the L1 on Fermi) on the gauge fields and/or the spinor fields.