This PR adds SYCL support to the original CUTLASS 2 example 35. It focuses on functionality only. Performance optimisations will follow in a separate PR, so that execution with SYCL can match the performance achieved with CUDA.