As the MapStageOp is actually run on the host, we need to ensure that both the 'key' and 'indices' tensors are in host memory. This now mirrors what CUDA is doing.
Also fixes device name comparison map_stage_op_test, by changing the expected device name string, as the device name could be '/device:SYCL:0' or '/device:GPU:0'. The CUDA device name reported by test_util is in the form '/gpu:0', while the device name used in the nodes looks like '/device:GPU:0'. When running on CUDA we need to change the expected device name accordingly,
however this isn't a problem on SYCL.
As the MapStageOp is actually run on the host, we need to ensure that both the 'key' and 'indices' tensors are in host memory. This now mirrors what CUDA is doing.
Also fixes device name comparison map_stage_op_test, by changing the expected device name string, as the device name could be '/device:SYCL:0' or '/device:GPU:0'. The CUDA device name reported by test_util is in the form '/gpu:0', while the device name used in the nodes looks like '/device:GPU:0'. When running on CUDA we need to change the expected device name accordingly, however this isn't a problem on SYCL.