With the Coyote implementation, copy() does not appear to work unless both the src and dst buffers are synced via sync_to_device() before the call. In particular, a float32 dst buffer can appear to contain only zeros after the copy, as if no operation had been performed on it, unless sync_to_device() is called both before and after the copy() call.
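The call ordering that seems to be needed can be sketched as a small wrapper. This is only an illustration of the ordering, not the actual Coyote API: `DeviceBuffer`, `copy_with_sync`, `RecordingBuffer`, and the `sync_after` flag are hypothetical names, and syncing both buffers again after the copy is an assumption about what "before and after" means here. Only the names copy() and sync_to_device() come from the observation above.

```python
from typing import Callable, List, Protocol


class DeviceBuffer(Protocol):
    """Hypothetical buffer interface; only sync_to_device() is named in the report."""

    def sync_to_device(self) -> None: ...


def copy_with_sync(
    src: DeviceBuffer,
    dst: DeviceBuffer,
    copy: Callable[[DeviceBuffer, DeviceBuffer], None],
    sync_after: bool = True,
) -> None:
    """Run copy(src, dst) with the sync calls placed around it."""
    # Sync both buffers before the copy, which the report says is required.
    src.sync_to_device()
    dst.sync_to_device()

    copy(src, dst)

    # Assumption: "after" means syncing both buffers once more.
    if sync_after:
        src.sync_to_device()
        dst.sync_to_device()


class RecordingBuffer:
    """Stand-in buffer that only records which calls were made (no real device)."""

    def __init__(self, name: str, log: List[str]) -> None:
        self.name = name
        self.log = log

    def sync_to_device(self) -> None:
        self.log.append(f"sync_to_device({self.name})")


def demo() -> List[str]:
    """Return the sequence of calls issued by the wrapper."""
    log: List[str] = []
    src = RecordingBuffer("src", log)
    dst = RecordingBuffer("dst", log)
    # The lambda stands in for the real copy() call.
    copy_with_sync(src, dst, lambda s, d: log.append("copy(src, dst)"))
    return log
```

Calling `demo()` returns the recorded call order, which makes it easy to check that the syncs really bracket the copy. With a real buffer type the same wrapper could be used as a temporary workaround until the underlying sync behaviour is understood.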