Enhance support for GPU-resident data

Currently the user can pass references to GPU-resident data to a Function/TimeFunction:

https://github.com/devitocodes/devito/blob/master/devito/data/allocators.py#L340

However, when interfacing a Devito Operator with something like PyTorch, whose GPU-resident arrays need no copy from/to the host, it would be far more convenient to also support the likes of CuPy arrays. This would allow the user to, e.g., treat/initialize/modify u.data in exactly the same way, irrespective of whether the data is host- or GPU-resident. GPU-resident arrays are also useful because they spare host memory, and on single-node multi-GPU systems the savings could be significant.
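To make the "same way, irrespective of residency" idea concrete, here is a minimal sketch of array-library-agnostic initialization. It does not use any Devito API: the plain array stands in for `u.data`, and `array_module` and `init_gaussian` are hypothetical helpers invented for illustration. The only assumption about CuPy is its `cupy.get_array_module` function; when CuPy is not installed, the code falls back to NumPy.

```python
import numpy as np

try:
    import cupy as cp  # optional; only needed for GPU-resident arrays
except ImportError:
    cp = None


def array_module(arr):
    """Return the array library (numpy or cupy) that owns `arr`."""
    if cp is not None:
        return cp.get_array_module(arr)
    return np


def init_gaussian(data, sigma=2.0):
    """Fill a 2D array in place with a centered Gaussian, wherever it lives."""
    xp = array_module(data)
    nx, ny = data.shape
    # Coordinates are built with the same library as `data`, so no
    # host<->device copy is triggered for GPU-resident arrays.
    x = xp.arange(nx, dtype=data.dtype) - (nx - 1) / 2
    y = xp.arange(ny, dtype=data.dtype) - (ny - 1) / 2
    data[...] = xp.exp(-(x[:, None] ** 2 + y[None, :] ** 2) / (2 * sigma ** 2))
    return data


# Host-resident stand-in for `u.data` (hypothetical; not a Devito object).
host = init_gaussian(np.zeros((5, 5), dtype=np.float32))
```

With CuPy available, the same call would accept `cp.zeros((5, 5), dtype=cp.float32)` unchanged, which is the kind of uniform user code this issue is asking Devito to enable for `u.data`.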

Enhance support for GPU-resident data #2460