omlins / ParallelStencil.jl

Package for writing high-level code for parallel high-performance stencil computations that can be deployed on both GPUs and CPUs
BSD 3-Clause "New" or "Revised" License

Error copying CUDA array to Array when using GPU #51

Closed. Eure-L closed this issue 2 years ago.

Eure-L commented 2 years ago

Hi! First, I'll thank you all for this amazing module, which I just discovered and enjoy very much. I am very new to Julia and started to play around with some of the easy GPU computing that ParallelStencil.jl offers, but I came across an error when running one of the provided examples (examples/diffusion3D_multigpucpu_hidecomm.jl):

    T_nohalo .= T[2:end-1,2:end-1,2:end-1];                                           # Copy data to CPU removing the halo.

where T is a CUDA array (selected by ParallelStencil) and T_nohalo a "standard" Array,

causing the following error:

    ERROR: LoadError: This object is not a GPU array
    Stacktrace:
      [1] error(s::String)
        @ Base ./error.jl:33
      [2] backend(#unused#::Type)
        @ GPUArrays ~/.julia/packages/GPUArrays/VNhDf/src/device/execution.jl:15
      [3] backend(x::Array{Float64, 3})
        @ GPUArrays ~/.julia/packages/GPUArrays/VNhDf/src/device/execution.jl:16
      [4] _copyto!
        @ ~/.julia/packages/GPUArrays/VNhDf/src/host/broadcast.jl:73 [inlined]
      [5] materialize!
        @ ~/.julia/packages/GPUArrays/VNhDf/src/host/broadcast.jl:51 [inlined]
      [6] materialize!(dest::Array{Float64, 3}, bc::Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{3}, Nothing, typeof(identity), Tuple{CuArray{Float64, 3, CUDA.Mem.DeviceBuffer}}})
        @ Base.Broadcast ./broadcast.jl:868
      [7] diffusion3D()
        @ Main /.../diffusion3D_multigpucpu_hidecomm.jl:60
      [8] top-level scope
        @ /.../diffusion3D_multigpucpu_hidecomm.jl:84
    in expression starting at /.../diffusion3D_multigpucpu_hidecomm.jl:84

From my basic understanding, it appears that ParallelStencil does not allow broadcasting between CUDA arrays and standard ones. Is that no longer supported? Sorry in advance if this is a dumb issue; I have yet to find a workaround.
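
For reference, here is a minimal sketch that reproduces the same error with plain CUDA.jl, outside of ParallelStencil (array sizes are just for illustration):

    using CUDA

    T        = CUDA.zeros(Float64, 8, 8, 8)   # device array
    T_nohalo = zeros(Float64, 6, 6, 6)        # host array

    # Indexing with ranges returns another CuArray, so this broadcast tries to
    # write device memory into a host array and throws the error above.
    T_nohalo .= T[2:end-1, 2:end-1, 2:end-1]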

luraess commented 2 years ago

Thanks for pointing this out @Eure-L. Indeed, recent updates in CUDA.jl now require explicitly converting the GPU array before broadcasting it into a CPU array. This will work:

T_nohalo .= Array(T[2:end-1,2:end-1,2:end-1]);
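
For context, T[2:end-1,2:end-1,2:end-1] first allocates a new CuArray on the device; wrapping it in Array(...) copies that result to host memory, so the broadcast then runs entirely on the CPU. A minimal sketch of the fix and an equivalent alternative (sizes illustrative, assuming CUDA.jl is loaded):

    using CUDA

    T        = CUDA.rand(Float64, 8, 8, 8)    # device array
    T_nohalo = zeros(Float64, 6, 6, 6)        # host array

    # The fix above: materialize the interior on the device, then copy to host.
    T_nohalo .= Array(T[2:end-1, 2:end-1, 2:end-1])

    # An alternative that skips the broadcast: copy device-to-host directly.
    copyto!(T_nohalo, T[2:end-1, 2:end-1, 2:end-1])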

Thanks for reporting, and we will update the examples for the next release.