AMReX-Codes / pyamrex

GPU-Enabled, Zero-Copy AMReX Python Bindings including AI/ML
http://pyamrex.readthedocs.io

`.to_numpy(copy=False)` Runtime Error if Device Memory #201

Open ax3l opened 9 months ago

ax3l commented 9 months ago

We allow users to call .to_numpy(copy=False) on arbitrary memory.

This is fine even with pure GPU memory, to either:

If the pointer is in GPU memory and not managed, we should instead raise a runtime exception with a hint to use .to_numpy(copy=True) or .to_cupy(copy=False), or to activate managed memory.

We can use the isManaged, isDevicePtr and isPinnedPtr helpers from AMReX_GpuUtility.H. They wrap cudaPointerGetAttributes and, once supported, similar functions for HIP and SYCL. https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__UNIFIED.html#group__CUDART__UNIFIED_1gd89830e17d399c064a2f3c3fa8bb4390
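To make the proposed behavior concrete, here is a minimal Python sketch of the guard. The `MemKind` enum and `check_zero_copy_to_numpy` function are hypothetical names used for illustration only; they are not part of pyAMReX. The sketch assumes the pointer has already been classified (for example via the helpers named above) and only shows the decision and the error hint.

```python
from enum import Enum, auto


class MemKind(Enum):
    """Hypothetical classification of a pointer (not a pyAMReX type)."""
    HOST = auto()     # plain host memory
    PINNED = auto()   # page-locked host memory
    MANAGED = auto()  # managed (unified) memory
    DEVICE = auto()   # pure device memory, not readable from the host


def check_zero_copy_to_numpy(kind: MemKind) -> None:
    """Raise if a zero-copy NumPy view of this memory would not be host-readable."""
    if kind is MemKind.DEVICE:
        raise RuntimeError(
            "to_numpy(copy=False) is not possible for unmanaged device memory. "
            "Use .to_numpy(copy=True), .to_cupy(copy=False), "
            "or activate managed memory."
        )
```

Host, pinned and managed memory pass through silently; only pure device memory triggers the exception.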

ax3l commented 9 months ago

These functions can be quite expensive, so we should either use them sparingly or, alternatively, check the arenas if we know them.

These situations in AMReX can create this:

Then Arena has

    // isDeviceAccessible and isHostAccessible can both be true.
    [[nodiscard]] virtual bool isDeviceAccessible () const;
    [[nodiscard]] virtual bool isHostAccessible () const;

    // Note that isManaged, isDevice and isPinned are mutually exclusive.
    // For memory allocated by cudaMalloc* etc., one of them returns true.
    // Otherwise, neither is true.
    [[nodiscard]] virtual bool isManaged () const;
    [[nodiscard]] virtual bool isDevice () const;
    [[nodiscard]] virtual bool isPinned () const;

where isHostAccessible() is what we need.
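The "arena first, pointer query only as a fallback" idea could look roughly like the following Python sketch. `ArenaInfo` is a hypothetical stand-in for the two accessibility flags quoted above, and `query_pointer` is a placeholder for the more expensive per-pointer check; neither is an existing pyAMReX API.

```python
from typing import Callable, Optional


class ArenaInfo:
    """Hypothetical Python stand-in for the accessibility flags of an Arena."""

    def __init__(self, host_accessible: bool, device_accessible: bool):
        self._host = host_accessible
        self._device = device_accessible

    def is_host_accessible(self) -> bool:
        return self._host

    def is_device_accessible(self) -> bool:
        return self._device


def host_accessible(
    arena: Optional[ArenaInfo],
    ptr: int,
    query_pointer: Callable[[int], bool],
) -> bool:
    """Prefer the cheap arena flag; query the pointer only if no arena is known."""
    if arena is not None:
        return arena.is_host_accessible()
    # Assumed to be the expensive path (a runtime API call per pointer).
    return query_pointer(ptr)
```

When the arena is known, the pointer query is never invoked, which is the point of checking the arena instead.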

ax3l commented 9 months ago

For MultiFab.array(mfi).to_numpy() we could go on the MultiFab level and add a:

MultiFab.to_numpy(mfi, ...)

function; that way we still have access to the Arena (or we implement the more costly helper calls on the pointer from AMReX_GpuUtility.H above).
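A rough sketch of how such a container-level method could behave is given below. Everything in it is illustrative: `SketchMultiFab` is a hypothetical class, `mfi` is simplified to an integer box index rather than an MFIter, the arena is reduced to a single boolean, and a standard-library `memoryview` stands in for the zero-copy NumPy view.

```python
from array import array


class SketchMultiFab:
    """Hypothetical stand-in for a MultiFab: per-box buffers plus one arena flag."""

    def __init__(self, boxes, host_accessible: bool):
        self._boxes = [array("d", b) for b in boxes]
        # Stands in for asking the container's Arena isHostAccessible().
        self._host_accessible = host_accessible

    def to_numpy(self, mfi: int, copy: bool = False):
        """Return the data of box `mfi` as a copy or as a zero-copy view."""
        data = self._boxes[mfi]
        if copy:
            # Independent copy (a device-to-host copy in a real implementation).
            return array("d", data)
        if not self._host_accessible:
            raise RuntimeError(
                "copy=False needs host-accessible memory; use copy=True instead."
            )
        # Zero-copy view onto the same buffer.
        return memoryview(data)
```

Because the check lives on the container, the arena information is available without inspecting the raw pointer.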