AMReX-Codes / amrex

AMReX: Software Framework for Block Structured AMR
https://amrex-codes.github.io/amrex

GPU support for BoxArray with custom class #625

Open bsrunnels opened 4 years ago

bsrunnels commented 4 years ago

I'm encountering an issue on GPU when creating FabArrays of BaseFabs whose template argument is neither int nor amrex::Real. For instance, the following lines:

    amrex::FabArray<amrex::BaseFab<model_type> > mytmpmodel;
    mytmpmodel.define(grid,dm,1,2);

where model_type is a custom non-POD class. The define call produces the following runtime error:

    amrex::Abort::0::CUDA error in file amrex/3d-debug-cuda-g++//include/AMReX_GpuDevice.H line 150 unspecified launch failure !!!

It works fine if I replace model_type with amrex::Real. Is it possible to use MultiFabs templated on a custom class like this on GPU, or is this a limitation?

WeiqunZhang commented 4 years ago

Could you give more detail on model_type and where it crashes?

bsrunnels commented 4 years ago

Sure. Here's the definition of a representative model_type class: https://github.com/solidsuccs/alamo/blob/60f7ca790fb5cdb9295deaf71ed96a7090c8623d/src/Model/Solid/LinearElastic/Laplacian.H#L25

(Background: we store constitutive models in the model_type object. The link points to a Laplacian version, but we also have ones for isotropic materials, cubic materials, plastic materials, etc. These models can be spatially varying, so it makes the most sense to store them in a MultiFab structure.)

It looks like the crash occurs in amrex::Gpu::synchronize(), at AMReX_GpuDevice.H:149. I believe the error in stdout is returned from the call to cudaDeviceSynchronize.