cuda-edu is a tool for students of the Coursera Heterogeneous Parallel Programming course. It lets you develop homework assignments on a local machine that has no CUDA GPU. It should be possible to use exactly the same source code with both cuda-edu and WebGPU. cuda-edu is not officially sanctioned by the staff of Heterogeneous Parallel Programming; it is just a tool created by a CTA (Community Teaching Assistant).
In essence, cuda-edu emulates nvcc, libwb, and the CUDA runtime. It translates your CUDA code into standard C++ code that can be executed on your CPU.
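To give a feel for what running a kernel as standard C++ can look like, here is a minimal sketch. It is not cuda-edu's actual generated code: the names `Dim3`, `vecAddKernel`, `launchVecAdd`, and `vecAddOnCpu` are hypothetical, and the sketch assumes a 1D launch of a kernel that never calls `__syncthreads()`, so each emulated thread can simply run to completion in turn.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for CUDA's dim3/uint3 built-in index types.
struct Dim3 {
    unsigned x, y, z;
};

// The kernel body as an ordinary C++ function. The built-in index
// variables become explicit parameters (an assumption of this sketch).
void vecAddKernel(const float* a, const float* b, float* out, int n,
                  Dim3 blockIdx, Dim3 blockDim, Dim3 threadIdx) {
    int i = static_cast<int>(blockIdx.x * blockDim.x + threadIdx.x);
    if (i < n) {
        out[i] = a[i] + b[i];
    }
}

// Emulates a 1D launch <<<gridDimX, blockDimX>>> by running every
// thread of every block sequentially on the CPU.
void launchVecAdd(unsigned gridDimX, unsigned blockDimX,
                  const float* a, const float* b, float* out, int n) {
    for (unsigned bx = 0; bx < gridDimX; ++bx) {
        for (unsigned tx = 0; tx < blockDimX; ++tx) {
            vecAddKernel(a, b, out, n,
                         Dim3{bx, 0, 0}, Dim3{blockDimX, 1, 1}, Dim3{tx, 0, 0});
        }
    }
}

// Convenience wrapper: picks a grid size that covers all elements,
// using the usual ceiling division. Assumes a and b have equal length.
std::vector<float> vecAddOnCpu(const std::vector<float>& a,
                               const std::vector<float>& b) {
    const int n = static_cast<int>(a.size());
    const unsigned blockDimX = 4;
    const unsigned gridDimX = (static_cast<unsigned>(n) + blockDimX - 1) / blockDimX;
    std::vector<float> out(a.size(), 0.0f);
    launchVecAdd(gridDimX, blockDimX, a.data(), b.data(), out.data(), n);
    return out;
}
```

Calling `vecAddOnCpu` on two equal-length vectors returns their element-wise sum. Kernels that synchronize threads within a block need a more involved scheme than this sequential loop.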
Because the code runs on your CPU, you can develop locally and step through it in your debugger as it executes. cuda-edu also injects code that detects buffer overflows: your program will trap immediately if you try to dereference a bad offset in your host, device-global, or device-shared buffers.
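The idea behind the overflow detection can be illustrated with a bounds-checked buffer. This is only a sketch of the general technique, not the code cuda-edu injects: the class name `CheckedBuffer` is hypothetical, and it throws `std::out_of_range` where a real tool might instead abort or raise a debugger trap.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical bounds-checked buffer: every element access is validated
// against the allocation size before the element is touched.
template <typename T>
class CheckedBuffer {
public:
    explicit CheckedBuffer(std::size_t count) : data_(count) {}

    // Checked element access; a bad offset fails at the point of use
    // instead of silently corrupting neighbouring memory.
    T& operator[](std::size_t i) {
        if (i >= data_.size()) {
            throw std::out_of_range("CheckedBuffer: offset out of bounds");
        }
        return data_[i];
    }

    std::size_t size() const { return data_.size(); }

private:
    std::vector<T> data_;
};
```

With a `CheckedBuffer<int> buf(4)`, writing `buf[3]` succeeds, while `buf[4]` fails immediately at the faulting access, which is the spot you want a debugger to stop at.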
The primary requirements are a C++11 compiler and libclang. Currently, Linux, Mac, and Windows are supported.
Installation instructions are hosted on the Wiki. Please see the page for your OS: