comp-imaging / ProxImaL

A domain-specific language for image optimization.
MIT License

Major revamp to Halide 16.0 with Anderson2021 GPU autoscheduler #67

Open antonysigma opened 1 year ago

antonysigma commented 1 year ago

(Listing the task dependencies here as a reminder to myself.)

  1. [x] Wait for the Halide 16.0 release.
  2. [x] Refactor the Halide::BoundaryConditions calls to use the new APIs;
  3. [x] Similarly, refactor Generator::* related code to use Halide 16.0 APIs;
  4. [x] In algorithms/ladmm.py, ensure all NumPy arrays are Fortran-ordered by default; this avoids the repeated C-order to F-order conversion overhead in the (L-)ADMM iterations;
  5. [x] Similarly, ensure that Halide-accelerated linear operators, e.g. A_mask.cpython.so, write to the output buffers in F-order rather than to orphan buffers that are immediately destroyed. This should fix the convergence failures that occur whenever implem='Halide' is set.
  6. [x] Wait until the Anderson2021 autoscheduler is ready for production (https://github.com/halide/Halide/issues/7606).
  7. (Optional) Compile the Halide generators with C++20; this should cut the compile time in half thanks to the new C++ Concepts feature;
  8. (Optional) Reduce the code bloat in ladmm-iter-gen.cpp with the broadcast operator Halide::_.
  9. [ ] Replace the Li2018 autoscheduler with Anderson2021; the latter utilizes the GPU cache and the shared memory in the SM far better.
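
To illustrate items 4 and 5, here is a minimal NumPy sketch of the F-order and orphan-buffer issue. It is not ProxImaL code: `apply_mask` is a hypothetical stand-in for a Halide-compiled operator such as A_mask, and the array shapes and dtypes are assumptions chosen for the example.

```python
import numpy as np


# Hypothetical stand-in for a compiled operator: it expects column-major
# (Fortran-order) buffers and writes its result into `out` in place.
def apply_mask(mask, x, out):
    for name, buf in (("mask", mask), ("x", x), ("out", out)):
        if not buf.flags.f_contiguous:
            raise ValueError(f"{name} must be Fortran-ordered")
    np.multiply(mask, x, out=out)


shape = (4, 3)

# Allocate everything in F-order once, before the iterations start.
mask = np.asfortranarray(np.arange(12, dtype=np.float32).reshape(shape) % 2)
x = np.ones(shape, dtype=np.float32, order="F")
out = np.zeros(shape, dtype=np.float32, order="F")

apply_mask(mask, x, out)  # the result lands in the caller's buffer

# Pitfall: converting a C-order output at call time creates a temporary copy.
out_c = np.zeros(shape, dtype=np.float32)  # C-order by default
orphan = np.asfortranarray(out_c)          # new buffer, not a view of out_c
apply_mask(mask, x, orphan)                # only the temporary is updated

print(np.shares_memory(orphan, out_c))     # the two buffers are unrelated
print(out_c.any())                         # out_c never received the result
```

In the second call the operator succeeds, but the caller's `out_c` stays all zeros because the write went to a temporary copy. An iterative solver that keeps reading `out_c` would then see stale data, which is one plausible way such a bug could show up as a convergence failure.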

References: https://github.com/halide/Halide/pull/6856 https://github.com/halide/Halide/issues/7459