Open bmhowe23 opened 3 weeks ago
I believe everything was operating correctly up until https://github.com/NVIDIA/cuda-quantum/pull/1470, which was introduced in 0.7.1. That PR changed how the MLIR was structured for some loops, and while nothing was incorrect about that PR per se, the change apparently made us sensitive to an underlying LLVM bug.
Required prerequisites
Describe the bug
When using nested loops where the inner loop doesn't start at 0, the circuit compilation can produce incorrect circuits.
Note that the ultimate root cause of this issue is https://github.com/llvm/llvm-project/issues/94520, but there are more details below.
Also note that the current manifestation of the bug was accidentally fixed in #1692, so the latest from main will not show this bad behavior; see the "Suggestions" section for more details.
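To make the loop shape concrete, here is a minimal pure-Python sketch of a nested loop whose inner loop does not start at 0. It is not the original test case and does not use CUDA-Q; the function name, the `i + 1` lower bound, and the `cx` gate labels are all hypothetical, chosen only to illustrate which (outer, inner) index pairs a correctly compiled circuit would be expected to visit.

```python
def expected_gate_pairs(num_qubits):
    """Return the (outer, inner) index pairs visited by a nested loop.

    The inner loop starts at i + 1 rather than 0, which is the kind of
    loop shape this issue is about. The bounds are hypothetical.
    """
    pairs = []
    for i in range(num_qubits - 1):
        # Inner loop with a non-zero, outer-dependent lower bound.
        for j in range(i + 1, num_qubits):
            pairs.append((i, j))
    return pairs


if __name__ == "__main__":
    # Print one hypothetical two-qubit gate per visited pair.
    for control, target in expected_gate_pairs(4):
        print(f"cx q[{control}], q[{target}]")
```

A reference list like this could be compared against the gates in a compiled circuit to spot missing or duplicated operations.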
Steps to reproduce the bug
The issue can best be seen with the following test case (thanks to @poojarao8 for the clean example):
Using nvcr.io/nvidia/quantum/cuda-quantum:0.7.1
...
Expected behavior
The expected behavior is shown above.
Is this a regression? If it is, put the last known working version (or commit) here.
Not a regression
Environment
Suggestions
This was actually fixed with #1692 (which will be part of 0.8.0), even though #1692 was not intended to fix this bug. That PR simply reordered some of the QIR passes, and subsequent investigation determined that the reordering was masking an underlying LLVM/MLIR issue. The underlying root cause was reported here: https://github.com/llvm/llvm-project/issues/94520.
Since the issue still exists in LLVM upstream as of 19.0, I think we need to perform some or all of the mitigation steps in CUDA-Q. The exact list that we choose to implement is up for debate.
Disable region simplification in calls to applyPatternsAndFoldGreedily by doing applyPatternsAndFoldGreedily(..., {.enableRegionSimplification = false}).