Open amccaskey opened 3 weeks ago
I am able to reproduce a segmentation fault with this first example.
root@ea401e2-lcedt:/workspaces/cuda-quantum/examples/cpp# CUDAQ_LOG_LEVEL=info ./a.out
[2024-08-20 00:13:39.899] [info] [PluginUtils.h:24] Requesting N5cudaq16quantum_platformE plugin via symbol name getQuantumPlatform.
[2024-08-20 00:13:39.899] [info] [PluginUtils.h:36] Successfully loaded the plugin.
[2024-08-20 00:13:39.899] [info] [PluginUtils.h:24] Requesting N5nvqir16CircuitSimulatorE plugin via symbol name getCircuitSimulator.
[2024-08-20 00:13:39.899] [info] [PluginUtils.h:36] Successfully loaded the plugin.
[2024-08-20 00:13:39.942] [info] [NVQIR.cpp:82] Creating the custatevec-fp32 backend.
[2024-08-20 00:13:39.942] [info] [CircuitSimulator.h:901] Allocating 2 new qubits.
[2024-08-20 00:13:39.942] [info] [CuStateVecCircuitSimulator.cpp:170] GPU 0 Allocating new qubit array of size 2.
Segmentation fault (core dumped)
This could be a bridge issue in handling the const std::function<void(cudaq::qvector<> &)> &init argument.
Looking at the generated code:
define void @_Z10userKernelRKSt8functionIFvRN5cudaq7qvectorILm2EEEEE({ i8*, i8* } %0) local_unnamed_addr
as compared to the LLVM one:
define linkonce_odr dso_preemptable void @_Z10userKernelRKSt8functionIFvRN5cudaq7qvectorILm2EEEEE(ptr noundef nonnull align 8 dereferenceable(32) %init) #5 personality ptr @__gxx_personality_v0 !dbg !3373
For some reason the argument is lowered as a pair of pointers ({ i8*, i8* }) instead of a single pointer to the std::function object. This mismatch in the argument layout will crash the argsCreator later.
Compiling the app in library mode (lib.o was still compiled in MLIR mode) works fine; hence the bridge's handling of this argument is likely the problem.
@schweitzpgi Do we support std::function arguments yet?
@schweitzpgi I see this in ConvertCCToLLVM.cpp
void cudaq::opt::populateCCTypeConversions(LLVMTypeConverter *converter) {
converter->addConversion([](cc::CallableType type) {
return lambdaAsPairOfPointers(type.getContext());
});
...
}
Looks like this is set up for just lambdas?
This is also interesting
define { i8*, i64 } @function_kernel_to_sample._Z16kernel_to_sampleRKSt8functionIFvRN5cudaq7qvectorILm2EEEEE.thunk(i8* nocapture readnone %0, i1 %1) {
%3 = tail call %Array* @__quantum__rt__qubit_allocate_array(i64 2)
unreachable
}
define i64 @function_kernel_to_sample._Z16kernel_to_sampleRKSt8functionIFvRN5cudaq7qvectorILm2EEEEE.argsCreator(i8** nocapture readnone %0, i8** nocapture writeonly %1) #2 {
...
Just a guess, but could this be why we see a seg fault in the argsCreator function? The thunk is called by altLaunchKernel, we hit this unreachable line, and execution falls through to the next spot in memory, which is the argsCreator?
Here's a test repo for all this
https://github.com/amccaskey/test_cudaq_cpp_py_integration
mkdir build && cd build
cmake .. -G Ninja -DCUDAQ_DIR=/path/to/cudaq/lib/cmake/cudaq -DCMAKE_BUILD_TYPE=Debug
ninja
PYTHONPATH=/path/to/cudaq:$PWD gdb --args python3-dbg test.py
Thanks for the heads-up. I'll add this to my list to look at.
This may be interesting.
% nvq++ --enable-mlir -fkernel-exec-kind=2 -fPIC -g -c lib.cpp -o lib.o
% nvq++ --enable-mlir -fkernel-exec-kind=2 -g -fPIC lib.o user.cpp
% ./a.out
terminate called after throwing an instance of 'std::runtime_error'
what(): Wrong kernel launch point: Attempt to launch kernel in streamlined for JIT mode on local simulated QPU. This is not supported.
Aborted
%
Take the following files
and
Compile and link with the following
This results in a segmentation fault.
Can anyone else reproduce this? I would be very thankful for anyone's help on this one. This kind of pattern will be a primary feature of future downstream libraries.
Another variation would be this