LLVM Opaque Pointers feature not compatible with named pointers for Qubit and Result

swernli commented 1 year ago

The LLVM project has had an ongoing workstream to define behavior for treating all pointers as opaque by default (see Opaque Pointers in the LLVM docs), and the feature is on by default in LLVM 15 with typed pointers being removed in the upcoming LLVM 16. This means the current mechanism used by QIR to identify qubit and result types as distinct from other values via named pointer types will no longer work. For example, the measurement call in the base profile that today looks like this:

call void @__quantum__qis__mz__body(%Qubit* null, %Result* writeonly null)

would become this:

call void @__quantum__qis__mz__body(ptr null, ptr writeonly null)

As noted in the opaque pointers documentation,

Frontends need to be adjusted to track pointee types independently of LLVM, insofar as they are necessary for lowering.

This means that type information for pointers is explicitly and intentionally no longer included in the LLVM IR and meant to inserted when the pointer is read from or written to via load and store instructions. This does not map well to the current design for %Qubit* and %Result* which expects that these values are never loaded or stored and are merely identified as distinct by their named types.

As a possible design solution going forward, the %Qubit and %Result types could be more formally defined as distinct, sized structs, which would allow them to retain their type information in the IR. Correspondingly, the specifications and profiles would need to updated with this definition and the signatures of all relevant APIs updated to reflect that. A similar effort would need to be undertake for other types currently defined as typed pointers such as the %String*, %BigInt*, %Tuple*, and %Array*.

nikic commented 1 year ago

The "target extension type" concept being introduced in https://reviews.llvm.org/D135202 may be of interest to you. It addresses similar needs that have been encountered with SPIR-V.

peter-campora commented 1 year ago

First thoughts on this.

I wouldn't like seeing: call void @__quantum__qis__mz__body(ptr null, ptr writeonly null)

To someone consuming the code, they can't quickly verify the difference between these two pointers were it not for the writeonly convention. This makes it harder to detect if some codegen bug put incorrect id's into the wrong argument to these function call (for example swapping the two ptr args in the call above would only be detectable as a bug because of the use of writeonly).

To the point about replacing %Qubit* with a sized struct, could we imagine something like: %Qubit = type { i64, ptr } where the underlying ptr could still be used to point to some other type defined by a back-end like: %BackendQubit that a back-end provider knows how to work with when dealing with loads and stores? And similarly for %Result*? It would definitely cause some changes to code-gen since it's not as easy to pass these structs as arguments and identify the unique id as it was using inttoptr instructions. But I think with the right tooling and conventions it may not be too much of an ergonomic overhead. Downside, there may be some performance overhead for people directly linking definitions for these functions when compared with passing a pointer at call-sites?

swernli commented 1 year ago

@nikic Thank for pointing out the target extension type! If I'm understanding it correctly, a target extension type can optionally include type information, but if it doesn't, then it can't be introspected. Is that correct? If so, that does seem to match the intended behavior of Qubit and Result. These would then become target("Qubit") and target("Result"). After transformation to static Qubit and result allocation, we'd need a way to identify particular instance. Today that uses inttoptr for named pointer types. Is there a similar strategy we could employ for target extension types? Would it have to use a function call?

nikic commented 1 year ago

After transformation to static Qubit and result allocation, we'd need a way to identify particular instance. Today that uses inttoptr for named pointer types. Is there a similar strategy we could employ for target extension types? Would it have to use a function call?

So you're currently doing something like inttoptr i64 42 to %Qubit*? In that case yes, I think converting that to something like call target("Qubit") @llvm.qubit.from.int(i64 42) or so would be the right way to represent that.

schweitzpgi commented 1 year ago

It's been 6 months since the last post on this topic. Any thoughts or recommendations?

qir-alliance / qir-spec

LLVM Opaque Pointers feature not compatible with named pointers for Qubit and Result #30