Open crazy4pi314 opened 4 years ago
See also the discussion on https://github.com/microsoft/qsharp-compiler/issues/406, where the issue was originally filed.
@crazy4pi314 Thanks for capturing all this - this is great! I've got a couple of questions/comments:
I am looking forward to the proposal on type parameterized udts ;).
You mention LittleEndian or BigEndian as potential use cases of the same DU if I understand correctly. I am a bit skeptical about that particular example, since that would basically force that each callable that takes such an argument is implemented for both. I don't think that is necessary and see more value in having a library that uniformly supports just one of the two (e.g. LittleEndian) and indicates when a callable depends on this interpretation by consuming explicitly a LittleEndian.
"This is similar to how I have had to implement what here would be a Maybe DU" Is the code open source by any chance and do you have a link?
I am not sure it should be allowed that a case has the same name as the/a type, and in particular not if "[...] the case constructor functions will be declared as globals in the same namespace that the DU is declared in."
Even if current UDTs are considered to be single-case DUs, as is currently suggested, the case could have a fixed and reserved default name. I am curious what the motivation is for wanting to treat all custom types as DUs?
What might be worth exploring is how DUs where all cases take an argument of type Unit relate to enum-like constructs, such as the built-in Pauli and Result types.
The large number of ways to express a DU declaration worries me a bit. I was wondering if the first | shouldn't just be required. It is also worth discussing support for custom constructors under interactions with future or related mechanisms, since the syntax will need to be carefully aligned with custom constructors in general.
The text under Example 3 for declarations I assume should go with Example 4.
As a nitpick: In Example 3 for case constructors, declaring a variable with the name return isn't supported, since return is a reserved keyword.
There is a lot to dive into for the match statement or expression suggested for case deconstruction, and I think it would make sense to potentially consider it as a separate proposal with DUs being a related future modification to keep in mind when working that out. Similarly, I'd want to make anonymous DUs a separate proposal that can be considered under interactions with future mechanisms.
My suggestion for how to proceed is to first focus on the match statement, since it is the piece that can be introduced independently, and I believe will be required to move forward with this. What do you think?
As a side note, it looks like anonymous DUs are on track for possible inclusion in F# with FS 1092: https://github.com/fsharp/fslang-design/pull/512! 💕
I'd like to emphasize the importance of this feature request to add discriminated unions to the language as this would increase programming safety. Concrete case in point: In a project by a customer at https://github.com/npbauman/COVOs-Azure/blob/guenp/vqe-nb/VQE%20for%20LiH%20-%20debugging-Copy1.ipynb
operation PrepareUCCSDAnsatz(JWEncodedData: JordanWignerEncodingData, theta1: Double, theta2: Double, theta3: Double, register: Qubit[]) : Unit is Adj {
    let (nSpinOrbitals, fermionTermData, inputState, energyOffset) = JWEncodedData!;
    let (stateType, JWInputStates) = inputState;
    let inputStateParam = (
        3, // Prepare the UCCSD state
        [
            JordanWignerInputState((theta1, 0.0), [2, 0]),
            JordanWignerInputState((theta2, 0.0), [3, 1]),
            JordanWignerInputState((theta3, 0.0), [2, 3, 1, 0]),
            JWInputStates[0]
        ]
    );
    Message($"{inputStateParam}");
    _PrepareTrialState(inputStateParam, register);
}
there was a bug in the definition of the first component of inputStateParam that was difficult to troubleshoot. It was eventually traced back to an incorrect dispatch on numerical values (namely 1, 2, 3 of type Int), depending on the VQE chemistry ansatz chosen, which likely would have been prevented by a discriminated union type, e.g. (fci-ansatz | fci-sparse-ansatz | qcc-ansatz).
Because choosing an incorrect value for the ansatz did not lead to a compilation or run-time error but rather to incorrect final energies, debugging took a long time and involved several people. This might have been avoided if the dispatch could have been done by matching on the values of a union type. @cgranade @guenp @crazy4pi314 @bettinaheim @efratshabtai
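To make the difference concrete, here is a minimal sketch in Rust (whose enums are a DU analogue) rather than Q#, since the DU syntax is still only proposed. The type name Ansatz, its cases, and the mapping of the codes 1, 2, 3 to those cases are all invented for illustration; they are not taken from the chemistry library.

```rust
/// Hypothetical ansatz tags, loosely named after the cases suggested above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Ansatz {
    Fci,
    FciSparse,
    Qcc,
}

/// Integer-coded dispatch. Any Int type-checks, so a wrong code is only
/// noticed if the author remembers a fallback arm, and a wrong-but-valid
/// code (say 2 instead of 3) is accepted silently.
/// The code-to-name mapping here is an assumption made for this sketch.
fn describe_by_code(code: i64) -> Option<&'static str> {
    match code {
        1 => Some("fci"),
        2 => Some("fci-sparse"),
        3 => Some("qcc"),
        _ => None,
    }
}

/// Tagged dispatch. Only the three declared cases can be passed in, and the
/// compiler rejects this match if one of them is left unhandled.
fn describe(ansatz: Ansatz) -> &'static str {
    match ansatz {
        Ansatz::Fci => "fci",
        Ansatz::FciSparse => "fci-sparse",
        Ansatz::Qcc => "qcc",
    }
}
```

With the tagged version, the call site reads describe(Ansatz::Qcc) instead of passing a bare 3, so the intent is visible where the value is chosen.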
Plus one for DUs! 💯 In F# they are one of the key features to make coding such a safe, parsimonious & pleasant experience. So pleased to see this being added to Q#. Bravo!
Suggestion
This suggestion outlines how to add discriminated unions to Q#. The overall objective of this design is to extend and generalize the current design of user defined types (UDTs). It should not break current UDT syntax, and current UDTs should be thought of as single-case DUs.
NB: This was originally proposed on the compiler repo, and there is a lot of existing discussion there, most of which has been incorporated here. Thanks to @samarsha, @cgranade, @msoeken and @guenp for input to this proposal 💖
Considerations
Firstly, discriminated unions (DUs), or tagged unions, are a fairly common language feature in functional languages and will allow users of Q# to leverage coding patterns they are already familiar with. Scala calls DUs case classes, Rust has DUs called enums, Haskell has sum types, and F# calls them DUs. This gives us a lot of examples to draw from and evaluate to create the right feature for Q#.
Secondly, DUs can make code more robust as they allow you to use the type system to reduce bugs or unhandled edge cases. In Examples there are some samples of how DUs can improve common Q# patterns and usage. The main patterns I have run into where they might be useful fall into the following categories:
- LittleEndian or BigEndian cases of Qubit[]
- Maybe<Int>, which requires type parameterized UDTs (proposed here)
If this is implemented in a way that allows all current UDT syntax to remain valid, then this shouldn't be a breaking change and should be strictly a feature add.
It is not clear to me that there is another way to achieve this functionality other than creating a UDT where each field is a valid case/tag and using if statements to filter on the populated field manually. This is similar to how I have had to implement what here would be a Maybe DU, where I used a tuple (Bool, (Int, Int)) as a return type signature (see this issue). There are certainly other designs for how to add DUs to Q#, but this proposal focuses on making it a non-breaking change by extending current UDTs.
For the formatting applications, @samarsha proposed the following as an alternate solution for the example of endianness that can be used right now in Q#:
This would definitely work, but I think, as Sarah notes, that it doesn't really achieve the conceptual abstraction, and the user will still have to track some conversions.
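For comparison, this is roughly what the endianness case looks like when the interpretation is carried by the type itself. It is a sketch in Rust, not proposed Q# syntax: the name Register and the function to_integer are made up for this example, and Vec<bool> merely stands in for Qubit[].

```rust
/// A register tagged with how its bits should be read.
/// `Vec<bool>` is a classical stand-in for `Qubit[]` (assumption for this sketch).
#[derive(Debug, Clone, PartialEq)]
enum Register {
    LittleEndian(Vec<bool>),
    BigEndian(Vec<bool>),
}

/// Interprets the register as an unsigned integer according to its tag.
fn to_integer(register: &Register) -> u64 {
    match register {
        // Little endian: index 0 holds the least significant bit,
        // so we fold from the last element back to the first.
        Register::LittleEndian(bits) => bits
            .iter()
            .rev()
            .fold(0, |acc, &bit| (acc << 1) | u64::from(bit)),
        // Big endian: index 0 holds the most significant bit.
        Register::BigEndian(bits) => bits
            .iter()
            .fold(0, |acc, &bit| (acc << 1) | u64::from(bit)),
    }
}
```

The caller no longer tracks conversions by hand, but every function that accepts a Register has to handle both cases, which is the cost raised earlier in this thread.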
Context
The most important feature of this design approach is that it feels natural with the current implementation for UDTs, and allows for all current UDT syntax to be valid.
DUs have the following components:
* DUs could also include anonymous DUs, as detailed by @cgranade, @guenp and @msoeken here and at the end of the examples.
The following section goes into much more detail on what each of these parts could look like for Q#. Much of the suggested syntax here comes from the examples that I looked at in this reference on F#. As noted above, there are many languages that have similar features, but the F# syntax seemed to match most closely with established patterns in Q#.
Examples
The three main pieces of syntax that need to be prototyped to extend UDTs to DUs:
To that end, the newtype keyword would be used to declare new DUs as well.
@samarsha notes that:
The following sections give example suggested designs for the three syntaxes listed above.
DU declarations
Since the goal is to make this feel natural with the current implementation for UDTs, here is a quick review of that syntax.
You can see it uses the newtype statement to declare a new UDT, and there are named items in the definition itself. For more details and examples see the docs.
The following examples would all be valid syntax for this design.
Example 1:
Standard UDT syntax where the constructor function implicitly will have the same name as the type.
Example 2:
Standard UDT syntax where the constructor function is explicit.
Example 3:
A two case DU where there is no leading bar.
Example 4:
A two case DU where a complex number can have one of two different coordinate systems.
Example 5:
Standard UDT syntax where the constructor function is explicit, and has a leading bar.
Example 6:
Standard UDT syntax where the constructor function is implicit, and has a leading bar.
Example 7:
A two case DU where there is no leading bar.
Example 8:
A multi-case DU where all the case types are Unit, a.k.a. an enum. Could also potentially omit parens.
Example 9:
A multi case DU that generalizes implementation options for an adder.
The design here also allows for type parameterization (see this proposal for parameterized UDTs). This would significantly expand what we can do with DUs, in particular creating a Maybe DU as shown in this example. The type parameter 'T can be any other valid type in Q#; see the docs for more info.
Example 10:
A parameterized two case DU where there is a leading bar.
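For readers who know DUs from other languages, the shapes described in this section correspond roughly to the following Rust declarations. This is only an analogue, not the proposed Q# syntax: the case names Cartesian and Polar and the helper magnitude are assumptions made for this sketch.

```rust
/// Two-case DU: a complex number in one of two coordinate systems.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Complex {
    Cartesian(f64, f64), // (real, imaginary)
    Polar(f64, f64),     // (magnitude, argument)
}

impl Complex {
    /// Both cases are handled in one place; adding a third case would make
    /// this match fail to compile until it is handled too.
    fn magnitude(self) -> f64 {
        match self {
            Complex::Cartesian(re, im) => re.hypot(im),
            Complex::Polar(r, _) => r.abs(),
        }
    }
}

/// Multi-case DU where no case carries data, i.e. an enum.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Pauli {
    I,
    X,
    Y,
    Z,
}

/// Type-parameterized two-case DU.
#[derive(Debug, Clone, PartialEq)]
enum Maybe<T> {
    Some(T),
    None,
}
```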
Case constructors
You have seen some of the syntax for the case constructors already: they are the bits similar to function declaration syntax (e.g. Some(Int)). One way to think about the design here is that you are defining a collection of UDT constructors, one for each case. You are declaring cases by declaring their constructors. That means that the case constructor functions will be declared as globals in the same namespace that the DU is declared in. As a consequence, you will not be able to have DUs with case constructors that have the same fully qualified name. I don't think that is a problem at the moment, as it follows the same pattern by which a UDT and a function cannot share the same name in a namespace.
Examples of valid case constructor syntax:
Example 1:
Given the Maybe<'T> declaration, declaring a case that returns a type of Maybe<Int>.
Example 2:
Constructs a register that represents an integer, that can be interpreted as either little or big endian.
Example 3:
Case constructor that has type Unit.
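The idea that declaring a case also declares an ordinary constructor function can be seen in Rust, where a data-carrying enum case is itself a function value. The sketch below is an analogue only; wrap_all and find_index are hypothetical helpers written for this illustration.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Maybe<T> {
    Some(T),
    None,
}

/// The case constructor `Maybe::Some` is passed around like any other
/// function of type `i64 -> Maybe<i64>`.
fn wrap_all(values: Vec<i64>) -> Vec<Maybe<i64>> {
    values.into_iter().map(Maybe::Some).collect()
}

/// Returns the position of `needle`, or the data-free `None` case.
/// The standard library's `Option` is spelled out in full to keep it
/// distinct from the `Maybe` defined above.
fn find_index(haystack: &[i64], needle: i64) -> Maybe<usize> {
    match haystack.iter().position(|&x| x == needle) {
        Option::Some(index) => Maybe::Some(index),
        Option::None => Maybe::None,
    }
}
```

Note that Rust scopes constructors under the type name (Maybe::Some), which sidesteps the name-collision question; the proposal here instead places them directly in the enclosing namespace.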
Case deconstructors
For the deconstructor here, an expression design could make a lot of sense. That allows you to set a variable equal to the result of a deconstruction statement. See the following examples of how this could work.
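As a point of reference for the expression-style design, this is how a match expression binds its result to a variable in Rust. It is a sketch of the general pattern, not the proposed Q# syntax, and with_default is a hypothetical helper.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Maybe<T> {
    Some(T),
    None,
}

/// Unwraps a `Maybe<i64>`, falling back to `default` for the `None` case.
fn with_default(value: Maybe<i64>, default: i64) -> i64 {
    // `match` is an expression: each arm yields a value of the same type,
    // and the value of the selected arm is bound to `result`.
    let result = match value {
        Maybe::Some(inner) => inner,
        Maybe::None => default,
    };
    result
}
```

Because every arm must produce a value, forgetting a case is a compile-time error rather than an uninitialized variable at run time.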
Example 1:
Example 3 from the previous section, where we use match to deconstruct the cases possible for Maybe<'T>.
Use case: Library simplification
Via @cgranade in previous thread.
From @msoeken in previous thread, I can maybe stub an example for an enum of Unit cases:
Use case: Anonymous DUs
Detailed by @cgranade, @guenp and @msoeken here: