microsoft / QuantumLibraries

Q# libraries for the Quantum Development Kit
https://docs.microsoft.com/quantum
MIT License

API: Consider the formatted output/logging functions #419

Open kuzminrobin opened 3 years ago

kuzminrobin commented 3 years ago

(Request For Early Feedback. Work in Progress)

Latest State


Earlier Story

(Links DoubleAsStringWithFormat(), IntAsStringWithFormat(), Standard numeric format strings, Custom numeric format strings).

Discussion:

[12:42 PM] Bettina Heim Hi all, As part of adding support for Q# to the QIR runtime, we came across the logging functions that allow to give a formatting option. Right now, the option seems to be exactly what C# provides. Since this is pretty extensive and I am not eager to blindly duplicate the C# behavior in our native implementation, I would like to gather your input regarding what formatting options we need and want to support going forward?

[12:50 PM] Chris Granade (THEY/THEM) There's not a single line of Q# in all of our libraries or samples that calls either function... Thinking to kind of first principles for the language and libraries, though, the whole point of Message is to provide optional logging information back to the host. To that end, I wouldn't think that it's intended that we should provide UX features like formatting at the Q# level. I know Bettina, Alan and I played around with the concept of a function that would emit logging info other than strings years ago (e.g.: if a Q# program wants to send a Double to the host as a log message); that may be better than trying to handle formatting at the Q# level.

[12:57 PM] Bettina Heim Chris Granade (THEY/THEM) Do you know why these have been added in the first place? Do you know of anyone else that could be using them? Would it be good to keep some minimal formatting options, or would you suggest deprecating them? My inclination would be to allow for some minimal formatting, but definitively not the extensive support that C# provides. Robin Kuzmin, What kind of formatting support does C++ provide?

[1:32 PM] Robin Kuzmin The simple option is the printf() formatting of C, see "Conversion specifiers" (or a broader picture - "Format of the format string") in the printf() manual page. As for C++, let me summarize... As a short answer I would say, C++ formatted output is harder to adapt as an implementation of the Q# calls. And that C++ formatted output would be described here. E.g. see "hex" that will take you to here and show the examples like std::cout << std::hex, after which all the integer output will be in hexadecimal form.
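For readers unfamiliar with the printf() conversion specifiers mentioned above, here is a minimal sketch using Python's `%` operator, which follows the same C conventions. It is only an illustration of the specifier style and not part of any proposed Q# API; the variable names are made up for this example.

```python
# printf-style ("%") formatting in Python follows C's conversion specifiers.
value = 42
number = 1234567.8901678

hex_plain = "%x" % value       # lowercase hexadecimal
hex_prefixed = "%#x" % value   # "#" adds the 0x prefix
hex_padded = "%04X" % value    # zero-padded to width 4, uppercase digits
fixed = "%.4f" % number        # fixed-point, 4 digits after the point
scientific = "%.4e" % number   # scientific notation, 4 digits after the point

print(hex_plain, hex_prefixed, hex_padded, fixed, scientific)
```

Unlike `std::cout << std::hex`, each specifier here applies to a single value and leaves no state behind, which is part of why it maps more naturally onto a per-call formatting function.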

[1:48 PM] Robin Kuzmin As a minimalistic implementation I would say that we need

[1:48 PM] Robin Kuzmin And most (if not all) of that is already implemented, I believe.

[2:07 PM] Chris Granade (THEY/THEM)

Do you know why these have been added in the first place?

I don't, no; they are amongst the first intrinsics added to the language, IIRC, back when everything was somewhat chaotic.

Do you know of anyone else that could be using them? Would it be good to keep some minimal formatting options, or would you suggest deprecating them?

I'm not aware of such, no; was unable to find a single usage when I looked, so I'm not sure there's a lot of demand for that feature. My initial thought would be to deprecate entirely for now, but happy to discuss further.

[8:20 PM] Robin Kuzmin I looked at .NET's (C#'s) formatting - Standard numeric format strings and Custom numeric format strings. It looks like for the minimalistic implementation (mentioned in my previous 2 posts) we also need (EN-US locale):

msoeken commented 3 years ago

The new C++ standard formatting library is similar to Python's formatting capabilities.

cgranade commented 3 years ago

@kuzminrobin: Thank you for opening this; it sounds like there's a few different suggestions made in the thread above. Would you be willing to clarify what changes you'd propose to the Q# API surface so that we can get that scheduled for Thursday's API review board meeting? Thank you!

kuzminrobin commented 3 years ago

@cgranade, @bettinaheim, Yes, it definitely makes sense to come up with a particular proposal. I'm diving into this and will try to figure out the minimalistic formatting proposal by the API Review Meeting.

@msoeken, thank you for sharing your knowledge! The C++ standard formatting library has been added in C++20, so there may be temporary issues with it. In particular, we currently compile our QIR/C++ code with the C++14 compiler flag, and attempting to use C++17 causes compilation errors in the headers provided by MSVC++. I haven't tried C++20; it may require an update to the clang/LLVM toolchain, and I expect even more issues with a migration to C++20. So the C++ standard formatting library is not something we can start using immediately (it may require updates in MSVC++). Nonetheless, I don't see that as a blocking issue. We can still move towards that kind of formatting in the minimalistic formatting facility in Q#, implement it manually in C++14 for now, and later replace the manual implementation with direct calls to the C++ standard formatting library. It seems to me a good idea to move towards formatting that is already known to a broad audience from popular programming languages such as Python (and from the less-known C++20 features). I'm working on this.

kuzminrobin commented 3 years ago

Wow!

(C++20) Standard format specification For basic types and string types, the format specification is based on the format specification in Python.

That's definitely a way to go...

kuzminrobin commented 3 years ago

(Work in progress. Can change)

For Int

Either FormattedI() or Interpolated String Literals.

FormattedI() Proposed:

namespace Microsoft.Quantum.Convert {
    function FormattedI(fmt : String, value : Int) : String { 
        body intrinsic; 
    }
}

Based on Python's Format Specification Mini-Language. Usage Example:

// Hexadecimal representation for Int:
let intNum = 42;
let logString = FormattedI("x", intNum);
// Todo: Test corresponding formatting in C++.
// Format String    logString
// "x"             "2a"
// "#x"            "0x2a"
// "04X"           "002A"
// "#06X"          "0X002A"
// "#015_X"        "0X000_0000_002A"    // Optional
Message(logString);
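As a rough illustration of the intended behavior, the table above can be reproduced with Python's built-in `format()`, since the proposal borrows Python's Format Specification Mini-Language. The helper `formatted_i` below is a hypothetical stand-in for the proposed `FormattedI()`, not an existing Q# or QIR function.

```python
def formatted_i(fmt: str, value: int) -> str:
    """Hypothetical Python stand-in for the proposed FormattedI(fmt, value)."""
    # Reject non-integers (and bool, which is an int subclass in Python)
    # to mimic the Int-only signature of the proposal.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("formatted_i expects an int")
    return format(value, fmt)


# Print the same table as in the usage example above.
for fmt in ("x", "#x", "04X", "#06X", "#015_X"):
    print(f"{fmt!r:10} -> {formatted_i(fmt, 42)!r}")
```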

Interpolated String Literals

If chosen then the topic is to be considered in a separate proposal against the qsharp-language. Based on

Usage Example:

// Hexadecimal representation for Int:
let intNum = 42;
// Todo: Test corresponding formatting in C++.
// Harder to implement, likely requires language change (grammar change):
let logString = $"dec: {intNum}; hex: {intNum:x}; hex: {intNum:#x}; hex: 0x{intNum:04X}; hex: {intNum:#015_X}";
// "dec: 42; hex: 2a; hex: 0x2a; hex: 0x002A; hex: 0X000_0000_002A"
// "dec: {intNum}" is already implemented.
Message(logString);

Tested in Python:

>>> intNum = 42
>>> f"dec: {intNum}; {intNum:x}; {intNum:#x}; 0x{intNum:04X}; {intNum:#06X}; {intNum:#015_X}"
'dec: 42; 2a; 0x2a; 0x002A; 0X002A; 0X000_0000_002A'

kuzminrobin commented 3 years ago

(Work in progress. Can change)

For Double

Either FormattedD() or Interpolated String Literals.

FormattedD() Proposed:

namespace Microsoft.Quantum.Convert {
    function FormattedD(fmt : String, value : Double) : String { 
        body intrinsic; 
    }
}

Usage Examples:

// Fixed point representation for Double:
let doubleNum = 1234567.8901678;   // Or NaN() or <infinity>.
let logString = FormattedD(".4f", doubleNum);
// Todo: Test corresponding formatting in C++.
// Format String    logString
// ".4f"            "1234567.8902"  // Or `nan` or `inf`.
// ".4F"            "1234567.8902"  // Or `NAN` or `INF`. Optional.
// "012.2f"         "001234567.89"  // Or `000000000nan` or `000000000inf`.
Message(logString);

// Scientific notation for Double:
let doubleNum = 1234567.8901678;   // Or NaN() or <infinity>.
let logString = FormattedD(".4e", doubleNum);
// Todo: Test corresponding formatting in C++.
// Format String    logString
// ".4e"            "1.2346e+06"  // Or `nan` or `inf`
// ".4E"            "1.2346E+06"  // Or `NAN` or `INF` Optional.
// "010.2e"         "001.23e+06"  // Or `0000000nan` or `0000000inf`.
Message(logString);
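The expected strings in both tables, including the `nan` and `inf` cases, can be checked with a small Python sketch. `formatted_d` is a hypothetical stand-in for the proposed `FormattedD()` and simply defers to Python's `format()`; a native implementation could differ in corner cases such as zero-padding of `nan`.

```python
import math


def formatted_d(fmt: str, value: float) -> str:
    """Hypothetical Python stand-in for the proposed FormattedD(fmt, value)."""
    return format(float(value), fmt)


double_num = 1234567.8901678
for fmt in (".4f", ".4F", "012.2f", ".4e", ".4E", "010.2e"):
    # Show the regular value next to the special values for each format string.
    results = [formatted_d(fmt, v) for v in (double_num, math.nan, math.inf)]
    print(f"{fmt!r:10} -> {results}")
```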

Interpolated String Literals

If chosen then the topic is to be considered in a separate proposal against the qsharp-language. Usage Examples:

let doubleNum = 1234567.8901678;   // Or NaN() or <infinity>.
let logString = $"{doubleNum:.4f}, {doubleNum:.4F}, {doubleNum:012.2f}";
// Todo: Test corresponding formatting in C++.
// "1234567.8902, 1234567.8902, 001234567.89"
// "nan, NAN, 000000000nan"
// "inf, INF, 000000000inf"
let doubleNum = 1234567.8901678;   // Or NaN() or <infinity>.
let logString = $"{doubleNum:.4e}, {doubleNum:.4E}, {doubleNum:010.2e}";
// Todo: Test corresponding formatting in C++.
// "1.2346e+06, 1.2346E+06, 001.23e+06"
// "nan, NAN, 0000000nan"
// "inf, INF, 0000000inf"

Tested in Python:

>>> doubleNum = 1234567.8901678
>>> f"{doubleNum:.4f}, {doubleNum:.4F}, {doubleNum:012.2f}"
'1234567.8902, 1234567.8902, 001234567.89'

>>> import math
>>> doubleNum = math.nan
>>> f"{doubleNum:.4f}, {doubleNum:.4F}, {doubleNum:012.2f}"
'nan, NAN, 000000000nan'

>>> doubleNum = math.inf
>>> f"{doubleNum:.4f}, {doubleNum:.4F}, {doubleNum:012.2f}"
'inf, INF, 000000000inf'

>>> doubleNum = 1234567.8901678
>>> f"{doubleNum:.4e}, {doubleNum:.4E}, {doubleNum:010.2e}"
'1.2346e+06, 1.2346E+06, 001.23e+06'

>>> doubleNum = math.nan
>>> f"{doubleNum:.4e}, {doubleNum:.4E}, {doubleNum:010.2e}"
'nan, NAN, 0000000nan'

>>> doubleNum = math.inf
>>> f"{doubleNum:.4e}, {doubleNum:.4E}, {doubleNum:010.2e}"
'inf, INF, 0000000inf'

cgranade commented 3 years ago

(Work in progress. Can change)

Duly noted; in the interest of getting this through quickly, please take the following as early feedback.

Format Specification Mini-Language

// Hexadecimal representation for Int:
let intNum = 42;
let logString = Format(intNum, "x");
// Format String    logString
// "x"             "2a"
// "#x"            "0x2a"
// "04X"           "002A"
// "#04X"          "0x002A"
// "#012_X"        "0x0000_0000_002A"    // Optional
Message(logString);

Format examples

In terms of Q# style guide and design principles, we'll need a noun or adjective phrase instead of Format, since this new callable should likely be a function rather than an operation.

Aside from that, which order of inputs is most consistent with currying? I'd imagine the same format string getting applied to multiple different numeric values would be significantly more common than the same numeric value getting formatted with different strings, such that Formatted : (String, Int) -> String might make more sense.
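The currying argument can be illustrated in Python with `functools.partial`, which plays the role of Q# partial application here. This is only a sketch: `formatted_i` is a hypothetical stand-in for a format-string-first `Formatted : (String, Int) -> String`.

```python
from functools import partial


def formatted_i(fmt: str, value: int) -> str:
    """Hypothetical stand-in: format string first, value second."""
    return format(value, fmt)


# Fix the format string once, then reuse it for many values.
as_hex = partial(formatted_i, "#x")
hex_strings = [as_hex(v) for v in (10, 42, 255)]
print(hex_strings)
```

With the value as the first argument instead, the same reuse would need a wrapper function or keyword arguments, which is the inconvenience the comment above is pointing at.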

With respect to input types, is it expected that this should only ever work on integers? If we want formatted strings for other kinds of inputs, then we'll need type suffixes to distinguish them:

function FormattedI(fmt : String, value : Int) : String { ... }
function FormattedL(fmt : String, value : BigInt) : String { ... }
function FormattedD(fmt : String, value : Double) : String { ... }

Other questions:

// Hexadecimal representation for Int:
let intNum = 42;
// Harder to implement, likely requires language change (grammar change):
let logString = "dec: {intNum}; hex: {intNum:x}; hex: {intNum:#x}; hex: 0x{intNum:04X}";
// "dec: 42; hex: 2a; hex: 0x2a; hex: 0x002A"
Message(logString);

@bettinaheim can speak to this better than I can, but if you want to go on and suggest that as a language change, filing a Q# suggestion would be the right next step to kick off that process. I would be slightly concerned with the use of : here, though, given that in Q# a single colon means exclusively "has the type of," but I digress.

kuzminrobin commented 3 years ago

Thank you for the early feedback, @cgranade.

(For oneself: To do: See also qsharp-runtime\src\Simulation\QSharpFoundation\Convert\Convert.qs:

    function DoubleAsString(a : Double) : String {
        return $"{a}";
    }

BoolAsString(), IntAsString(), fail $"Unexpected Pauli value {p}.", microsoft/qsharp-language#81,

msoeken commented 3 years ago

Thanks for your response, @kuzminrobin

The C++ standard formatting library has been added in C++20

This just means that it has been part of the standard since C++20, and compilers will pick it up in their standard libraries over time. However, there is a popular implementation of the formatting proposal, called libfmt, which can be used with earlier versions of C++. But as you already mentioned, the main takeaway is the formatting syntax and its similarity to Python's.

kuzminrobin commented 3 years ago

(Work in progress. Can change)

What should happen to IntAsStringWithFormat and other similar existing functions? Should those be deprecated?

DoubleAsStringWithFormat(), IntAsStringWithFormat(): It is proposed to deprecate them. I don't feel I have enough experience to propose the deprecation period at the moment. See Q# API Design Principles, search for "deprecation period".

kuzminrobin commented 3 years ago

How does this relate to other current or proposed Q# features? E.g., how would this API change if we get discriminated unions feature from microsoft/qsharp-language#51?

I have run through the Discriminated Unions twice, and through the Pattern matching and match expressions. For now I don't foresee any contradictions.

cgranade commented 3 years ago

What should happen to IntAsStringWithFormat and other similar existing functions? Should those be deprecated?

DoubleAsStringWithFormat(), IntAsStringWithFormat(): If it is known for sure that nobody really uses them, then to exclude from Q# right away. If there is any risk that somebody might be using them, then to deprecate them.

I don't know of any usage of either DoubleAsStringWithFormat or IntAsStringWithFormat, but I also cannot preclude that there's usage we've missed. In either case, however, as a matter of our API design principles, we work to make sure that, when possible, we deprecate for at least six months before removing.

I have run through the Discriminated Unions twice, and through the Pattern matching and match expressions. For now I don't foresee any contradictions.

Fair enough, thanks for looking at that. It can also be important to make sure we don't miss an opportunity for other designs using future functionality; e.g., could formatting options be handled in a structured way by using DUs instead of string formats to help ensure at compile time that formats are sensible?

newtype FloatingPointFormat = (
    LeadingZero: Bool,
    Precision: Maybe<Int>,
    Width: Maybe<Int>
);

function FormattedD(fmt : Maybe<FloatingPointFormat>, value : Double) : String { ... }
function DefaultFloatingPointFormat() : FloatingPointFormat { return FloatingPointFormat(false, None(), None()); }

// ...

FormattedD(None(), value); // use default format
FormattedD(DefaultFloatingPointFormat() w/ LeadingZero <- true, value); // {0f}

Similarly, we could imagine a type-safe version of the integer formatting options:

newtype IntegerDisplayRadix = Hex() | Decimal() | Octal() | Binary();
newtype IntegerFormat = (
    Radix: IntegerDisplayRadix,
    Width: Maybe<Int>,
    ThousandsSeparator: Maybe<String>
);

Given that Q# is ideally intended to minimize runtime failures by using tools such as type safety, such a design could help avoid failures if a user passes an incorrect string format (e.g.: ".4y", using an unrecognized format suffix "y"). Even though that Q# language suggestion is pretty far out (assuming it's accepted at all), I personally find it's really helpful to avoid future breaking changes to think about what library designs may look like if we adopt various language features in the future.
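To make the contrast concrete, here is a hedged Python analogue of the structured-format idea. The dataclass mirrors the `FloatingPointFormat` sketch above, with `Optional` standing in for `Maybe`; `to_spec` and this `formatted_d` are names invented for the illustration. Python checks none of this at compile time, so the sketch only shows the shape of the API, not the type-safety benefit itself.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class FloatingPointFormat:
    leading_zero: bool = False
    precision: Optional[int] = None
    width: Optional[int] = None

    def to_spec(self) -> str:
        """Render the structured options as a Python-style format spec."""
        zero = "0" if self.leading_zero else ""
        width = "" if self.width is None else str(self.width)
        precision = "" if self.precision is None else f".{self.precision}"
        return f"{zero}{width}{precision}f"


def formatted_d(fmt: Optional[FloatingPointFormat], value: float) -> str:
    """Hypothetical structured counterpart of FormattedD; None means default."""
    spec = (fmt or FloatingPointFormat()).to_spec()
    return format(value, spec)


default = FloatingPointFormat()
padded = replace(default, leading_zero=True, width=12, precision=2)
print(formatted_d(None, 1.5))                # default format
print(formatted_d(padded, 1234567.8901678))  # same result as "012.2f"

# The string-based route only discovers a bad suffix at runtime:
try:
    format(1.5, ".4y")
except ValueError as err:
    print("runtime failure:", err)
```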

kuzminrobin commented 3 years ago

It can also be important to make sure we don't miss an opportunity for other designs using future functionality; e.g., could formatting options .. be handled in a structured way by using DUs instead of string formats to help ensure at compile time that formats are sensible?

Very curious insight... For now I feel somewhat "short-sighted" in Q# to estimate that. But you raised a very good point.

Ideally I would prefer to reserve a space in the language/library for providing both options for the users to choose.

Is there any proposal for the multiple overloads to be supported in Q#? Such that both the function FormattedD(fmt : Maybe<FloatingPointFormat>, value : Double) : String { ... } and function FormattedD(fmt : String, value : Double) : String { ... } can co-exist in the same namespace? Or maybe the idea of overloads contradicts the philosophy of the language?

bettinaheim commented 3 years ago

@kuzminrobin Overloading in Q# does not exist, but we do support type parameterizations. Currently, any argument item whose type is parameterized can only be "treated as a black box", meaning you cannot do anything with it that would require type information. For the future, something like type classes could give a way to make these more powerful and allow something like you asked about above. This is still going to be quite some time out though.

A question on the formatting: I currently only see the option to format integers in hexadecimal, and not binary; is that correct, or did I overlook it? Does Python indeed not have the option to output integers in binary format? Another question: for quite some time we will need to support both the C# runtime and the QIR runtime; have you looked at whether the proposed formatting is also implementable in C# with reasonable effort?

cgranade commented 3 years ago

Also the users may be familiar with the string-based formats.

Agreed, which is why I suggested the DU-based format only as a point of discussion; I'm not convinced that would be better necessarily, so much as that there are some ways in which compile-time verification fits into the design of Q#. It may be we can do similar with stringly typed formatting specifications (e.g.: never fail on invalid formats, but fall back to a string like "<unknown format>", so that quantum programs are not halted by runtime failures in string formatting).
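The "never fail on invalid formats" fallback could look roughly like the following Python sketch. The function name `safe_formatted` is hypothetical; the placeholder string is the one suggested above.

```python
def safe_formatted(fmt: str, value) -> str:
    """Format value with fmt; return a placeholder instead of raising."""
    try:
        return format(value, fmt)
    except ValueError:
        # Python raises ValueError for an unrecognized format code such as "y".
        return "<unknown format>"


print(safe_formatted(".2f", 1.5))   # valid format string
print(safe_formatted(".4y", 1.5))   # invalid suffix, falls back
```

The trade-off is that a typo in a format string then shows up only in the log output rather than as an error, so it may go unnoticed for longer.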

Ideally I would prefer to reserve a space in the language/library for providing both options for the users to choose.

Is there any proposal for the multiple overloads to be supported in Q#? Such that both the function FormattedD(fmt : Maybe<FloatingPointFormat>, value : Double) : String { ... } and function FormattedD(fmt : String, value : Double) : String { ... } can co-exist in the same namespace? Or maybe the idea of overloads contradicts the philosophy of the language?

To @bettinaheim's point, we don't have any language functionality to support that currently (indeed, this is why we need type suffixes like the D and I after Formatted). For the discussion on typeclasses, please see https://github.com/microsoft/qsharp-language/issues/149. For the discussion of anonymous DUs (e.g.: FormattedD(fmt : FloatingPointFormat | String | Unit, value : Double) : String), please see https://github.com/microsoft/qsharp-language/issues/51.

In either case, I don't mean to derail coming up with a proposal that works well in Q# as it is right now, so much as to try and keep an eye on the future so as to minimize any breaking changes we may need if those language suggestions and proposals are eventually adopted. For example, one resolution could be that string-based formatting specifications work well because if we eventually get anonymous DUs and/or typeclasses, that we could use those to provide something similar to overloads in this case, adding new functionality without breaking old.

kuzminrobin commented 3 years ago

I currently only see the option to format integers in hexadecimal, and not binary; is that correct, or did I overlook it? Does Python indeed not have the option to output integers in binary format?

@bettinaheim, while providing the usage examples I was thinking about illustrating the minimalistic implementation only, i.e. at least the hexadecimal for Ints and at least the fixed point and scientific notation for Doubles (assuming that I will do at least the minimalistic implementation in case we need to implement that manually, and/or if we are short of time, etc.). My current illustrations are based on a pass through the Python documentation only. Python formatting is much richer than my illustrations: Python has binary, octal, and decimal formatting too (in addition to alignment, filling, digit grouping, etc.). I still need to make a pass through C++20's Standard Formatting Library and through libfmt (which @msoeken has referred me to), which we may or may not decide to adopt to implement the formatting functions. I believe (or am even sure) that in some way we will still be able to make calls to the C++ implementation (even if for that I will need to pull C++20's Standard Formatting Library down to our C++14 compiler), and in that case we will get the full power of formatting for cheap (but we will not have to commit to supporting the full formatting; we can still keep commitments minimal ;-).
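For reference, a few of the richer Python options mentioned above (binary, octal, alignment, fill, digit grouping) look like this. These are plain Python built-ins shown for illustration; nothing here implies that the Q# proposal commits to supporting them.

```python
value = 42
big = 1234567

examples = {
    "#b": format(value, "#b"),    # binary with 0b prefix
    "#o": format(value, "#o"),    # octal with 0o prefix
    "d": format(value, "d"),      # decimal
    ">6": format(value, ">6"),    # right-aligned in a field of width 6
    "*^6": format(value, "*^6"),  # centered, filled with '*'
    ",": format(big, ","),        # comma digit grouping
    "_": format(big, "_"),        # underscore digit grouping
}

for spec, text in examples.items():
    print(f"{spec!r:6} -> {text!r}")
```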

have you looked at whether the proposed formatting is also implementable in C# with reasonable effort?

I need to look at this. If from C# we can forward the calls to C++ then we should be totally fine here. In that case I will just provide the same C++ implementation both for QIR and C#.

kuzminrobin commented 3 years ago

@cgranade,

stringly typed formatting specifications

(the letter 'i' in the word "stringly") in this particular context sounds very... equally-probable that you meant either "strongly typed" or "string typed"... ;-) I believe I still understand what you are saying. Thank you for clarification...

cgranade commented 3 years ago

A reference to a common (and likely unfair in this case!) riff on string-based APIs: http://wiki.c2.com/?StringlyTyped

kuzminrobin commented 3 years ago

(This idea has been abandoned. The reason is in the subsequent feedback. See later version below)

Single Formatted() function (or "member function" for strings and string literals)

Todo: Test in C++.

Library change

namespace Microsoft.Quantum.Convert {
    function Formatted(fmt : String, args: Tuple) : String { 
        body intrinsic; 
    }
}

Usage Examples

// The single `Formatted()` function:

let intVar = 42;
let doubleVar = 1234567.8901678;
let logStr = Formatted("{0:#x}, {1:010.2e}", 
                       (intVar, doubleVar)); // The programmer explicitly combines these args
                                             // into a single tuple.
// logStr = "0x2a, 001.23e+06"

let logStr = Formatted("{0:#x}", 
                       (intVar)); // For a single arg the conversion to tuple, 
                                  // i.e. extra parentheses, is still required. :-/
// logStr = "0x2a"

// Very unlikely option:
//let logStr = Formatted("{0:#x}, {1:010.2e}", 
//                       intVar, doubleVar); // These args (the number of which is variable) are combined 
//                                           // by the compiler into a single tuple. 
//                                           // Requires the compiler/grammar change. Implies an implicit 
//                                           // conversion to tuple (which is very unlikely to be added to the language).
// The "member function" for strings and string literals:

// Requires compiler and/or grammar change.

let formatString = "{0:#x}, {1:010.2e}";
let logStr = formatString::formatted(intVar, doubleVar); // The "member function" for a string.
// logStr = "0x2a, 001.23e+06"
// Converted by the compiler to 
// let logStr = Formatted(formatString, (intVar, doubleVar));

let logStr = "{0:#x}, {1:010.2e}"::formatted(intVar, doubleVar); // The "member function" for a string literal.
// logStr = "0x2a, 001.23e+06"
// Converted by the compiler to 
// let logStr = Formatted("{0:#x}, {1:010.2e}", (intVar, doubleVar));

Nested and Run-Time-Dependent Formatting

If we are formatting the output into a resizable entity (e.g. a window), during/upon the resizing the formatting requirements can change at runtime. In those cases the formatting options (such as width, precision) specified at compile time like this - "{1:010.2e}" - will not work. We will need the runtime values for such formatting options. Currently available workaround:

// Get the formatting string at runtime from the interpolated string literal:

let width = 10;  // Runtime value, can change in a loop or during/upon resizing.
let precision = 2;  // The same.
let runtimeFormatString = $"{1:0{width}.{precision}e}";
// runtimeFormatString = "{1:010.2e}"

// Todo: (Above) If 
//      the left-most `{` (immediately after `"`) 
//      and the right-most `}` (immediately before `"`) 
// confuse the compiler, then the possible workarounds are 
// $"\{1:0{width}.{precision}e\}";      // Escaped `{` and `}`.
// $"{{1:0{width}.{precision}e}}";      // Repeated `{` and `}`.
// "{1:0" + $"{width}.{precision}" + "}";    // Concatenation of strings.

// Use the runtime format string:
let logStr = Formatted(runtimeFormatString, (doubleVar));
// logStr = "001.23e+06"

(see String Literals). Looks more cumbersome than it could be.

After the proposal is implemented, can be shortened to

let logStr = Formatted("{2:0{0}.{1}e}", 
                       (width, precision, doubleVar)); // {0} is `width`, {1} is `precision`, {2..} is `doubleVar`.
// logStr = "001.23e+06"

cgranade commented 3 years ago

(Work in progress. Can change)

In the same spirit as before, please take this as preliminary feedback, reflecting that this proposal is still in progress. Thanks!

Single Formatted() function (or "member function" for strings and string literals)

Library change

namespace Microsoft.Quantum.Convert {
    function Formatted(fmt : String, args: Tuple) : String { 
        body intrinsic; 
    }
}

I think you'd run into trouble here, in that Tuple isn't a Q# type; rather, you can construct tuple types using () to group one or more types together. E.g.: (Int, Double), (Qubit, (Int -> String), Bool) and (Pauli) are three distinct tuple types. From that perspective, you would have to say here which particular tuple type args should be.

Usage Examples

let intVar = 42;
let doubleVar = 1234567.8901678;
let logStr = Formatted("{0:#x}, {1:010.2e}", 
                       (intVar, doubleVar)); // The programmer explicitly combines these args
                                             // into a single tuple.
// logStr = "0x2a, 001.23e+06"

let logStr = Formatted("{0:#x}", 
                       (intVar)); // For a single arg the conversion to tuple, 
                                  // i.e. extra parentheses, is still required. :-/

Q# includes a concept called singleton–tuple equivalence, which states that 'T and ('T) are exactly the same type. It's not even a subtype or casting relationship, in that (Int) can be used anywhere that Int is used; they're just different ways of writing out the same exact type.

Similarly, (((((42))))) and 42 are exactly the same value, such that in order to pass intVar to an input of type (Int), the programmer is not required to use any additional parens.

// Very unlikely option:
//let logStr = Formatted("{0:#x}, {1:010.2e}", 
//                       intVar, doubleVar); // These args (the number of which is variable) are combined 
//                                           // by the compiler into a single tuple. 
//                                           // Requires a compiler change. Implies an implicit 
//                                           // conversion to tuple (which is very unlikely to be added to the language)

At the moment, Q# doesn't include any sort of varargs feature; it seems like that could be difficult to merge with the tuple-in tuple-out semantics used by Q#, but if you want to kick off a suggestion as to how to do so, I'd suggest checking with @bettinaheim and then possibly opening a language issue with that suggestion.

kuzminrobin commented 3 years ago

I think you'd run into trouble here, in that Tuple isn't a Q# type

I assume that you mean the following:

I cannot write like this:

function Formatted(fmt : String, args: Tuple)

I have to write like one of these

function Formatted(fmt : String, (Int, Double) )
function Formatted(fmt : String, (Qubit, (Int -> String), Bool) )
function Formatted(fmt : String, (Pauli) )

Thanks for that timely warning!

cgranade commented 3 years ago

I think you'd run into trouble here, in that Tuple isn't a Q# type

I assume that you mean the following:

I cannot write like this:

function Formatted(fmt : String, args: Tuple)

I have to write like one of these

function Formatted(fmt : String, (Int, Double) )
function Formatted(fmt : String, (Qubit, (Int -> String), Bool) )
function Formatted(fmt : String, (Pauli) )

That's right, (Int, Double), (Qubit, (Int -> String), Bool) and (Pauli) are all different examples of tuple types; there's no one root type of all tuples, in part because type information is not kept at runtime in Q# (that is, there's no reflection).

Thanks for that timely warning!

No worries. If you're interested, the types section of the language guide at https://docs.microsoft.com/azure/quantum/user-guide/language/typesystem/ may be helpful here.

kuzminrobin commented 3 years ago

(Work in Progress. Likely to Change)

Proposed Change

Library Only Change. Is the easiest to implement. Proposed:

namespace Microsoft.Quantum.Convert {
    function FormattedI(fmt : String, value : Int) : String {    // For `Int` formatting.
        body intrinsic; 
    }
    function FormattedD(fmt : String, value : Double) : String {    // For `Double` formatting.
        body intrinsic; 
    }
}

The QIR implementation will pass the parameters to the C++ implementation.

Usage examples

Int

let intNum = 42;
let logString = FormattedI("{x}", intNum);   // String representation in hexadecimal form.
// let logString = "2a"

// Some more examples:
// Format String         logString
// "{x}"                 "2a"
// "{#x}"                "0x2a"
// "{#b}"                "0b101010"
// "{04X}"               "002A"
// "{#06X}"              "0X002A"
// "{#015_X}"            "0X000_0000_002A"
// "{0:x} 0x{0:04X}"     "2a 0x002A"
// "Result: {#x}"        "Result: 0x2a"
// See more examples in the subsequent sections.

// Use the formatted output:
open Microsoft.Quantum.Intrinsic;
Message(logString);    

Double

// Fixed point representation for Double:
let doubleNum = 1234567.8901678;   // Or NaN() or <infinity>.
let logString = FormattedD("{.4f}", doubleNum);  // String representation in fixed-point form.
// let logString = "1234567.8902"

// Some more examples:
// Format String      logString
// "{.4f}"            "1234567.8902"  // Or `nan` or `inf`.
// "{0:.4f}"          "1234567.8902"  // Or `nan` or `inf`. Equivalent to the one above.
// "{.4F}"            "1234567.8902"  // Or `NAN` or `INF`.
// "{012.2f}"         "001234567.89"  // Or `000000000nan` (Python) or `000000000inf` (Python).
// See more examples in the subsequent sections.

// Use the formatted output:
open Microsoft.Quantum.Intrinsic;
Message(logString);

// Scientific notation for Double:
let doubleNum = 1234567.8901678;   // Or NaN() or <infinity>.
let logString = FormattedD("{.4e}", doubleNum);
// let logString = "1.2346e+06"

// Some more examples:
// Format String      logString
// "{.4e}"            "1.2346e+06"  // Or `nan` or `inf`.
// "{.4E}"            "1.2346E+06"  // Or `NAN` or `INF`.
// "{010.2e}"         "001.23e+06"  // Or `0000000nan` (Python) or `0000000inf` (Python).
// See more examples in the subsequent sections.

// Use the formatted output:
open Microsoft.Quantum.Intrinsic;
Message(logString);
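The brace-wrapped format strings in the tables above are not valid Python or C++20 format strings as written (those require `{0:x}` or `{:x}`). One possible reading, taken from the note that "{.4f}" is equivalent to "{0:.4f}", is that a bare `{spec}` refers to the single value argument. The Python sketch below implements that reading as an assumption; the helper name `formatted` and the regular expression are invented for this illustration.

```python
import re

# Matches "{spec}" fields that do NOT already start with an explicit
# "<index>:" prefix, e.g. "{x}" or "{012.2f}", but not "{0:x}".
_BARE_FIELD = re.compile(r"\{(?!\d+:)([^{}]*)\}")


def formatted(fmt: str, value) -> str:
    """Hypothetical interpreter for the brace-wrapped format strings."""
    # Rewrite "{spec}" to "{0:spec}" so Python's str.format can handle it.
    normalized = _BARE_FIELD.sub(r"{0:\1}", fmt)
    return normalized.format(value)


for fmt in ("{x}", "{#b}", "{#015_X}", "{0:x} 0x{0:04X}", "Result: {#x}"):
    print(f"{fmt!r:20} -> {formatted(fmt, 42)!r}")
print(formatted("{012.2f}", 1234567.8901678))
```

Under this reading the single function handles both Int and Double, so the `I`/`D` suffixes would matter only on the Q# side, where overloading is unavailable.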

Nested and Run-Time-Dependent Formatting

If the formatted output is directed to a resizable entity (e.g. a window), the formatting requirements can change at runtime during or after resizing. In those cases, formatting options (such as width and precision) fixed at compile time, like "{010.2e}", will not work; we need runtime values for them. The solution is to generate the format string at runtime.

// Get the format string at runtime from the interpolated string literal:

let width = 10;  // Runtime value, can change in a loop or during/upon resizing.
let precision = 2;  // The same.
let runtimeFormatString = "{0" + $"{width}.{precision}" + "e}";
// runtimeFormatString = "{010.2e}"

// Use the runtime format string:
let logStr = FormattedD(runtimeFormatString, doubleNum);  // doubleNum = 1234567.8901678, as in the examples above.
// logStr = "001.23e+06"
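In Python, where this syntax comes from, the same effect is usually achieved with nested replacement fields rather than string concatenation. A minimal sketch follows; the helper name `format_double` is hypothetical, and whether the Q# functions would support nested fields directly is an open question in this proposal.

```python
def format_double(value: float, width: int, precision: int) -> str:
    """Zero-padded scientific notation with width and precision chosen at runtime."""
    # The inner {width} and {precision} fields are substituted first,
    # producing a spec such as "010.2e" that is then applied to value.
    return f"{value:0{width}.{precision}e}"


print(format_double(1234567.8901678, 10, 2))
print(format_double(1234567.8901678, 12, 3))
```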

See

Available Formatting Options and More Examples

To do: Test the corresponding formatting in C++.

cgranade commented 3 years ago

As per https://github.com/microsoft/QuantumLibraries/pull/426, we're good to deprecate existing formatting functions immediately; will respond in more detail on the proposal itself ASAP.

cgranade commented 3 years ago

Thanks for the update, @kuzminrobin! As promised, I've gone ahead and left some more feedback below, keeping in mind that as per your comment this is still in progress. In the meantime, I think my biggest take-away is that it would be good to understand the user feedback and use cases that motivate this proposal so as to help identify which subset of features we need to be sure to capture in the proposed API.

In particular, the previous formatting functions weren't heavily used at all (in fact, I was unable to find any use at all), which suggests that string formatting is either somewhat niche at this point, or our previous solution was rejected for some reason (e.g.: users weren't aware, didn't meet user needs, etc.).

Proposed Change

Library Only Change. Is the easiest to implement.

I think that makes sense, yeah; we may separately want to consider a language suggestion at some point, but I agree that for now, targeting library functionality makes the most sense.

Proposed:

namespace Microsoft.Quantum.Convert {
    function FormattedI(fmt : String, value : Int) : String {    // For `Int` formatting.
        body intrinsic; 
    }
    function FormattedD(fmt : String, value : Double) : String {    // For `Double` formatting.
        body intrinsic; 
    }
}

One thing that may be helpful (YMMV, of course) in continuing with the proposal is to prototype or stub what API docs would look like for these. That can sometimes show where things need to be specified a bit further. For example, under what conditions (if any) do these functions fail? What will these functions do on invalid formatting strings? E.g.: what would be returned by FormattedI("{x", 42)?

My own 2¢ is that string formatting should never result in a fail; we wouldn't want for an expensive quantum program to die for want of diagnostic output. To some extent, that reflects the different role that Message plays in Q# from how printf and std::cout get used in C and C++.
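One way to picture the "never fail" behavior is a wrapper that falls back to an unformatted value when the format string is invalid. The sketch below uses Python's `str.format` as a stand-in; the name `safe_format` and the fallback text are assumptions for illustration, not part of the proposal, and Python requires a colon before the spec (`"{:x}"` rather than the proposed `"{x}"`).

```python
def safe_format(fmt: str, value: object) -> str:
    """Format value with fmt, but never raise on a bad format string."""
    try:
        return fmt.format(value)
    except (ValueError, IndexError, KeyError, AttributeError, TypeError):
        # Diagnostic output should not abort the program:
        # return the plain value and note the rejected format string.
        return f"{value} <bad format: {fmt!r}>"


print(safe_format("{:x}", 42))  # valid format string
print(safe_format("{x", 42))    # unterminated field, handled by the fallback
```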

The QIR implementation will pass the parameters to the C++ implementation.

How would you suggest implementing on the C# side so that these new functions can be used with the existing C#-based runtime (e.g.: from Python via IQ#)? Would we want to write a parser for this mini-language in the C# runtime? It doesn't look too complicated, thankfully, but it's still an implementation detail that would be good to have some rough suggestion on to avoid this feature not being accessible by our Python users.
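As a rough indication of how small such a parser could be when the host language already has a compatible formatter, here is a Python sketch that rewrites the proposed shorthand fields (`"{x}"`) into Python's form (`"{:x}"`) and delegates the rest. The names `to_python_format` and `formatted_i` are hypothetical, and the rule "an all-digit or empty field is an argument index, anything else is a spec" is an assumption about the grammar, which is not fully specified here.

```python
import re

# A brace-delimited field with no colon, not adjacent to an escaped brace.
_FIELD = re.compile(r"(?<!\{)\{([^{}:]*)\}(?!\})")


def to_python_format(fmt: str) -> str:
    """Rewrite shorthand fields like {#x} to Python's {:#x}."""
    def fix(match: re.Match) -> str:
        body = match.group(1)
        if body == "" or body.isdigit():
            return match.group(0)      # "{}" or "{0}": already valid Python
        return "{:" + body + "}"       # treat the body as a format spec
    return _FIELD.sub(fix, fmt)


def formatted_i(fmt: str, value: int) -> str:
    """Stand-in for the proposed FormattedI, backed by str.format."""
    return to_python_format(fmt).format(value)


for f in ("{x}", "{#x}", "{04X}", "{#06X}", "{0:x} 0x{0:04X}", "Result: {#x}"):
    print(f"{f!r:>20} -> {formatted_i(f, 42)!r}")
```

A C# implementation would need its own spec parser, since .NET format strings use a different syntax; the sketch only suggests that the field-splitting step itself is small.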

Usage examples

Int

let intNum = 42;
let logString = FormattedI("{x}", intNum);   // String representation in hexadecimal form.
// let logString = "2a"

// Some more examples:
// Format String         logString
// "{x}"                 "2a"
// "{#x}"                "0x2a"
// "{#b}"                "0b101010"
// "{04X}"               "002A"
// "{#06X}"              "0X002A"
// "{#015_X}"            "0X000_0000_002A"
// "{0:x} 0x{0:04X}"     "2a 0x002A"
// "Result: {#x}"        "Result: 0x2a"
// See more examples in the subsequent sections.

From this, it looks like the x and b suffixes are radix specifiers, while # specifies that the output should include the radix as a prefix? Would we ever want to support additional radixes in the future?
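For reference, this reading matches Python's format mini-language, which also has an octal presentation type `o`. Whether the Q# functions would support octal or further radixes is not decided in this thread; the sketch only shows what the Python original does.

```python
value = 42

# Presentation type selects the radix; "#" adds the radix prefix.
radix_table = {spec: format(value, spec) for spec in ("x", "#x", "b", "#b", "o", "#o")}

for spec, text in radix_table.items():
    print(f"{spec:>3} -> {text}")
```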

kuzminrobin commented 3 years ago

@cgranade, As always, thank you for sharing your incredible expertise and insight. I really appreciate and admire that.

(I realize that a reference to a video is a bad answer to a question ;-) But the answers in the video are provided by a person who invested years of his life in getting his formatting library proposal into the C++ language. The proposal is based on the existing Python implementation, which was already working and widely used (and adopted by Rust). A number of people on the C++ standards committee invested many hours in reviewing, discussing, and complementing the proposal, and a number of companies adopted the C++ implementation/library. I doubt that my own answers or research will sound more convincing than that. To minimize the flaws of my answers, I refer to minimal video fragments and explain in text what each fragment is about.

The video mentioned is a talk at a C++ conference introducing the C++ formatting library that ended up in the C++20 standard (not yet implemented in the latest Clang or GCC). It is the library that I propose to use for the implementation of Q# formatting (full credit for that definitely goes to @msoeken, who referred me to that formatting library).


it would be good to understand the user feedback and usecases that motivate this proposal

Here I am not discussing whether Q# needs a formatting library at all. But if it does, then we need to choose which one to adopt.

\ disadvantages: summary table 9:23 - 10:30, memory safety: 3:19 - 5:20, performance: 7:48 - 9:20, extensibility for user-defined types: 10:30 - 12:00.

C++ iostreams disadvantages: readability: 12:36 - 13:16, sticky formatting flags: 14:43 - 15:40, locales: 16:00 - 17:30, output translation to other human languages (hardly applicable for us right now): 13:40 - 14:35, thread-unfriendly (hardly applicable for us right now): 19:12 - 21:12.

Other libs: Boost Format performance, size, compile time: 22:45 - 23:25, Fast Format dead ends: 23:30 - 25:03.

Proposed library advantages: 25:30 - 26:22, safety (type/memory): 35:20 - 36:02, performance/size/compile-time: 42:03 - 48:00, 260+ contributors.

Why New Syntax: 31:00 - 32:20, Why This Syntax (from Python, adopted by Rust): 34:10 - 35:20.

Extensibility (user-defined formatting/types): 32:20 - 34:10, user's own formatting: 47:59 - 48:30, in progress: 48:30 - 49:20, new extension API: 49:30 - 51:05.

Unrelated to the answer, just for future reference: Memory Management: 36:02 - 37:15, Usage Examples: 26:22 - 29:50, Grammar: 29:50 - 31:00.


(My answers to your other points to follow.)

cgranade commented 3 years ago

To clarify slightly, I understand the use cases for formatting features like this in languages like Python and C++, but my point was that Q# gets used in a quite different context. To wit, I use string formatting in Python and C# quite regularly, but I was unable to find any usage of the previous Q# formatting functions.

To that extent, I think we can definitely learn from what formatting string specs are adopted in classical languages and why those specs were chosen, but the use cases here are likely to be somewhat different due to Q# programs running on quantum devices rather than at the host level.

kuzminrobin commented 3 years ago

Well, the use case that comes to my mind at the moment (while we have no fully fledged debugging facility for the Q#/IR/C++ intermixed code), is the detailed logging instead of debugging.