filipsajdak commented 1 year ago

In the current implementation of cppfront, the following code:

f2: (inout x) -> _ = {
    x *= 2;
    return x;
}

main: () -> int = {
    x := 21;
    std::cout << f2(x) << std::endl;
}

Passes cppfront:

tests/bug_inout_argument.cpp2... ok (all Cpp2, passes safety checks)

But it fails to compile on the Cpp1 side with this error:

tests/bug_inout_argument.cpp2:7:18: error: no matching function for call to 'f2'
    std::cout << f2(std::move(x)) << std::endl;
                 ^~
tests/bug_inout_argument.cpp2:1:20: note: candidate function [with x:auto = int] not viable: expects an lvalue for 1st argument
[[nodiscard]] auto f2(auto& x) -> auto{
                   ^
1 error generated.

When cppfront moves x on its last use, it breaks the requirement of f2, which takes an lvalue reference but now receives an rvalue.
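The failure can be reproduced in plain Cpp1. The sketch below is a simplified stand-in for the emitted code, not cppfront's exact output: it spells the `auto&` parameter as an explicit template, and `can_call_f2` is a hypothetical helper added only to check which argument categories the function accepts.

```cpp
#include <type_traits>
#include <utility>

// Simplified stand-in for the Cpp1 code emitted for `f2: (inout x) -> _`.
// The error message above shows `auto f2(auto& x) -> auto`; this is the
// equivalent explicit-template spelling.
template <typename T>
[[nodiscard]] constexpr auto f2(T& x) {
    x *= 2;
    return x;
}

// Hypothetical helper (not part of cppfront): true when f2 accepts an
// argument of type Arg, false when overload resolution fails.
template <typename Arg, typename = void>
struct can_call_f2 : std::false_type {};

template <typename Arg>
struct can_call_f2<Arg, std::void_t<decltype(f2(std::declval<Arg>()))>>
    : std::true_type {};

// An lvalue binds to the non-const reference parameter.
static_assert(can_call_f2<int&>::value, "lvalue argument is accepted");
// std::move(x) produces an rvalue, which cannot bind to `T&`.
static_assert(!can_call_f2<int&&>::value, "rvalue argument is rejected");
```

The second assertion corresponds to the `f2(std::move(x))` call in the error output: the argument is an rvalue, so the candidate taking a non-const lvalue reference is not viable.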

Expectations

cppfront should take into consideration the context in which the variable is used to avoid breaking the code on the cpp1 side.

hsutter commented 1 year ago

Thanks!

One way to suppress this could be to require a mutating argument to be qualified with inout, which I've thought of and @jbatez suggested in #198.

I see there's a related issue in #230 which might also be solved with an inout argument (call-site) requirement...

I'll consider #230 and #231 together...

hsutter commented 1 year ago

Thanks for picking this up again in #294.

tldr

After reconsidering the examples, I think the status quo is a feature, not a bug, in Cpp2. I think the combination of parameter passing + move from definite last use is (elegantly? certainly naturally) exposing real user code bugs that were silent in Cpp1. This is very pleasing.

That said, I agree that an argument qualifier is the right answer. But understanding why the status quo is actually a feature is important because it will:

help us name the qualifier,
show where else the qualifier should be allowed,
show why the qualifier will be needed rarely, and
show why when the qualifier is needed it adds value (and a "programmer doing something odd here" flag to focus on during code review).

Why a feature: Diagnosing an unused output, like `[[nodiscard]]`

There are two features interacting here:

(1) Declaring parameter passing intent: This states the direction of data flow (in, inout, etc.).

(2) Move from definite last use: When we know the variable won't be used again, of course it's safe to move from so it seems this should be automatic and default.

Both features let the programmer declare their intent in a way that helps expose program bugs. Specifically:

(1) An inout or out (or Cpp1 non-const &) parameter is declaring that one of the function's outputs is via that argument, just as declaring a non-void return type is declaring that one of the function's outputs is via the return value. Those are the outputs, and ignoring an output is usually bad (but of course not always, see bottom). Just as Cpp2 makes [[nodiscard]] the default for return values, what you're encountering here is that it is naturally doing the same thing for inout arguments too, treating both declared output data flows similarly.

(2) A last use argument is diagnosing that the variable will no longer be used. If the last use is to an inout or out parameter, then not looking at it afterward is just the same as calling a function with a non-void return and never looking at the returned value (which is diagnosed in Cpp2 because of the enforced [[nodiscard]]).

So we are doing the user a favor by diagnosing this, just the same as if the user were ignoring a [[nodiscard]] return value.

And that's why I think that we should consider naming the opt-out for "unused out result" and "unused return value" with the same name, if there's a good name. They are the same case. (Sure, you sometimes want an opt-out, but only in rarer cases where you're relying on other side effects being performed by the function and really don't need the value, in which case the code should say so by writing discard or something.)
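For comparison, here is a small Cpp1 sketch of the two output paths being discussed. The function names `doubled` and `double_in_place` are hypothetical. It only illustrates that Cpp1 has an attribute for the return-value path and nothing equivalent for the reference-parameter path.

```cpp
// Output delivered through the return value: with [[nodiscard]], a Cpp1
// compiler can warn when a caller ignores the result.
[[nodiscard]] constexpr int doubled(int v) { return v * 2; }

// The same output delivered through a non-const reference parameter:
// no standard Cpp1 attribute flags a caller that never reads `v` afterwards.
constexpr void double_in_place(int& v) { v *= 2; }

constexpr int use_both_outputs() {
    int a = doubled(21);   // the returned output is stored and used below
    int b = 21;
    double_in_place(b);    // the output arrives in b
    return a + b;          // both outputs are read
}

static_assert(use_both_outputs() == 84, "both output paths deliver 42");
```

Writing `doubled(21);` as a bare statement typically draws a warning, while calling `double_in_place(b);` and never touching `b` again draws none in Cpp1.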

Example 1: Just `return`

Let's consider the two versions of the code... First, consider the version of the code you used in #294:

f2: (inout x) -> _ = {
    return x * 2;
}

main: () = {
    x := 21;
    std::cout << f2(x) << std::endl;
}

Compiling this with cppfront and then a Cpp1 compiler calls out f2(x) as invalid. But why? The compilers tell us it's because x is an rvalue, and the argument must be an lvalue. This is great, because it's true. There's something fishy.

What's fishy? It's f2... it declares its parameter as inout, but never writes to it. As you know, I aim eventually (not now) to emit a diagnostic for failure to have a non-const use of an inout parameter on at least one path somewhere in the function... when I implement that, the error will be flagged even sooner within the callee. Right now, the error is being flagged at the call site, which I expect to be usually still caught early at f2 unit test time because it will be common for even f2's initial toy test cases to do this... pass a last use, which exposes the bug in f2.

What's the solution? In this case, f2 should change its parameter to be in, and then everything compiles and runs.
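A Cpp1 sketch of why the `in` version compiles, assuming an `in` parameter lowers to something that accepts rvalues. It is modeled here as a const reference, and the exact code cppfront emits may differ.

```cpp
#include <utility>

// Stand-in for `f2: (in x) -> _ = { return x * 2; }`, with the `in`
// parameter modeled as a const reference (an assumption for illustration).
template <typename T>
[[nodiscard]] constexpr auto f2(T const& x) {
    return x * 2;
}

constexpr int call_with_last_use() {
    int x = 21;
    // The last use of x is moved, as in the emitted code; an rvalue binds
    // to a const reference, so this call is valid.
    return f2(std::move(x));
}

static_assert(call_with_last_use() == 42, "read-only parameter accepts the moved argument");
```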

Example 2: Also modify parameter

But what if f2 actually modifies its parameter? That brings us to the other version of your code, above...

f2: (inout x) -> _ = {
    x *= 2;
    return x;
}

main: () = {
    x := 21;
    std::cout << f2(x) << std::endl;
}

Again we get the error flagged, but this time the problem is at the call site, f2 is okay.

Consider why f2 is okay: Even though f2 is a little odd for redundantly emitting the same output value in two different output return paths (the inout argument and the return value), that's not wrong per se, and might be useful for chaining or whatever. So f2 is fine this time, in the sense that it's doing what it declared it would do... it's writing to its input argument, and it's returning a value.

But now the call site is definitely suspicious because it's making a call that is declared to modify its argument, but then never looks at the argument again. Ignoring an output is usually bad, at least by default.
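One way to see the point: if the caller actually reads x after the call, the call is no longer the definite last use, so nothing is moved and the lvalue binds as expected. A Cpp1 sketch under that assumption, with `call_then_use` as a hypothetical caller:

```cpp
// Stand-in for the Example 2 callee: it writes to its argument and also
// returns the new value.
template <typename T>
[[nodiscard]] constexpr auto f2(T& x) {
    x *= 2;
    return x;
}

constexpr int call_then_use() {
    int x = 21;
    int r = f2(x);   // x is used again below, so it is passed as an lvalue
    return r + x;    // reading x here consumes the inout output
}

static_assert(call_then_use() == 84, "return value and modified argument are both 42");
```

When the caller really does not need the modified x, that is the case the proposed call-site opt-out is meant to cover.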

Naming the opt-out

So I view this as a great feature of Cpp2... by:

declaring the parameter passing direction, and
moving from definite last use,

we naturally and automatically diagnosed failure to use an output. I like that a lot.

Furthermore, this is just like [[nodiscard]]. In both cases, we want an opt-out. But what's the right name? Given:

inout_func: ( inout x ) = { /*...*/ }
returning_func: () -> _ = { /*...*/ }

Then consider this call site, where we want an explicit opt-out, and ideally the same word of power in both places since they're opting out of conceptually the same thing:

{
    x := 42;
    inout_func( SOMETHING x );
    (SOMETHING returning_func());
}

I want to think about the naming some more, but as a start I'm not sure inout works well for both:

//  What if "SOMETHING" were "inout"? Doesn't feel quite right...
{
    x := 42;
    inout_func( inout x );     // inout works pretty well here
    (inout returning_func());  // but not so well here
}

On the other hand, "discard" gives a nice first impression, and is symmetric with [[nodiscard]] and could connote "don't do anything special with, including don't move its guts along" as well as "discard this thing's value, I'm not going to use it from here onward":

//  What if "SOMETHING" were "discard"? I think I like it... "discard this value, I'm not going to use it after here"
{
    x := 42;
    inout_func( discard x );     // that word is a big red code review flag (good)
    (discard returning_func());  // and here with a clear meaning
}

It seems right to use the same opt-out word for unused inout/out arguments and unused return values. Getting the name right is important, though. This is something I want to sleep on further, but there's my brain dump for today. Thanks again.

filipsajdak commented 1 year ago

@hsutter Thank you for this summary - I think you synthesized it very well.

I agree that this is similar to [[nodiscard]], and that not using the inout or out argument is suspicious at a minimum.

I like the

discard returning_func();

But using it next to the function argument looks suspicious:

inout_func( discard x );

My first impression is that we want to discard the x variable - unfortunately, it is on the call side before it gets to the function. It could be misinterpreted as something will happen to x before a call to inout_func... or maybe it is just me.

Maybe we can add a passing style to clarify:

inout_func( discard inout x ); // maybe `discard out x` to emphasize that we discard output of the x

Another keyword to consider is unused:

x := 42;
inout_func( unused x );
(unused returning_func());

But still I would prefer to add a passing style:

x := 42;
inout_func( unused inout x ); // or unused out x
(unused returning_func());

SebastianTroy commented 1 year ago

Totally agree with all of this, discard does indeed feel like a good choice given the nodiscard symmetry.

I have two (three) questions:

Why not use [[discard]] instead of adding a new keyword? (I'm not against using just discard, merely curious, in fact is cpp2 avoiding the [[ xyz ]] syntax altogether?)

Would it be reasonable to decorate a parameter with multiple passing intentions, i.e. in_or_inout_func: ( in|inout x ) = { /*..

Suggesting that the parameter's side effect is not mandatory and therefore not worth warning when the user doesn't use it?


AbhinavK00 commented 1 year ago

Agreed with Herb's analysis on why this is actually great. Small things I want to point out from @SebastianTroy's comment:

Why not use [[discard]] instead of adding a new keyword?

I have the same question, an attribute seems like the right fit for this job instead of "some new keyword popping out of nowhere".

Would it be reasonable to decorate a parameter with multiple passing intentions, i.e. in_or_inout_func: ( in|inout x ) = { /*..

I have a question related to this: will we be able to mark certain parameters in the function definition as discard? This could be another way of doing the same thing. So, our example would become

f2: ([[discard]] inout x) -> _ = {
    x *= 2;
    return x;
}

main: () = {
    x := 21;
    std::cout << f2(x) << std::endl;
}

This would be a way to signify that the mutations it makes to x could be discarded, so cppfront would simply do a static cast of any rvalue before passing to such functions or not emit std::move at all.

gregmarr commented 1 year ago

I like the idea but also don't think discard makes sense as a parameter decoration, as you aren't discarding the entire thing, just the returned information, so maybe something like discard_return or discard_result. I think that would apply just as well to the return value of the function.

hsutter commented 1 year ago

Another keyword to consider is unused:
x := 42;
inout_func( unused x );
(unused returning_func());

That's a good candidate. Thoughts:

Primarily, it has a connection with "definite last 'use'". The idea we want to convey is that there would be more "use" of x later in the scope, but the annotation on this use says that the later use is explicitly being omitted... "unused" might connote that, but maybe less strongly than "discard" or "discard_result"?
Also, it has a symmetry with Cpp1 [[maybe_unused]].

Why not use [[discard]] instead of adding a new keyword? (I'm not against using just discard, merely curious, in fact is cpp2 avoiding the [[ xyz ]] syntax altogether?)

Cpp2 is currently using [[ ]] only for contracts, and I might change that too. In Cpp1 we spell some things as attributes, in part for syntax compatibility constraints which don't apply in Cpp2.

Would it be reasonable to decorate a parameter with multiple passing intentions, i.e. in_or_inout_func: ( in|inout x ) = { /*.. Suggesting that the parameter's side effect is not mandatory and therefore not worth warning when the user doesn't use it?

If we want to express that an output (parameter out data flow, or non-void return value) is discardable, we should have a consistent way to say that and again I would like to use the same word in both places.

For example:

inout_func_with_ignorable_result: ( ~SOMETHING inout x ) = { /*...*/ }
returning_func_with_ignorable_result: () -> ~SOMETHING _ = { /*...*/ }

I tag this as ~SOMETHING because it probably wants to be the inverse of the above SOMETHING.

Putting it together:

inout_func: ( inout x ) = { /*...*/ }
returning_func: () -> _ = { /*...*/ } 
inout_func_with_ignorable_result: ( ~SOMETHING inout x ) = { /*...*/ }
returning_func_with_ignorable_result: () -> ~SOMETHING _ = { /*...*/ }

{  // call site
    x := 42;
    inout_func( SOMETHING x );
    (SOMETHING returning_func());
}

Trying out @gregmarr's discard_result suggestion:

inout_func: ( inout x ) = { /*...*/ }
returning_func: () -> _ = { /*...*/ } 
inout_func_with_ignorable_result: ( discardable_result inout x ) = { /*...*/ }
returning_func_with_ignorable_result: () -> discardable_result _ = { /*...*/ }

{  // call site
    x := 42;
    inout_func( discard_result x );
    (discard_result returning_func());
}

That looks fairly decent at first blush. Clear, and a little verbose which is a good thing for an explicitly lossy escape hatch that we want to stand out. (Syntax colorizer writers, feel free to make it red... :) )

Trying out @filipsajdak's unused suggestion:

inout_func: ( inout x ) = { /*...*/ }
returning_func: () -> _ = { /*...*/ } 
inout_func_with_ignorable_result: ( maybe_unused inout x ) = { /*...*/ }
returning_func_with_ignorable_result: () -> maybe_unused _ = { /*...*/ }

{  // call site
    x := 42;
    inout_func( unused x );
    (unused returning_func());
}

This looks nice on the call site, but I worry that on the parameter it could imply that the name is not used in the callee body, which is what Cpp1 [[maybe_unused]] does.

Trying out a merger of the two, even more verbose on the declarations but again this is a case where verbosity can be a plus:

inout_func: ( inout x ) = { /*...*/ }
returning_func: () -> _ = { /*...*/ } 
inout_func_with_ignorable_result: ( maybe_unused_result inout x ) = { /*...*/ }
returning_func_with_ignorable_result: () -> maybe_unused_result _ = { /*...*/ }

{  // call site
    x := 42;
    inout_func( unused_result x );
    (unused_result returning_func());
}

Will think some more...

filipsajdak commented 1 year ago

@hsutter I like the way you make a synthesis of the proposed ideas.

Looking at the last one:

inout_func: ( inout x ) = { /*...*/ }
returning_func: () -> _ = { /*...*/ } 
inout_func_with_ignorable_result: ( maybe_unused_result inout x ) = { /*...*/ }
returning_func_with_ignorable_result: () -> maybe_unused_result _ = { /*...*/ }

{  // call site
    x := 42;
    inout_func( unused_result x );
    (unused_result returning_func());
}

How will it interact with move-of-last-use?

    x := 42;
    inout_func( unused_result x ); // will it just suppress the move?

And when we would define a function with maybe_unused_result inout:

    x := 42;
    inout_func_with_ignorable_result( x ); // will it just suppress the move?

Will it change the function's signature or add unused_result on the call side by default? Is this good or bad?

I like the focus on the intention and would like to know if we shall support defining functions in that way. I feel comfortable with -> maybe_unused_result _ on the return side, but having that on the inout argument feels like trying to fix some wrong design decision. Is there a use case where we use such an approach in the current cpp1 code?

But now the call site is definitely suspicious because it's making a call that is declared to modify its argument but then never looks at the argument again. Ignoring an output is usually bad, at least by default.

I like the above way of thinking, and for sure, I need to fix some cpp2 code just because cppfront complains about ignoring the return value from a function. Please note that I use the term FIX as, after second thought, my code was just somehow broken. I don't know if providing an easy way to opt out of this rule on the definition side is a good thing.

I like the idea of being explicit when something odd is going on. Ignoring output from a function is an odd thing that you might want to do, which is why you should have the possibility to add unused_result on the call side. That will focus the attention of the code reader.

Having the same thing on the definition side and not requiring anything on the call side will make things (from that perspective) worse, as when you read code, you don't check function definitions all the time - that might mislead the reader.

JohelEGP commented 1 year ago

Would it be reasonable to decorate a parameter with multiple passing intentions, i.e. in_or_inout_func: ( in|inout x ) = { /*.. Suggesting that the parameter's side effect is not mandatory and therefore not worth warning when the user doesn't use it?

If we want to express that an output (parameter out data flow, or non-void return value) is discardable, we should have a consistent way to say that and again I would like to use the same word in both places.

inout_func_with_ignorable_result: ( maybe_unused inout x ) = { /*...*/ }

This looks nice on the call site, but I worry that on the parameter it could imply that the name is not used in the callee body, which is what Cpp1 [[maybe_unused]] does.

f: (in out? x) = { /*.. could be an alternative spelling. Although that fails to meet this:

If we want to express that an output (parameter out data flow, or non-void return value) is discardable, we should have a consistent way to say that and again I would like to use the same word in both places.

Alternatively, consider using the most appropriate spelling for a given context. This might be useful if consistency isn't convincing enough, and to help find a middle ground.

inout_func: ( inout x ) = { /*...*/ }
returning_func: () -> _ = { /*...*/ } 
inout_func_with_ignorable_result: ( in out? x ) = { /*...*/ }
inout_func_with_ignorable_result: ( in maybe_out x ) = { /*...*/ }
returning_func_with_ignorable_result: () -> out? _ = { /*...*/ }
returning_func_with_ignorable_result: () -> maybe_unused _ = { /*...*/ }

{  // call site
    x := 42;
    inout_func( not out x );
    inout_func( in x ); // "Force the `in`, ignore the `out`".
    (unused returning_func());
    (void returning_func());
}

JohelEGP commented 1 year ago

When I thought of the alternative above, it occurred to me that

inout_func_with_ignorable_result: ( maybe_unused_result inout x ) = { /*...*/ }

is to

inout_func_with_ignorable_result: ( in maybe_out x ) = { /*...*/ }

what if (not irreversible) is to if (reversible). We want to say the latter. But there's no direct way to say it. So we have to add to what was said to make it what is actually wanted.

hsutter commented 1 year ago

I think out? or maybe_out would imply more that the callee body might or might not produce an output value. To some extent inout already accounts for that side of things, with the intended semantics of "write to this on at least one code path."

I think what we're looking at here is the complement of that -- not whether the callee will change the argument's value to emit a new output value, but whether the caller should view the output as important vs. can safely ignore it.

Trying out "ignore"...

inout_func: ( inout x ) = { /*...*/ }
returning_func: () -> _ = { /*...*/ } 
inout_func_with_ignorable_result: ( ignorable_result inout x ) = { /*...*/ }
returning_func_with_ignorable_result: () -> ignorable_result _ = { /*...*/ }

{  // call site
    x := 42;
    inout_func( ignore_result x );
    (ignore_result returning_func());
}

Or with "output", and using a "can" prefix to avoid dealing with English verb-to-adjective conventions (e.g., wherever possible I'd like to avoid non-English speakers having to learn conventions like "ignore" -> "ignorable" to program in Cpp2)...

inout_func: ( inout x ) = { /*...*/ }
returning_func: () -> _ = { /*...*/ } 
inout_func_with_ignorable_result: ( can_ignore_output inout x ) = { /*...*/ }
returning_func_with_ignorable_result: () -> can_ignore_output _ = { /*...*/ }

{  // call site
    x := 42;
    inout_func( ignore_output x );
    (ignore_output returning_func());
}

AbhinavK00 commented 1 year ago

Is there a use case where we use such an approach in the current cpp1 code?

I have the same question: is there even a use case for this? We could hold off on implementing this feature for now and add it later if actual use cases are encountered. Even if we do encounter a use case, I would argue that annotations are only needed at the call site and not in function definitions.

gregmarr commented 1 year ago

To make sure I understand, currently all function returns are converted to Cpp1 as [[nodiscard]], and the can_ignore_output decoration is to suppress that?

Is the intent on the call site that the SOMETHING has to be used like this: (SOMETHING returning_func()) as opposed to just SOMETHING returning_func()? Is there a parsing issue that requires the parens, or is it intended as clarification for the user?

hsutter commented 1 year ago

I feel comfortable with -> maybe_unused_result _ on the return side, but having that on the inout argument feels like trying to fix some wrong design decision. Is there a use case where we use such an approach in the current cpp1 code?

It's the same use/bug case. A caller ignoring a return value output is a well known source of a family of security vulnerabilities: CWE-252 is a general category, and then there are more specific categories under it. It's the same bug if the caller ignores an argument output, if the function happens to choose to produce an output via a modified argument instead of (or in addition to) the return value.

For example:

Allocation functions: malloc returns a pointer that should be checked before use, whereas COM object allocation returns the allocated pointer via an Object** parameter which is today's spelling for a Cpp2 out unique_ptr<Object> parameter.
std::error_code-using functions: Many functions return an error_code by value. Others return them via an inout or out parameter, such as a lot of filesystem functions like bool is_character_file( const std::filesystem::path& p, std::error_code& ec ) noexcept;. A returned error_code should be checked regardless of which way the function happened to return it.

is_character_file is an example of the above inout_func whose "out" parameter result should not be ignored. But whereas we're getting better at diagnosing failure to look at the return value because of linter tools and [[nodiscard]], we're not yet as good at diagnosing failure to look at the output via "out" parameter.

Today we have a patchwork of narrow solutions:

[[nodiscard]] for return values (but it can't be made the default in Cpp1, so we're adding it on the vast majority of std:: value-returning functions because it should be the default)
std::ignore for a subset of cases
proposals like P1881 proposing [[discardable]] (if we could make [[nodiscard]] the default)
(void) as a de facto convention for spelling "ignore this value"
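Two of the Cpp1 spellings from this list, side by side. `status` and `discard_both_ways` are hypothetical names. Note that neither spelling applies to an output delivered through a reference parameter.

```cpp
#include <tuple>  // std::ignore

// Hypothetical function whose result should normally be checked.
[[nodiscard]] constexpr int status() { return 7; }

inline void discard_both_ways() {
    // De facto convention: a cast to void marks the value as deliberately ignored.
    (void)status();

    // std::ignore accepts assignment from any value and does nothing with it.
    std::ignore = status();
}
```

Both lines silence the usual discarded-result warning, but they are two different spellings for the same intent, which is the inconsistency a single opt-out word would remove.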

Cpp2 already has the right consistent automatic defaults so that we never need to write anything for the majority of cases: [[nodiscard]] is already the automatic default for function return values, and detecting failure to use the result of an inout or out parameter is already the automatic default (that's what spawned this thread). So most of the time we don't need to write anything.

Now we're discussing the right consistent opt-out, aiming for a single consistent answer to avoid piecemeal patches like a [[discardable]] here and a std::ignore there.

To make sure I understand, currently all function returns are converted to Cpp1 as [[nodiscard]], and the can_ignore_output decoration is to suppress that?

Yes.

Is the intent on the call site that the SOMETHING has to be used like this: (SOMETHING returning_func()) as opposed to just SOMETHING returning_func()? Is there a parsing issue that requires the parens, or is it intended as clarification for the user?

Those parens are currently required because I happen to only allow argument modifiers in expression lists. Having to write ( ) around them hasn't bothered me enough yet to parse them also as prefix operators, but I could do that and then the parens would not be required around single expressions.

gregmarr commented 1 year ago

Those parens are currently required because I happen to only allow argument modifiers in expression lists. Having to write ( ) around them hasn't bothered me enough yet to parse them also as prefix operators, but I could do that and then the parens would not be required around single expressions.

Sounds good.

AbhinavK00 commented 1 year ago

Val has a feature to discard return values of functions by assigning them to a placeholder underscore like this:


_ = returning_func();

This effectively discards the return value, but I can't think of a way to extend it to inout arguments.

gregmarr commented 1 year ago

Go also does that, and I thought of mentioning that, but it also has the same issue of not being extendable to inout. I think we discussed that for returns somewhere at some point.

JohelEGP commented 1 year ago

It's the same use/bug case.

I think what https://github.com/hsutter/cppfront/issues/231#issuecomment-1485910734 asked, which I upvoted and https://github.com/hsutter/cppfront/issues/231#issuecomment-1486202086 agreed with, was the opposite: whether there's value in giving the callee the power to declare that an out parameter is ignorable, and the view that there's value in the caller always having to opt out. Your answer suggests that in your examples an out argument shouldn't be ignored by default. There does not seem to be an example of an out parameter that the caller doesn't inspect and shouldn't have to opt into ignoring.

Granted, there's still value in the discussion, to determine a consistent opt-out for parameters and arguments. In case it's ever needed.

hsutter commented 1 year ago

Ah, got it -- I see the question was just about the parameter being able to declare its output is ignorable, and whether there are use cases for such parameters. Thanks.

I suspect the pattern of the answer will be the same: Declaring 'this output is ignorable' is uncommon for return values, but in the cases where you want to declare an ignorable return value you would also want to declare an ignorable output value if the function author chose that as the path to deliver the output. But I don't have a concrete example in hand, and having one would be helpful.

gregmarr commented 1 year ago

I've seen many APIs with Foo ** parameters where if the argument is null, then it's ignored and not populated, and otherwise, it sets the Foo* to an output value. There are many of those in the Win32 API. That would to me correspond to an ignorable out parameter.

https://learn.microsoft.com/en-us/windows/win32/api/winreg/nf-winreg-regqueryvalueexw

LSTATUS RegQueryValueExW(
  [in]                HKEY    hKey,
  [in, optional]      LPCWSTR lpValueName,
                      LPDWORD lpReserved,
  [out, optional]     LPDWORD lpType,
  [out, optional]     LPBYTE  lpData,
  [in, out, optional] LPDWORD lpcbData
);

JohelEGP commented 1 year ago

I wonder if it's possible to come up with an example of an ignorable inout/out parameter using pointers.

An inout maps to a reference. Supposing inout x: Foo** works, is x* = &foo a write to x? I'd expect the answer to be no.

Foo** parameters and the like are the target of C++23 "Smart pointer adaptors".

AbhinavK00 commented 1 year ago

I've seen many APIs with Foo ** parameters where if the argument is null, then it's ignored and not populated, and otherwise, it sets the Foo* to an output value. There are many of those in the Win32 API. That would to me correspond to an ignorable out parameter.

To me, that sounds like a normal inout argument; keep in mind that the function only has to write to the reference in at least one control path.

The only use case I could think is of a function which produces output via both parameters and return value but you just call it for one of those output (for whatever reason) and therefore you'd have to ignore the other one. For example:


func : (inout x : std::string ) -> std::string = {
  x = "done";
  return x;
}

main : () ={
  a : std::string = "test";
  std::cout << func(a);
}

Here, you're calling func just for its return value and passing a to it just because you happened to have a variable you won't use again. The same can be said the other way around: ignoring the return value just because you wanted the mutation of your passed argument.

In both cases, it's the caller that decides whether to use the outputs, so I'd say there's no need at all for the function to say that its output can be ignored; it should be up to the caller only.

gregmarr commented 1 year ago

I've seen many APIs with Foo ** parameters where if the argument is null, then it's ignored and not populated, and otherwise, it sets the Foo* to an output value. There are many of those in the Win32 API. That would to me correspond to an ignorable out parameter.

To me, that sounds like a normal inout argument; keep in mind that the function only has to write to the reference in at least one control path.

In a normal out or inout, you must provide a valid variable. For an optional parameter, it's allowed to be null. This is more complicated, but it's an example of a large set of APIs. I don't know if that's something where we should say "you can't write this in Cpp2 because it's not safe" or if it's something that we should figure out how to support.

AbhinavK00 commented 1 year ago

Oh ok, I'm not familiar with that but that still sounds like a problem related to null handling.

filipsajdak commented 1 year ago

I will show some code demonstrating the issue we are discussing here.

t2 : type = {
    x : *int;

    operator=:(out this, p : *int) = {
        x = p;
    }

    ptr1: (inout this, p : *int) -> *int = std::exchange(x, p);
    ptr2: (inout this, inout p : *int) = std::swap(x, p);
}

main :() = {
    n := 42;
    a : t2 = (n&);
    m := 24;
    a.ptr1(m&); // return value can be ignored; it might be unused

    pn := n&;
    a.ptr2(pn); // out of pn can be ignored; it might be unused;
}

The t2::ptr1() function is similar to the:

std::basic_streambuf<CharT, Traits>* std::basic_ios<CharT,Traits>::rdbuf( std::basic_streambuf<CharT, Traits>* sb );

(you can check it here: https://en.cppreference.com/w/cpp/io/basic_ios/rdbuf)
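To make the rdbuf() comparison concrete, here is a small Cpp1 sketch that uses the returned pointer in one call and ignores it in the other. The helper name capture_cout is made up for illustration; rdbuf() itself is the standard member referenced above.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Runs `action` with std::cout redirected into a string and returns what was written.
// Sketch only: it does not restore the buffer if `action` throws.
template <typename Action>
std::string capture_cout(Action action) {
    std::ostringstream captured;
    // rdbuf(new_buffer) installs new_buffer and returns the previous one, like t2::ptr1.
    std::streambuf* previous = std::cout.rdbuf(captured.rdbuf());
    action();
    // The pointer returned here is captured's own buffer, which is of no interest, so it is ignored.
    std::cout.rdbuf(previous);
    return captured.str();
}
```

The first call needs the old value in order to restore it later; the second call is the "return value can be ignored" case.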

I did not find an example for the t2::ptr2() case, but I will look more. I feel a little awkward about it, but it is correct (please note that all pointers here are non-owning, so we can ignore the return values safely).

Scenario 1 - `ignore_output` on the call side.

In this case, we can write:

main :() = {
    n := 42;
    a : t2 = (n&);
    m := 24;
    (ignore_output a.ptr1(m&));

    pn := n&;
    a.ptr2(ignore_output pn);
}

I like that - we explicitly inform the code reader that this function returns something that is ignored - it might change in further code development, and it is good to see it in the place where it happens. What is good is that the code expresses what I previously put into comments - that is perfect!

Scenario 2 - `can_ignore_output` on the function definition side

As we know that we can safely ignore the return value, we can change the class to:

t2 : type = {
    x : *int;

    operator=:(out this, p : *int) = {
        x = p;
    }

    ptr1: (inout this, p : *int) -> can_ignore_output *int = std::exchange(x, p);
    ptr2: (inout this, can_ignore_output inout p : *int) = std::swap(x, p);
}

And then the main() will look like the following:

main :() = {
    n := 42;
    a : t2 = (n&);
    m := 24;
    a.ptr1(m&);

    pn := n&;
    a.ptr2(pn);
}

It is correct and safe but misleads the code reader. Taking cpp2 defaults into account, I would assume that a.ptr1() does not return anything since, by default, all functions are [[nodiscard]]. Also, I would assume that a.ptr2() takes its argument with the in passing style, since the call is accepted and this is a definite last use of pn, so it will be moved.

Summary

Both scenarios are correct, but Scenario 1 is more explicit and Scenario 2 is more misleading to the reader.

Edit: added inout this, pointed out by @AbhinavK00

AbhinavK00 commented 1 year ago

I've been saying that annotations will only be needed on the call side, and @filipsajdak clearly shows that in his example. I think the correct way forward would be to implement scenario 1 as shown in the example.

Btw, shouldn't the member functions t2::ptr1 and t2::ptr2 have this parameter? And can we omit return too? (like in t2::ptr1)

filipsajdak commented 1 year ago

@AbhinavK00 yes, you are correct: it needs this, more specifically inout this. I will correct it.

filipsajdak commented 1 year ago

@AbhinavK00, the return is added correctly by cppfront. The generated code looks like the following:

#line 1 "/Users/filipsajdak/dev/execspec/external/tests/inout_ptr_example.cpp2"
class t2   {
    private: int* x; 

    public: explicit t2(cpp2::in<int*> p)
        : x{ p }
#line 4 "/Users/filipsajdak/dev/execspec/external/tests/inout_ptr_example.cpp2"
                                   {

    }
#line 4 "/Users/filipsajdak/dev/execspec/external/tests/inout_ptr_example.cpp2"
    public: auto operator=(cpp2::in<int*> p) -> t2& {
        x = p;
        return *this;
#line 6 "/Users/filipsajdak/dev/execspec/external/tests/inout_ptr_example.cpp2"
    }

    public: [[nodiscard]] auto ptr1(cpp2::in<int*> p) -> int* { return std::exchange(x, p); }
    public: auto ptr2(int*& p) -> void { std::swap(x, p); }
};

auto main() -> int{
    auto n {42}; 
    t2 a {&n}; 
    auto m {24}; 
    CPP2_UFCS(ptr1, a, &m);// return value can be ignored; it might be unused

    auto pn {&n}; 
    CPP2_UFCS(ptr2, std::move(a), std::move(pn));// out of pn can be ignored; it might be unused;
}
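The generated call above passes std::move(pn) to a parameter of type int*&, which is the same lvalue-versus-rvalue mismatch reported at the top of this thread. A small sketch of that rule, assuming C++17 and using a hypothetical Ptr2Like type that only mirrors the shape of t2::ptr2:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in with the same signature shape as the generated t2::ptr2.
struct Ptr2Like {
    int* x = nullptr;
    void ptr2(int*& p) { std::swap(x, p); }   // inout lowers to a non-const lvalue reference
};

// A call with an lvalue argument is well-formed...
static_assert(std::is_invocable_v<decltype(&Ptr2Like::ptr2), Ptr2Like&, int*&>);
// ...but a call with an rvalue, which is what std::move(pn) produces, is not.
static_assert(!std::is_invocable_v<decltype(&Ptr2Like::ptr2), Ptr2Like&, int*&&>);
```

So whenever the definite last use of a variable is as an inout argument, adding the move makes the Cpp1 call ill-formed.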

gregmarr commented 1 year ago

Oh ok, I'm not familiar with that but that still sounds like a problem related to null handling.

LSTATUS RegQueryValueExW(
  [in]                HKEY    hKey,
  [in, optional]      LPCWSTR lpValueName,
                      LPDWORD lpReserved,
  [out, optional]     LPDWORD lpType,
  [out, optional]     LPBYTE  lpData,
  [in, out, optional] LPDWORD lpcbData
);

I would say that the canonical way to write this in Cpp2 would be something like this (removing the unused reserved parameter):

RegQueryValueExW: (
  hKey: int,
  valueName: optional<string_view>,
  out type: optional<DWORD>,
  out data: optional<BYTE>,
  inout cbData: optional<vector<DWORD>>
) -> LSTATUS = 
{
...
}

This would require that you create variables to accept the type, data, and cbData values, even if you don't want them.

What if instead it were written like this, allowing you to pass null for type, data, and cbData? That would be closer to the original API. (Using OPTIONAL here as a keyword placeholder.)

RegQueryValueExW: (
  hKey: int,
  OPTIONAL valueName: string_view,
  OPTIONAL out type: DWORD,
  OPTIONAL out data: BYTE,
  OPTIONAL inout cbData: vector<DWORD>
) -> LSTATUS = 
{
...
}

To be safe, the function would be required to check the out and inout parameters for null before using them.

JohelEGP commented 1 year ago

I did not find an example for the t2::ptr2() case, but I will look more.

The std::swap that t2::ptr2() is implemented in terms of can itself be an example. One way to implement the no-throw guarantee is

vector<T> new_v;
try { /*fill new_v*/; } catch(...) /*...*/
this->swap(new_v);
// `new_v` unused, even though the signature could be, in C++2, `friend swap(inout, inout)`.
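A fuller Cpp1 version of that idiom might look like the sketch below. The function assign_squares is hypothetical and only illustrates "build aside, then swap"; it is not taken from any library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: replaces `target` with the first `count` squares,
// leaving `target` untouched if building the new contents throws.
void assign_squares(std::vector<int>& target, std::size_t count) {
    std::vector<int> fresh;
    fresh.reserve(count);                        // may throw; target is still unchanged
    for (std::size_t i = 0; i < count; ++i) {
        fresh.push_back(static_cast<int>(i * i));
    }
    target.swap(fresh);                          // non-throwing commit
    // `fresh` now holds the old contents and is never read again:
    // the "out" half of this inout argument is ignored on purpose.
}
```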

That said, I don't think these are examples of where a function author might want to make the parameters' out-part ignorable. Whether any argument's out-part is ignored better remains explicit on the call site.

filipsajdak commented 1 year ago

Thanks to this discussion, and to having written more cpp2 code, I have noticed that UFCS disables [[nodiscard]] -> #305

hsutter commented 1 year ago

Another example this brings to mind, from my SQL and ODBC days, is SUCCESS_WITH_INFO.

This is a return value that means "the operation succeeded, and by the way there's additional information available in case you want to look." Sometimes it meant this was a warning rather than an error so you should still look, but sometimes it was really just advisory extra information that happened to cost nothing extra to compute so it was made available to the caller.

If you got that SUCCESS_WITH_INFO return value, you would additionally look elsewhere, at an inout/out parameter or call a second function, to get the additional information. In the use cases I was familiar with, generally you didn't need to look, and it was considered optional to look.

If that API were:

ODBC32::RETCODE my_function( /*...*/, AdditionalInfo& info);

and you called it like this:

if (my_function( /*...*/, info) != ODBC32::ERROR) {
    // do stuff, but don't necessarily look at info
}

that would generally be fine (IIRC) even though we don't look at the extra info.

So that info parameter would be a can_ignore_output inout.
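As a self-contained illustration of that pattern, here is a Cpp1 sketch. RetCode, AdditionalInfo, run_operation, and run_and_ignore_info are hypothetical stand-ins invented for this example; they are not the real ODBC types or functions.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins; these are not the real ODBC declarations.
enum class RetCode { success, success_with_info, error };
struct AdditionalInfo { std::string message; };

// Hypothetical operation that always succeeds and reports advisory info for negative input.
RetCode run_operation(int input, AdditionalInfo& info) {
    if (input < 0) {
        info.message = "input was negative";
        return RetCode::success_with_info;
    }
    return RetCode::success;
}

// A caller that checks only for failure and never reads `info`.
bool run_and_ignore_info(int input) {
    AdditionalInfo info;
    return run_operation(input, info) != RetCode::error;
}
```

With today's Cpp2 defaults, the equivalent of run_and_ignore_info would be flagged, because info is never used after being passed as inout.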

But it has been 25 years since I've done ODBC programming... and I tend to agree with @filipsajdak and others that a simpler option to try first could be to allow only the call-side opt-out for now, and then see whether there's demand for a discardable return value or discardable out parameter result. (Well, we know there are cases, certainly for discardable return values, but the question is whether it's necessary to declare them as such, or whether they're infrequent enough that the level of noisiness at the call sites is acceptable.)

Still thinking it through... thanks for all the comments.

hsutter commented 1 year ago

BTW the main difference between a return value and an inout/out parameter is that the former is "callee-allocated out" (the callee creates the object and passes it back) and the latter is "caller-allocated out" (the caller provides an existing object that has storage, possibly initialized, to write to). But both are equally "out". Just sharing that viewpoint because maybe it will help explain why I treat them as equivalent for "out" data-flow purposes.
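In Cpp1 terms, the two flavors look like this. The names make_label and fill_label are made up for the sketch.

```cpp
#include <cassert>
#include <string>

// Callee-allocated out: the function creates the object and hands it back.
[[nodiscard]] std::string make_label(int id) {
    return "item-" + std::to_string(id);
}

// Caller-allocated out: the caller supplies the storage and the function writes into it.
void fill_label(int id, std::string& label) {
    label = "item-" + std::to_string(id);
}
```

Both deliver the same data to the caller; only the first has a standard attribute for flagging an unused result.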

JohelEGP commented 1 year ago

Another example this brings to mind, from my SQL and ODBC days, is SUCCESS_WITH_INFO.

This is a return value that means "the operation succeeded, and by the way there's additional information available in case you want to look." Sometimes it meant this was a warning rather than an error so you should still look, but sometimes it was really just advisory extra information that happened to cost nothing extra to compute so it was made available to the caller.

That seems like the std::ranges result types, but split between the result and an inout parameter.

With Cpp2, a better API for SUCCESS_WITH_INFO would be return types to aggregate the incidental computations, and non-discardable inout/out parameters for must-inspect results. Aggregating computations in the result relieves callers from having to allocate out.

Arguably, an API that forces the inspection of must-inspect results, without the possibility of passing discarded out arguments, is also possible. A bit unconventional, but by requiring a callable with parameters for the must-inspect results, the caller is also relieved from having to allocate out those for arguments. EDIT: This probably doesn't mix well with coroutines or anything that becomes harder due to the introduction of a different function body for the continuation. In those cases, "throwing values" can do better, if it's a feasible feature for a particular case.

I just had a bit of time on my hands. This might not be relevant.

filipsajdak commented 1 year ago

After sleeping on it, I understand what worries me the most.

In cpp1, there is no way to differentiate the out and inout passing styles. That means we mix things up when bringing cpp1 examples into the picture. I have a feeling that most of the examples were ones that in cpp2 match the out passing style more than inout.

Examples of `out` & `inout` passing styles that might use ignore

I was thinking about good examples of these cases. Those are excellent examples we could have with cpp1 streams:

fun1: (  out o : std::ostream ) = { /*...*/} // function just write to the stream
fun2: (   in o : std::istream ) = { /*...*/} // function just read from the stream
fun3: (inout o : std::iostream) = { /*...*/} // function read and write to the stream

The above functions can be called with file streams, in which case the output result is observable on the filesystem. In some cases, out or inout results can be ignored, but I doubt that marking the function so that its results can be ignored by default would be a good solution. You should still check the stream's state at the end.

The only scenario that I imagine you might safely mark an out or inout argument with can_ignore_output (in the case of streams) is when your function uses some error handling that will cover all error cases: e.g., exceptions.

A side note

Currently, the above code will only work as some read & write methods are non-const as they might change the object's state. operator<< or operator>> accept stream by non-const reference - as there is no possibility to express out, in, or inout that was the only option.

What I would like to express in the code is:

When I use out, the function can only write to the argument,
When I use in, the function can only read from the argument,
When I use inout, the function can read and write to the argument,

Unfortunately, there are methods like good() that a function should be able to call regardless of passing style (the issue is with out, assuming that it means no reads from the variable). Maybe there should be a way to express that a specific method can be called when an object is passed with an in (default), out, or inout passing style, similar to how we mark methods const or mutable in cpp1.

Another example of `inout` args use cases

I was looking for examples where I am passing something to function, and I read from it and expect results in the same object. There are some cases:

modifying a string object,
using context object,
streams,
databases,
shared buffers,
etc.

I am struggling with convincing myself if and when it is safe to use can_ignore_output on functions that take them as the argument.

Example of `can_ignore_output out arg` in cpp1

I was looking for an example of an out argument, and I found one that is at the same time can_ignore_output out:

int QString::toInt(bool *ok = nullptr, int base = 10) const

https://doc.qt.io/qt-6/qstring.html#toInt

ok is an out argument that is optional - you opt out by passing nullptr. This is an excellent example of how we deal with the discussed topic in cpp1.
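The same shape can be sketched in standard C++17 without Qt. The free function to_int below is a hypothetical analogue of that signature built on std::from_chars; it is not Qt's implementation and only mimics the "returns 0 on failure, reports status through an optional pointer" convention.

```cpp
#include <cassert>
#include <charconv>
#include <string_view>
#include <system_error>

// Hypothetical analogue of the QString::toInt signature; not Qt's implementation.
// Returns 0 on failure and writes the status through `ok` only if the caller asked for it.
int to_int(std::string_view text, bool* ok = nullptr, int base = 10) {
    int value = 0;
    const char* first = text.data();
    const char* last = first + text.size();
    const auto result = std::from_chars(first, last, value, base);
    // Success means a number was parsed and the whole input was consumed.
    const bool success = (result.ec == std::errc{}) && (result.ptr == last);
    if (ok != nullptr) {
        *ok = success;   // the caller opted in to the status
    }
    return success ? value : 0;
}
```

Calling to_int("42") ignores the status entirely, while to_int("4x2", &ok) opts in to it; the choice is made at the call site.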

TLDR

There are use cases where you might want to mark an out or inout argument with can_ignore_output (e.g., when errors are reported through exceptions) - std::fstream and std::ofstream are good examples of types that might need that.

I would look for a stricter way of declaring arguments as in, out, and inout, with the possibility of marking methods in my UDT as callable in a specific passing-style context.

gregmarr commented 1 year ago

fun1: (  out o : std::ostream ) = { /*...*/} // function just write to the stream
fun2: (   in o : std::istream ) = { /*...*/} // function just read from the stream
fun3: (inout o : std::iostream) = { /*...*/} // function read and write to the stream

~I think this is conflating two totally orthogonal ideas, the ability to read from or write to the stream, which is identified by the type:~

fun1: (o : std::ostream ) = { /*...*/} // function just write to the stream
fun2: (o : std::istream ) = { /*...*/} // function just read from the stream
fun3: (o : std::iostream) = { /*...*/} // function read and write to the stream

~and the ability to create the objects themselves:~

fun1: (  out o : std::ostream) = { /*...*/} // function creates the ostream
fun2: (   in o : std::ostream) = { /*...*/} // function uses the pre-created ostream
fun3: (inout o : std::ostream) = { /*...*/} // function can use the pre-created ostream if it exists and create it and pass it back to the caller if it doesn't

I'm not sure I'm actually interpreting these modifiers properly, need to go back to the parameter passing papers and look at these again.

Update: I think I've been in C# land too much recently, and was seeing these as pointers rather than objects. I watched the 2020 presentation and I think I have my head back on straight again.

JohelEGP commented 1 year ago

I'm not sure I'm actually interpreting these modifiers properly, need to go back to the parameter passing papers and look at these again.

I agree! I'm in dire need of documentation.

Example of can_ignore_output out arg in cpp1

I was looking for an example of an out argument, and I found one that is at the same time can_ignore_output out:
int QString::toInt(bool *ok = nullptr, int base = 10) const
https://doc.qt.io/qt-6/qstring.html#toInt

ok is an out argument that is optional - you opt out by passing nullptr. This is an excellent example of how we deal with the discussed topic in cpp1.

out parameters can also accept uninitialized arguments. All the function wants to do is read ok, and if it's not null, write to ok*. So ok should be in or inout. I think in would suffice; I guess writing to ok* doesn't require ok to be inout.

filipsajdak commented 1 year ago

@JohelEGP I presented the toInt() method as an example of how we deal with ignorable output arguments in cpp1.

I believe in cpp2 we could write it as:

QString: type = {

  toInt: (this, can_ignore_output out ok: bool) -> int = {
    //...
  }

}

Please note that ok is not a cpp2 pointer (it will become one on the cpp1 side after cppfront generation).

JohelEGP commented 1 year ago

Hmm... What about the part in C++1 that ok defaults to nullptr? That wasn't translated in the C++2 rewrite. In that C++2 interface, an argument is required for ok, whereas it can be omitted in the C++1 version.

filipsajdak commented 1 year ago

It depends on how you generate the code for can_ignore_output out

JohelEGP commented 1 year ago

I'd expect can_ignore_output to be orthogonal to defaulting parameters. I think the translated interface would look more like toInt: (this, can_ignore_output in ok: *bool = nullptr) -> int, if that's how defaulting parameters work. As you see, ok isn't out or inout. The output is through ok* if ok is not null. And parameter styles and can_ignore_output apply to the (type of the) parameter.

This is similar to issues about pointers being shallow const.

hsutter commented 1 year ago

Thanks Filip,

I have a feeling that most of the examples were examples that in cpp2 match more out passing style than inout.

And those should rarely want to allow ignoring the result, because out can be a constructor... and ignoring the result would mean asking for construction and then not using the object, and relying only on the side effects.

Except for an RAII object, where the intended use is to construct the object as an automatic storage duration object and typically never interact with it, since its purpose is to run its destructor at the end of scope. So this would be totally fine:

{
    // let's say this guard is declared uninitialized because we might need to
    // initialize it later...
    guard: my_raii;
    // ... say because we have to decide which alternate constructor to use...
    if something() {
        guard = create_guard( options, here );
    }
    else {
        initialization_function( out guard );  // note it's okay to ignore this particular 'out' parameter
    }

    // ... long function body with early returns etc. ...

}  // ~guard executed on all paths

So perhaps that's another good example of ignoring the output value (until the dtor)?
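A Cpp1 sketch of such a guard, where the object is constructed and then never mentioned again. ScopeExitCounter and do_work are hypothetical names used only for this illustration.

```cpp
#include <cassert>

// Hypothetical guard: records in `exits` that its scope was left.
class ScopeExitCounter {
public:
    explicit ScopeExitCounter(int& exits) : exits_(exits) {}
    ~ScopeExitCounter() { ++exits_; }
    ScopeExitCounter(const ScopeExitCounter&) = delete;
    ScopeExitCounter& operator=(const ScopeExitCounter&) = delete;

private:
    int& exits_;
};

// The guard is created and never used afterwards; only its destructor matters.
bool do_work(bool bail_out_early, int& exits) {
    ScopeExitCounter guard{exits};
    if (bail_out_early) {
        return false;   // destructor runs on this path
    }
    return true;        // and on this one
}
```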

What I would like to express in the code is:

When I use out, the function can only write to the argument,

For out I'd say "the function must write to the argument using = before any other use"... later reading the value that the function itself wrote is totally fine, and may be needed to do multi-step initialization of it before finally handing it back to the caller in the desired state.

When I use in, the function can only read from the argument,

When I use inout, the function can read and write to the argument,

Those two are the status quo, right?

filipsajdak commented 1 year ago

RAII objects are an excellent example! This is also an example of an out variable, but there is also an inout case:

{
    guard: my_raii;
    if something() {
        guard = create_guard( options, here );
    }
    else {
        initialization_function( out guard );
    }
    use_and_modify( guard ); // note it's okay to ignore this particular 'inout' parameter

}  // ~guard executed on all paths

And I also realize that this is a more generic example than those with standard streams (they are done in the RAII way).

I will write more later during the day.

gregmarr commented 1 year ago

While we're discussing what the modifiers are supposed to express, in the 708 repo someone mentioned diagnosing "I said inout but I only wrote to it" and saying that it should recommend changing to out.

filipsajdak commented 1 year ago

Yes, and there will also be a diagnostic suggesting in when a parameter is declared inout but the function only reads from it.

filipsajdak commented 1 year ago

I have mixed feelings about the out meaning. Currently, it means write to but also initialize first.

There is a quote in 708:

“Finally, the [internal January 1981] memo introduces writeonly: ‘There is the type operator writeonly, which is used like readonly, but prevents reading rather than writing...’ “ —B. Stroustrup (D&E, p. 90)

And I missed exactly that. I would like to express that function will only write to the argument.

Use cases:

std::ostream
write only memory,
pipes,

And there could be diagnostics: suggesting using inout when you used out but you also read from the argument.

The current meaning of out is rather “initialize before reading or writing”.

In case of passing UDT as out, as far as I understand it correctly, you need to first call a method that has out this before you will be able to call any other method or do anything else with the object, right?

So, maybe I am wrong. Maybe I just need to define methods in my UDT as out this to be able to use my type as out argument?

E.g., for ostream I can define methods for writing to stream as out, right? But doesn't that change the meaning of the out this in methods?

gregmarr commented 1 year ago

Musing a bit more about these modifiers based on preconditions and what you must and must not do. Are these descriptions accurate, sufficient?

in:

object must be initialized before the call
must read from it in at least one code path in a non-virtual function, or it's diagnosed as unused.
- primitive: is a const value
- udt: can call in this functions or read data members

out:

object should not be initialized before the call
must be initialized before any other use
- primitive: must assign the value
- udt: must assign or call an out this function or write all data members
must initialize it in at least one code path if in a non-virtual function, or it's diagnosed as not out

inout:

object must be initialized before the call
must read from it in at least one code path in a non-virtual function, or it's diagnosed as should be out
- primitive: must read the value
- udt: must call an in this or inout this function or read a data member
must write to it in at least one code path if in a non-virtual function, or it's diagnosed as should be in.
- primitive: must write the value
- udt: must call an out this or inout this function or write a data member

filipsajdak commented 1 year ago

I see also a lack of consistency (or symmetry):

On in argument we can call in this methods.
On inout argument we can call in this, out this, and inout this methods.
On out argument we need to first call out this method and after that we can call methods:
1. in this,
2. out this, and
3. inout this.

Maybe, what we have in the third point is better expressed with the following:

fun: (first_init in x) = {} // case 3.1
fun: (first_init out x) = {} // case 3.2
fun: (first_init inout x) = {} // case 3.3

Then we can have a rule that on the out argument you can call only out methods.

gregmarr commented 1 year ago

Maybe, what we have in the third point is better expressed with the following:

I deleted this from my message above, but I originally had: in: can only call in this functions. out: must initialize and then you can do whatever you want. inout: you can do whatever you want.

I don't see a benefit to out meaning "you can only call out this". I imagine that the number of objects that are truly non-readable is almost non-existent. Even std::ostream has bool good() const.

JohelEGP commented 1 year ago

In case of passing UDT as out, as far as I understand it correctly, you need to first call a method that has out this before you will be able to call any other method or do anything else with the object, right?

That's right. Usually, that'd be operator=.

The current meaning of out is rather “initialize before reading or writing”.

Not really "initialize". It can be already initialized. You can pass initialized arguments to out parameters.

E.g., for ostream I can define methods for writing to stream as out, right? But doesn't that change the meaning of the out this in methods?

Since out parameters can be uninitialized, what'd you do if you were given an uninitialized std::ostream?

And I missed exactly that. I would like to express that function will only write to the argument.

So, maybe I am wrong. Maybe I just need to define methods in my UDT as out this to be able to use my type as out argument?

I think you want a metaclass write_only_reference_wrapper that only permits in and inout uses of the wrapped object.

And there could be diagnostics: suggesting using inout when you used out but you also read from the argument.

IIRC, it's supposed to diagnose to first apply = to the out parameter, or pass it as an out argument, before reading from it. Since not all object parameters of =s are out, that's not exactly right.

Use cases:

std::ostream
write only memory,
pipes,

Your previous example of standard streams included reading the error status of a stream. You wouldn't be able to do that with writeonly. With memory, you wouldn't be able to query its space left.

Arguably, with sufficient indirections, writeonly could be handy. With memory, you can pass it to a writeonly that does the write part once the caller, who has inout access, has asserted having enough capacity. Streaming operator<< >> are not such an example. Those build sentinels which require in or inout access.

Then we can have a rule that on the out argument you can call only out methods.

That's not as useful as it can be. Remember:

For out I'd say "the function must write to the argument using = before any other use"... later reading the value that the function itself wrote is totally fine, and may be needed to do multi-step initialization of it before finally handing it back to the caller in the desired state. -- https://github.com/hsutter/cppfront/issues/231#issuecomment-1489600460

I think you want a metaclass write_only_reference_wrapper that only permits in and inout uses of the wrapped object.

Actually, a metaclass or writeonly parameter passing wouldn't work without C++1 reflection. There's just no way to prohibit const uses of a type or object.
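To show where the limit lies, here is a hand-written Cpp1 sketch of a write-only handle. write_only_ref and store_answer are hypothetical names, and this is only an approximation: it works by exposing one fixed operation, not by selectively hiding the const members of T.

```cpp
#include <cassert>
#include <utility>

// Hypothetical wrapper: exposes overwriting the referenced object and nothing else.
template <typename T>
class write_only_ref {
public:
    explicit write_only_ref(T& target) : target_(&target) {}

    // The only operation: overwrite the target. There is no accessor to read it back.
    void set(T value) { *target_ = std::move(value); }

private:
    T* target_;
};

// Can store a result but cannot observe what was there before.
inline void store_answer(write_only_ref<int> out) { out.set(42); }
```

Forwarding every non-const member of an arbitrary T while rejecting the const ones is the part that cannot be written generically by hand, which matches the reflection caveat above.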

hsutter / cppfront

[BUG] move from last use break code where variable is passed to function with `inout` argument passing #231
