referenced constant loses readonlyness

p5pRT commented 12 years ago

Migrated from rt.perl.org#109744 (status was 'open')

Searchable as RT109744$

p5pRT commented 11 years ago

From @cpansprout

On Fri Jul 26 00:28:04 2013, sprout wrote:

On Sun Jun 16 06:48:26 2013, sprout wrote:

diff --git a/pod/perlsub.pod b/pod/perlsub.pod
index 027d7be..7fff252 100644
--- a/pod/perlsub.pod
+++ b/pod/perlsub.pod
@@ -1365,7 +1365,7 @@ starts scribbling on your C<@_> parameter list. Ouch! This is all very powerful, of course, and should be used only in moderation to make the world a better place.

-=head2 Constant Functions
+=head2 Inlinable Functions
X\ etc.

This is the one unresolved issue left in this ticket. If I apply the patch, I break the link in constant.pm’s documentation. If I change constant.pm at the same time, I break its link when the next CPAN release is installed in 5.18 or earlier.

I just applied the change to the paragraph, without the header change, in commit e4fde5ca795.

--

Father Chrysostomos

p5pRT commented 11 years ago

From zefram@fysh.org

Father Chrysostomos via RT wrote:

If $a+$b returns a value, not a variable, then we could say that \ and for(...) and func(...) impose "variable context".

The usual term for this is "lvalue context". A variable (that can vary) is an lvalue. A read-only scalar is also an lvalue.

It is instructive to compare against C. The C equivalent of Perl's \ operator is &. C's & operator requires that the operand be an lvalue (but not necessarily mutable), and it is a compile-time error to apply it to a non-lvalue. Perl's pass-by-reference semantics would conflict with any attempt to prevent enreferencement of non-lvalues.
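A short sketch of the three constructs mentioned above (`\`, `for (...)` and a sub call), each applied to an expression rather than a named variable. The variable names and the `peek` helper are made up for illustration. The point is only that Perl accepts all three, where C would reject `&(a + b)` at compile time.

```perl
use strict;
use warnings;

my ($x, $y) = (2, 3);

# \ accepts an expression that is not a named variable.
my $ref = \($x + $y);
print ref($ref), " holds $$ref\n";

# for (...) aliases $_ to whatever scalar the expression yields.
for ($x + $y) {
    print "loop sees $_\n";
}

# Sub arguments are aliased through @_ in the same way, so the callee
# can take a reference to the argument it was handed.
sub peek { return \$_[0] }    # hypothetical helper
my $arg_ref = peek($x + $y);
print ref($arg_ref), " holds $$arg_ref\n";
```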

Are you going to document \'s variable-generating semantics?

-zefram

p5pRT commented 11 years ago

From @cpansprout

On Wed Jul 31 02:42:52 2013, zefram@fysh.org wrote:

Father Chrysostomos via RT wrote:

If $a+$b returns a value, not a variable, then we could say that \ and for(...) and func(...) impose "variable context".

The usual term for this is "lvalue context". A variable (that can vary) is an lvalue. A read-only scalar is also an lvalue.

It is instructive to compare against C. The C equivalent of Perl's \ operator is &. C's & operator requires that the operand be an lvalue (but not necessarily mutable), and it is a compile-time error to apply it to a non-lvalue. Perl's pass-by-reference semantics would conflict with any attempt to prevent enreferencement of non-lvalues.

Are you going to document \'s variable-generating semantics?

I’m not sure how to go about that, nor do I think it is necessary. I always assumed that $a+$b would return a new mutable scalar, and that \($a+$b) just references a scalar that would otherwise have been short-lived. Nothing in the observable behaviour contradicts that view.
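The view described above can be checked with a few lines. This is only a sketch with made-up variable names: it takes a reference to the result of an addition on two ordinary variables, writes through it, and confirms that the operands are untouched.

```perl
use strict;
use warnings;

my ($x, $y) = (2, 3);

# Keep the otherwise short-lived result alive through a reference.
my $sum_ref = \($x + $y);

# Writing through the reference would die if the scalar were read-only.
$$sum_ref += 10;

print "sum scalar now holds $$sum_ref\n";
print "operands are still $x and $y\n";
```

Note that both operands here are variables. Operands that are compile-time constants are a separate case, because the addition may then be folded at compile time.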

This bare value vs variable distinction is not something that is mentioned anywhere in the Perl documentation. Is it something we want to document rigidly, or will it just add to the mental burden? Most of the time it makes no difference.

If we want to put it anywhere, it should go under the documentation for constants, wherever that might be, since that is where it actually matters. Apparently we don’t define the term ‘constant’ clearly anywhere.

One thing that makes this difficult is that constants are currently inconsistent. Making constants created by overload::constant and constant.pm consistently return read-only scalars broke CPAN modules.
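Since the outcome differs between perl versions and between kinds of constant, a probe is safer than an assumption. The sketch below uses a hypothetical `try_assign` helper and a made-up `ANSWER` constant. It reports whether assignment through a reference succeeds, and deliberately states no expected result for the constant cases.

```perl
use strict;
use warnings;
use constant ANSWER => 42;

# Hypothetical helper: attempt to assign through a reference and
# report whether the referenced scalar accepted the write.
sub try_assign {
    my ($label, $ref) = @_;
    my $ok = eval { $$ref = 0; 1 } ? 1 : 0;
    printf "%-16s %s\n", $label, $ok ? "writable" : "read-only";
    return $ok;
}

my $plain = 1;
my $plain_ok = try_assign('plain variable', \$plain);   # baseline
try_assign('\\ANSWER', \ANSWER);                         # constant.pm constant
try_assign('\\42',     \42);                             # literal
```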

At this stage, I am willing to leave things inconsistent, as the changes I have made so far have allowed me to fix the bugs I wanted to fix.

--

Father Chrysostomos

p5pRT commented 11 years ago

From zefram@fysh.org

Father Chrysostomos via RT wrote:

I'm not sure how to go about that, nor do I think it is necessary. I always assumed that $a+$b would return a new mutable scalar, and that \($a+$b) just references a scalar that would otherwise have been short-lived. Nothing in the observable behaviour contradicts that view.

If the behaviour is consistently thus, then we could document a general principle that certain classes of operator generate new variables. However, it's actually not a consistent behaviour of the addition operator, because if the operands are sufficiently constant then it gets constant folded and generates a read-only lvalue instead. Related inconsistency is where this bug report started.

I think it is vitally important that we document which situations create new variables, because it makes a real difference to the semantics of reasonable programs. If generating variables is a feature of the language, then it is reasonable to use those variables by storing new values in them. If we have two references to variables, the program's behaviour is likely to depend a great deal on whether they're distinct variables or two references to the same variable. The documentation should provide enough information for someone using these variables to determine which ones will be distinct. Also, since addition doesn't *always* generate a variable, the documentation must distinguish the variable-generating cases from the read-only-lvalue cases.
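A sketch of how one might observe the distinctions discussed above: whether two evaluations of the same addition yield distinct scalars, and whether a constant-folded addition yields something writable. Variable names are made up. The folded case is only probed, not asserted, because its behaviour depends on the perl version.

```perl
use strict;
use warnings;

my $n = 1;

# Evaluate the same addition twice and keep a reference to each result.
my @refs;
push @refs, \($n + 1) for 1 .. 2;

# References compare numerically by address.
print "results are ",
    ($refs[0] == $refs[1] ? "the same scalar" : "distinct scalars"), "\n";

# Writing through one reference shows whether the other is affected.
${ $refs[0] } = 100;
print "second result now holds ${ $refs[1] }\n";

# Two references to one named variable, for contrast.
my @same = (\$n, \$n);
print "named variable: ",
    ($same[0] == $same[1] ? "same scalar" : "distinct scalars"), "\n";

# Operands that are compile-time constants may be folded, so probe
# the result rather than assume it.
my $folded   = \(1 + 2);
my $writable = eval { $$folded = 4; 1 };
print "folded 1 + 2 is ", ($writable ? "writable" : "read-only"), " here\n";
```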

This bare value vs variable distinction is not something that is mentioned anywhere in the Perl documentation.

It's not necessary to cast the documentation in those terms. It's a useful abstraction in language design, but since Perl doesn't reify the distinction it's not necessarily useful in documenting Perl. In Perl the main concept is the scalar (or the broader concept of scalar/array/hash/glob/etc.), which can always be referenced in an lvalue way, so we never deal with values in isolation. The important distinction is between variable and read-only scalars, and among variables the important feature is which operation created them (so that we can determine which ones are separate storage locations).

At this stage, I am willing to leave things inconsistent, as the changes I have made so far have allowed me to fix the bugs I wanted to fix.

OK, but we do need to document the behaviour. Especially so if it's a feature that we want to keep, but if not then we should at least say something like "the scalar returned by this operation may be either read-only or variable, and this may change in the future".

-zefram

p5pRT commented 11 years ago

From @demerphq

On 26 July 2013 18:16, Zefram <zefram@fysh.org> wrote:

Father Chrysostomos via RT wrote:

I'm afraid the second one sounds stupid because I am biased against it and know not how to express it convincingly.

The position to which you refer comes from a conceptual distinction between variables and values. A value is (conceptually) inherently immutable. A variable is a storage location that contains a value, and is mutable in that it can contain different values at different times. Two variables that presently contain the same value are functionally distinguishable because one can write a new value to one of them and observe that they now contain different values.

The position, then, is an instance of Occam's razor: one should not gratuitously generate variables. A non-lvalue expression, such as $a+1, conceptually yields a value, not a variable. As the Perl language allows the refgen operator to be applied to this expression, inevitably we can get this value into an lvalue situation and try assigning to it. Applying Occam's razor, this process should not have generated a variable, and so assignment must fail. If assignment is not to fail, then we have created a variable somewhere, and whichever operator did that ought to be documented as having that effect. In the case of \($a+1), apparently either the addition or the refgen operator is creating variables, either of which is somewhat surprising.

There is a certain amount of difficulty in applying this idea to Perl, in that Perl has historically been very weak on the distinction between variables and values. Our SV structure serves both purposes: we can't have a pure value without the variability-supporting wrapper. The closest we get to a pure value is an SV with the read-only flag set; this is a good enough implementation for analytical purposes, but it's really an abstraction inversion. The result of values always coming in the structure of a variable, and the read-only flag requiring extra effort to turn on, is that all sorts of things in Perl implicitly create variables, and many data structures have inherent mutability that's difficult to avoid in the rather common case where it's unwanted.
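To illustrate the read-only flag mentioned above, here is a sketch that turns it on for an ordinary scalar. It assumes `Internals::SvREADONLY`, a core function in the `Internals::` namespace that is not a supported public interface, and uses `Scalar::Util::readonly` to query the flag. The variable name is made up.

```perl
use strict;
use warnings;
use Scalar::Util qw(readonly);

my $x = 5;

# Turn the read-only flag on. This is the "extra effort" step: the
# scalar was created mutable and has to be frozen afterwards.
Internals::SvREADONLY($x, 1);
print "flag is ", (readonly($x) ? "set" : "clear"), "\n";

# Assignment now fails at run time instead of changing the value.
my $ok = eval { $x = 6; 1 };
print "assignment ", ($ok ? "succeeded" : "failed: $@");
print "\n" if $ok;
```

The scalar is still an ordinary SV with storage behind it. Only the flag stops writes, which is the abstraction inversion the paragraph describes.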

I think this state, of almost everything being mutable by default, is a natural consequence of developing a language from modest beginnings. A very dynamic approach to an interpreter yields quick wins, and particularly lets you introspect quite easily. Unfortunately the dynamic approach makes it a pain to analyse programs, so it massively impedes compilation, optimisation, correctness proving, and other such things that one wants to do with large programs. Each variable that the programmer didn't actually want to be variable is an obstruction to proving that data flows in the way the programmer relied on.

I recommend Henry Baker's paper "Equal Rights for Functional Objects" <http://www.pipeline.com/~hbaker1/ObjectIdentity.ps.gz> for an examination of issues arising from the question of variability.

The end result of applying Occam's razor to the existence of variables is a language where everything is read-only by default: you only get variables where explicitly requested. It's obviously not feasible to turn Perl into this sort of language. But we'd probably have a better language if we avoided creating variables as much as possible. My comment above about documentation was serious: if we're going to implicitly create variables, the programmer ought to be able to rely on the semantics of these variables, and so the programmer needs to know which operations create them. When the program acquires references to multiple variables, it's vitally important to know which of the variables are distinct, and which are multiple references to the same variable. If the variable-creation semantics turn out to be confusing, well, it's a lot easier to document that an operation doesn't produce variables at all.

I just wanted to thank you for this post. IMO it could be the start of a very useful new perldoc pod file.

cheers, Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 11 years ago

From @cpansprout

On Wed Jul 31 09:17:27 2013, zefram@fysh.org wrote:

Father Chrysostomos via RT wrote:

I'm not sure how to go about that, nor do I think it is necessary. I always assumed that $a+$b would return a new mutable scalar, and that \($a+$b) just references a scalar that would otherwise have been short-lived. Nothing in the observable behaviour contradicts that view.

If the behaviour is consistently thus, then we could document a general principle that certain classes of operator generate new variables. However, it's actually not a consistent behaviour of the addition operator, because if the operands are sufficiently constant then it gets constant folded and generates a read-only lvalue instead. Related inconsistency is where this bug report started.

Actually\, I fixed that in commit 2484f8dbbb.

I think it is vitally important that we document which situations create new variables, because it makes a real difference to the semantics of reasonable programs. If generating variables is a feature of the language, then it is reasonable to use those variables by storing new values in them. If we have two references to variables, the program's behaviour is likely to depend a great deal on whether they're distinct variables or two references to the same variable. The documentation should provide enough information for someone using these variables to determine which ones will be distinct. Also, since addition doesn't *always* generate a variable, the documentation must distinguish the variable-generating cases from the read-only-lvalue cases.

This bare value vs variable distinction is not something that is mentioned anywhere in the Perl documentation.

It's not necessary to cast the documentation in those terms. It's a useful abstraction in language design, but since Perl doesn't reify the distinction it's not necessarily useful in documenting Perl. In Perl the main concept is the scalar (or the broader concept of scalar/array/hash/glob/etc.), which can always be referenced in an lvalue way, so we never deal with values in isolation. The important distinction is between variable and read-only scalars, and among variables the important feature is which operation created them (so that we can determine which ones are separate storage locations).

At this stage, I am willing to leave things inconsistent, as the changes I have made so far have allowed me to fix the bugs I wanted to fix.

The only inconsistency left here is in constant.pm, and I documented it in 842f3911b.

OK, but we do need to document the behaviour. Especially so if it's a feature that we want to keep, but if not then we should at least say something like "the scalar returned by this operation may be either read-only or variable, and this may change in the future".

How does the attached patch look to you?

--

Father Chrysostomos

p5pRT commented 11 years ago

From @cpansprout

Inline Patch

```diff
diff --git a/pod/perlfunc.pod b/pod/perlfunc.pod
index 4de31ac..390c2df 100644
--- a/pod/perlfunc.pod
+++ b/pod/perlfunc.pod
@@ -94,6 +94,13 @@ L for the mechanism. If you are using such a module, see the module's documentation for details of the syntax that it defines.
 
+The return values of functions are generally new variables, meaning that
+you can take references to them and modify them through those references.
+Evaluating the same function call twice (in a loop or subroutine) produces
+two different variables. Functions returning true or false generally
+return the same two read-only scalars each time, though this is not always
+consistent and may change in the future.
+
 =head2 Perl Functions by Category
 X
diff --git a/pod/perlop.pod b/pod/perlop.pod
index 4c26fe7..0792710 100644
--- a/pod/perlop.pod
+++ b/pod/perlop.pod
@@ -21,6 +21,13 @@ repetition or list repetition, depending on the type of the left operand, and C<&>, C<|> and C<^> can be either string or numeric bit operations.
 
+The return values of operators are generally new variables, meaning that
+you can take references to them and modify them through those references.
+Evaluating the same operator twice (in a loop or subroutine) produces two
+different variables. Operators returning true or false generally return
+the same two read-only scalars each time, though this is not always
+consistent and may change in the future.
+
 =head2 Operator Precedence and Associativity
 X X X
```
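The patch's claim about true/false results can be probed as follows. This is a sketch with made-up names. Because the patch itself says the behaviour is not always consistent and may change, the code only reports what the running perl does and asserts nothing about the outcome.

```perl
use strict;
use warnings;

my $n = 1;

# Two different comparison operators that both yield true.
my $t1 = \($n == 1);
my $t2 = \($n < 2);

# Do both references point at one shared scalar?
print "true results share a scalar: ",
    ($t1 == $t2 ? "yes" : "no"), "\n";

# Is that scalar writable on this perl?
my $ok = eval { $$t1 = 0; 1 };
print "true result is ", ($ok ? "writable" : "read-only"), "\n";
```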

p5pRT commented 11 years ago

From zefram@fysh.org

Father Chrysostomos via RT wrote:

How does the attached patch look to you?

Seems woolly. As there's a localised disclaimer of future changeability, it seems that the bulk of the new text is guaranteeing something that will never change. But the actual statement isn't a clear guarantee: it says "generally" this happens but allows for exceptions. So I think it'll mislead.

If the intention is to guarantee some behaviour, state precisely what behaviour is being guaranteed. If the intention is to describe non-guaranteed behaviour, add explicit disclaimer covering the whole description.

Also, I'd be wary of widely guaranteeing creation of new variables at this stage. That's an architectural question that needs debate before we start closing off future options.

-zefram
