Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/

aliasing in signatures #16259

Open p5pRT opened 6 years ago

p5pRT commented 6 years ago

Migrated from rt.perl.org#132472 (status was 'open')

Searchable as RT132472$

p5pRT commented 6 years ago

From zefram@fysh.org

Created by zefram@fysh.org

When we first implemented signatures, the requirement that they involve no vapourware meant that they had to perform only copying of arguments, not aliasing. Since then, aliasing has been added to the language, albeit only experimentally so far. It would be good for signatures to have the option to alias arguments. They should continue to default to copying, to avoid surprising the programmer with the more unusual aliasing behaviour. Aliasing should be determined separately for each parameter.

While scalar parameters can be aliased directly to argument scalars, there is no equivalent operation that would alias an array or hash parameter. There is no array or hash object to alias, only a list of scalars. However, an array or hash parameter corresponding to a sequence of arguments *can* involve aliasing: an array's elements can be aliased to argument scalars, and a hash's values (but not keys) can be aliased to argument scalars.

Another operation that is meaningful and might be included here is aliasing a parameter variable to the referent of a reference argument. There is synergy here with prototypes such as (\@), which encourage the passing of references.

An obvious approach to both kinds of aliasing is to incorporate the reference operator "\" into signatures. This can imitate to some extent the refaliasing syntax, and careful placement of the "\" can help the programmer to think about which reference quantities are being equated, to minimise confusion.

An obvious syntax to start with is "(\$x)", but we must be careful about which operation it should refer to. Considering arrays and hashes, the matching syntax "(\@x)" clearly refers to aliasing the entire array variable (and likewise for hashes), so could only mean that that signature item takes a single argument, which must be an array reference, and the array parameter variable is aliased to the referent of the argument. For consistency, then, "(\$x)" must alias $x to the referent of a scalar ref argument. So (ignoring the issue of experiment warnings and the wording of argument count diagnostics)

  use feature "signatures";
  sub clear_referent (\$x) { $x = undef; }

would behave as

  sub clear_referent {
    die unless @_ == 1;
    {
      use feature "refaliasing";
      \(my $x) = $_[0];
    }
    $x = undef;
  }

A runtime error should occur if the argument supplied for this kind of parameter is not the expected kind of reference.
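For comparison, this kind of referent aliasing can already be approximated by hand with the experimental refaliasing feature (perl 5.22 or later). The sketch below is not the proposed signature syntax, which does not exist; the sub name `clear_array` and the argument count message are made up for illustration. It aliases a lexical `@x` to the referent of the single argument and relies on refaliasing's own run-time check to reject anything that is not an array reference.

```perl
use strict;
use warnings;
use feature 'refaliasing';
no warnings 'experimental::refaliasing';

# Hand-written stand-in for the proposed signature "(\@x)".
# clear_array is a hypothetical name, not part of the proposal.
sub clear_array {
    die "Wrong number of arguments" unless @_ == 1;
    \my @x = $_[0];    # dies at run time unless $_[0] is an array ref
    @x = ();           # empties the caller's array through the alias
    return;
}

my @data = (1, 2, 3);
clear_array(\@data);
print scalar(@data), "\n";    # 0

# A non-reference argument is rejected when the alias is set up.
my $ok = eval { clear_array("not a ref"); 1 };
print $ok ? "no error\n" : "rejected: $@";
```

The error text comes from the refaliasing assignment itself, so a real signature implementation could well word its diagnostic differently.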

If this kind of reference argument is optional, aliasing can still be done, but then the default value expression should be providing a default *reference* value, with the parameter variable aliasing to the referent. That is, the default value expression provides a default argument value, as it does for ordinary optional parameters. For example, in "(\@x = [])" @x would by default be an empty array. "(\@x = (3))" would cause a runtime error in the default case, because 3 is not a reference. Note that it is meaningful for a "\@x" array or "\%x" hash parameter to have a default value, because it corresponds to exactly one scalar argument, whereas it is not meaningful to have a default value for a regular "@x" or "%x" that would correspond to a sequence of arguments.
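The default-reference behaviour can be sketched the same way with plain refaliasing. This is only an emulation under the assumption that the default expression is evaluated when the argument is omitted; `count_items` and `bad_default` are hypothetical names standing in for subs declared with "(\@x = [])" and "(\@x = (3))".

```perl
use strict;
use warnings;
use feature 'refaliasing';
no warnings 'experimental::refaliasing';

# Stand-in for the proposed "(\@x = [])": the default is a reference,
# and @x aliases its referent.
sub count_items {
    die "Too many arguments" if @_ > 1;
    \my @x = @_ ? $_[0] : [];
    return scalar @x;
}

# Stand-in for "(\@x = (3))": the default 3 is not a reference,
# so the aliasing step dies whenever the argument is omitted.
sub bad_default {
    die "Too many arguments" if @_ > 1;
    \my @x = @_ ? $_[0] : 3;
    return scalar @x;
}

print count_items([10, 20]), "\n";    # 2
print count_items(), "\n";            # 0
print eval { bad_default(); 1 } ? "ok\n" : "default rejected\n";    # default rejected
```

Note that `bad_default` still works when an array reference is supplied; only the default path fails.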

If this kind of parameter is unnamed, the argument type check should still apply. So "(\@)" will check that the argument is an array ref, and signal an error if it is not, but will throw away the argument if it is of the correct kind. "(\@ = ...)" should still evaluate the expression in the default case, and error if the value is not an array reference, but should otherwise throw the value away. "(\@ =)" should be legal just as "($ =)" is: it should perform the reference type check on a supplied argument, and should do nothing (with no error) for an omitted argument.
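As a rough illustration of the unnamed case, the check-then-discard behaviour of "(\@ =)" might be written by hand as follows. The sub name and error messages are invented, and using `Scalar::Util::reftype` to decide what counts as an array reference is an assumption; the proposal does not say how blessed or overloaded references should be treated.

```perl
use strict;
use warnings;
use Scalar::Util 'reftype';

# Stand-in for the proposed unnamed "(\@ =)": check a supplied argument,
# then discard it; an omitted argument is not an error.
# takes_optional_array_ref is a hypothetical name.
sub takes_optional_array_ref {
    die "Too many arguments" if @_ > 1;
    return 1 unless @_;    # argument omitted: nothing to check
    die "Argument is not an array reference"
        unless (reftype($_[0]) // '') eq 'ARRAY';
    return 1;              # argument checked and thrown away
}

print takes_optional_array_ref([1, 2]), "\n";    # 1
print takes_optional_array_ref(), "\n";          # 1
print eval { takes_optional_array_ref({}); 1 } ? "accepted\n" : "rejected\n";    # rejected
```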

Now, aliasing to the argument scalars themselves. In terms of refaliasing syntax, the operation for a scalar parameter corresponds to putting a reference operator on both sides of the notional assignment operator. There is no equivalent for arrays aliasing their elements or hashes aliasing their values, and since we don't actually have an assignment operator in the signature in the usual case we can't so easily place the reference operator to imitate aliasing assignment. I suggest instead postfixing the "\" to the parameter name, hinting at a notional "\=" aliasing operator. So (ignoring the issue of experiment warnings and the wording of argument count diagnostics)

  use feature "signatures";
  sub clear ($x\) { $x = undef; }

would behave as

  sub clear {
    die unless @_ == 1;
    {
      use feature "refaliasing";
      \(my $x) = \$_[0];
    }
    $x = undef;
  }

With array and hash parameters we'd have "(@x\)" aliasing array elements and "(%x\)" aliasing hash values.
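To make the three cases concrete, here is a hand-rolled approximation using today's experimental refaliasing rather than the proposed postfix "\" syntax. The sub names `double_all` and `bump_values` are hypothetical, and the loop used for the hash case is just one way to alias values pair by pair; it assumes the arguments arrive as a key/value list.

```perl
use strict;
use warnings;
use feature 'refaliasing';
no warnings 'experimental::refaliasing';

# Stand-in for "($x\)": $x aliases the argument scalar itself.
sub clear {
    die "Wrong number of arguments" unless @_ == 1;
    \my $x = \$_[0];
    $x = undef;
    return;
}

# Stand-in for "(@x\)": each element of @x aliases one argument scalar.
sub double_all {
    my @x;
    \(@x) = \(@_);
    $_ *= 2 for @x;
    return;
}

# Stand-in for "(%x\)": keys are copied, values alias argument scalars.
sub bump_values {
    die "Odd number of arguments" if @_ % 2;
    my %x;
    for (my $i = 0; $i < @_; $i += 2) {
        \$x{ $_[$i] } = \$_[ $i + 1 ];
    }
    $_++ for values %x;
    return;
}

my $v = 5;
clear($v);
print defined $v ? "defined\n" : "undef\n";    # undef

my ($p, $q) = (1, 2);
double_all($p, $q);
print "$p $q\n";    # 2 4

bump_values(a => $p, b => $q);
print "$p $q\n";    # 3 5
```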

If an argument is optional, following the logic of the notional "\=" operator, the default expression still provides a default argument, but now it's not just a default value, it's a default *lvalue*, which will be referenced. For example, in "($x \= 3)" $x would by default have a value of 3, and would be unwritable, being an alias to the read-only 3. It's not meaningful to have an array or hash parameter that does this kind of aliasing be optional: there's nothing defective about having zero arguments to put into an array or hash.
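The read-only default can be observed today with refaliasing. The following is a sketch only: `try_write` is a made-up name, and taking a reference to the literal 3 is assumed to be a fair stand-in for what "($x \= 3)" would do in the default case.

```perl
use strict;
use warnings;
use feature 'refaliasing';
no warnings 'experimental::refaliasing';

# Stand-in for the proposed "($x \= 3)": alias the argument scalar if
# one was passed, otherwise alias the read-only constant 3.
sub try_write {
    die "Too many arguments" if @_ > 1;
    \my $x = @_ ? \$_[0] : \3;
    # Writing through the alias fails when $x aliases the constant.
    return eval { $x = 0; 1 } ? "written" : "read-only";
}

my $v = 7;
print try_write($v), "\n";    # written
print "$v\n";                 # 0
print try_write(), "\n";      # read-only
```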

This kind of aliasing is mostly a no-op on unnamed parameters. "($\)" should behave the same as "($)", and "(@\)" the same as "(@)". "($ \= ...)" should still evaluate the expression in the default case, for side effects, just as "($ = ...)" does. "($ \=)" should be legal just as "($ =)" is, producing no error in the default case.

It's also necessary to think about how aliasing would interact with smartmatch type constraints, or any other kind of type constraints that get added to the signature language.

Perl Info

```
Flags:
    category=core
    severity=wishlist

Site configuration information for perl 5.27.5:

Configured by zefram at Fri Oct 20 23:24:00 BST 2017.

Summary of my perl5 (revision 5 version 27 subversion 5) configuration:

  Platform:
    osname=linux
    osvers=3.16.0-4-amd64
    archname=x86_64-linux-thread-multi
    uname='linux barba.rous.org 3.16.0-4-amd64 #1 smp debian 3.16.43-2+deb8u2 (2017-06-26) x86_64 gnulinux '
    config_args='-des -Dprefix=/home/zefram/usr/perl/perl_install/perl-5.27.5-i64-f52 -Duselargefiles -Dusethreads -Uafs -Ud_csh -Uusesfio -Uusenm -Duseshrplib -Dusedevel -Uversiononly -Ui_db'
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=define
    usemultiplicity=define
    use64bitint=define
    use64bitall=define
    uselongdouble=undef
    usemymalloc=n
    default_inc_excludes_dot=define
    bincompat5005=undef
  Compiler:
    cc='cc'
    ccflags ='-D_REENTRANT -D_GNU_SOURCE -fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2'
    optimize='-O2'
    cppflags='-D_REENTRANT -D_GNU_SOURCE -fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
    ccversion=''
    gccversion='4.9.2'
    gccosandvers=''
    intsize=4
    longsize=8
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long'
    ivsize=8
    nvtype='double'
    nvsize=8
    Off_t='off_t'
    lseeksize=8
    alignbytes=8
    prototype=define
  Linker and Libraries:
    ld='cc'
    ldflags =' -fstack-protector-strong -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/4.9/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib
    libs=-lpthread -lnsl -ldb -ldl -lm -lcrypt -lutil -lc
    perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
    libc=libc-2.19.so
    so=so
    useshrplib=true
    libperl=libperl.so
    gnulibc_version='2.19'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=so
    d_dlsymun=undef
    ccdlflags='-Wl,-E -Wl,-rpath,/home/zefram/usr/perl/perl_install/perl-5.27.5-i64-f52/lib/5.27.5/x86_64-linux-thread-multi/CORE'
    cccdlflags='-fPIC'
    lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector-strong'

@INC for perl 5.27.5:
    /home/zefram/usr/perl/perl_install/perl-5.27.5-i64-f52/lib/site_perl/5.27.5/x86_64-linux-thread-multi
    /home/zefram/usr/perl/perl_install/perl-5.27.5-i64-f52/lib/site_perl/5.27.5
    /home/zefram/usr/perl/perl_install/perl-5.27.5-i64-f52/lib/5.27.5/x86_64-linux-thread-multi
    /home/zefram/usr/perl/perl_install/perl-5.27.5-i64-f52/lib/5.27.5

Environment for perl 5.27.5:
    HOME=/home/zefram
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/home/zefram/usr/perl/perl_install/perl-5.27.5-i64-f52/bin:/home/zefram/usr/perl/util:/home/zefram/pub/x86_64-unknown-linux-gnu/bin:/home/zefram/pub/common/bin:/usr/bin:/bin:/usr/local/bin:/usr/games
    PERL_BADLANG (unset)
    SHELL=/usr/bin/zsh
```

p5pRT commented 6 years ago

From @ilmari

Zefram (via RT) <perlbug-followup@perl.org> writes:

> When we first implemented signatures, the requirement that they involve no vapourware meant that they had to perform only copying of arguments, not aliasing. Since then, aliasing has been added to the language, albeit only experimentally so far. It would be good for signatures to have the option to alias arguments. They should continue to default to copying, to avoid surprising the programmer with the more unusual aliasing behaviour. Aliasing should be determined separately for each parameter.

[snip details]

I've implemented this part on the smoke-me/ilmari/signature-refaliasing refaliasing branch, and meant to start a discussion about it, but ran out of tuits.

> Now, aliasing to the argument scalars themselves. In terms of refaliasing syntax, the operation for a scalar parameter corresponds to putting a reference operator on both sides of the notional assignment operator. There is no equivalent for arrays aliasing their elements or hashes aliasing their values, and since we don't actually have an assignment operator in the signature in the usual case we can't so easily place the reference operator to imitate aliasing assignment. I suggest instead postfixing the "\" to the parameter name, hinting at a notional "\=" aliasing operator. So (ignoring the issue of experiment warnings and the wording of argument count diagnostics)

I did consider the case of aliasing the actual passed value instead of its referent, but not long enough to come up with any syntax for it.

> With array and hash parameters we'd have "(@x\)" aliasing array elements and "(%x\)" aliasing hash values.

> If an argument is optional, following the logic of the notional "\=" operator, the default expression still provides a default argument, but now it's not just a default value, it's a default *lvalue*, which will be referenced. For example, in "($x \= 3)" $x would by default have a value of 3, and would be unwritable, being an alias to the read-only 3. It's not meaningful to have an array or hash parameter that does this kind of aliasing be optional: there's nothing defective about having zero arguments to put into an array or hash.

Could this \= operator be generalised to work outside signatures, by introducing an actual \= op?

Then

  $foo \= $bar;

would be equivalent to

  \$foo = \$bar;

and

  @foo \= @bar;

would alias each element of @foo to the corresponding element of @bar (which I don't think can be done with the current refaliasing syntax).

- ilmari
--
"A disappointingly low fraction of the human race is, at any given time, on fire." - Stig Sandbeck Mathisen

p5pRT commented 6 years ago

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented 6 years ago

From zefram@fysh.org

Dagfinn Ilmari Mannsaker wrote:

> Could this \= operator be generalised to work outside signatures, by introducing an actual \= op?

That's possible, yes. We'd have to think more about the spelling, but I think an actual "\=" operator is considerably less ugly than the postfix "\" in the signature.

>   @foo \= @bar;

> would alias each element of @foo to the corresponding element of @bar (which I don't think that can be done with the current refaliasing syntax).

It *can* be done with current refaliasing:

  \(@foo) = \(@bar);

but the notional

  %foo \= ...;

can't be done with any single refaliasing assignment. (You can iterate over the list yourself, aliasing one value at a time.)

-zefram
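
For readers who want to try the two cases discussed in this exchange, a short sketch with the existing experimental feature follows. The variable names `@foo`, `@bar` and `%foo` come from the thread; `%src` is a made-up source hash, and the one-value-at-a-time loop is only one possible reading of "iterate over the list yourself".

```perl
use strict;
use warnings;
use feature 'refaliasing';
no warnings 'experimental::refaliasing';

# Element-wise array aliasing in a single refaliasing assignment.
my @bar = (1, 2, 3);
my @foo;
\(@foo) = \(@bar);
$foo[0] = 10;
print "$bar[0]\n";    # 10

# Hash values have no single-assignment form, so alias them one at a time.
my %src = (a => 1, b => 2);
my %foo;
\$foo{$_} = \$src{$_} for keys %src;
$foo{a} = 100;
print "$src{a}\n";    # 100
```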