This is a bug report for perl from "Anton Tagunov" <tagunov@motor.ru>, generated with the help of perlbug 1.33 running under perl v5.7.3.
Hello, developers! Very little time has passed since I started learning Perl, and so far the docs have helped me get up to speed with a new language. :-)
But there was one subject that remained completely mysterious to me: the Typeglobs. I was completely stuck at understanding them until
sthoenna@efn.org (Yitzchak Scott-Thoennes) (bs"d) and Mark-Jason Dominus <mjd@plover.com>
really helped me out :-)
I looked at what I had come to understand, and it seemed to me that the docs could probably have more on Typeglobs. Not that they are unavoidable nowadays, now that we have hard references and open my $fh, but I believe there is still code out there that uses them, and newcomers will probably have to understand and debug it. Moreover, understanding the Exporter and the like is hardly possible without a clear picture of how the Typeglob machinery works. And finally, it's just nice to understand how it all works, at least for historical reasons :-)
The day after I got the mails from Yitzchak and Mark-Jason I tried to rework my understanding into a reasonably solid piece of documentation. Here it is! I believe the material is quite raw; I probably should have written typeglob instead of Typeglob in the middle of sentences. I always tend to be too wordy when I get to explaining things, so maybe this can be abridged, and maybe some parts are superfluous; any remarks are welcome! The general idea is to make this fuel for a better description of Typeglobs.
I see a place for this in perlmod.pod just before
=head2 Package Constructors and Destructors
but I have no objection to putting it somewhere else. (In a separate perlglobtut.pod?)
With best regards and warmest wishes, - Anton Tagunov
Don't be misled by C<print(*main::foo)> printing C<'*main::foo'>
and C<*main::foo eq '*main::foo'> evaluating to C<1>.
It's best to imagine C<*main::foo> as an instance of a special
complex datatype. Let us use C\
    TYPE Typeglob = RECORD
        SCALAR, ARRAY, HASH, CODE, IO : REFERENCE;
        PACKAGE, NAME : STRING /*immutable*/;
    END;

The following, almost equivalent, expressions serve as variables strictly typed to Typeglob:
*main::foo
$main::{foo}
$main::{'foo'}
(The first one is always of the Typeglob type, and the latter
may be either L\
As Perl code is compiled and identifiers are encountered, the C<< %<package-name>:: >> hash is filled with (references to) newly created Typeglobs. The same identifier in the same package never causes a Typeglob to be created twice.
sub foo $foo %foo @foo &foo *foo foo
all trigger Typeglob creation\, while
$main::{foo} $main::{'foo'}
don't.
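The creation rule above can be sketched as follows. This is a minimal illustration, assuming a modern perl with `strict` enabled; the names `$seen` and `never_mentioned` are hypothetical and exist only for this example.

```perl
use strict;
use warnings;

our $seen = 1;    # compiling this mention creates the *main::seen typeglob

# Looking a name up in the symbol table hash does not create an entry.
my $has_seen  = exists $main::{seen}            ? 1 : 0;
my $has_other = exists $main::{never_mentioned} ? 1 : 0;

print "seen: $has_seen, never_mentioned: $has_other\n";
# prints: seen: 1, never_mentioned: 0
```

Using `exists` keeps the check read-only, so the lookup itself cannot be what creates the entry.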
Typeglobs form a separate scalar data type in Perl and are allowed everywhere regular scalars are allowed. They may become values of scalar variables, members of complex data structures, be passed in and out of functions (as parameters and results), be assigned to each other and back to the C<*main::foo>/C<$main::{foo}> expressions (equivalent expressions are grouped together):

    *main::foo   = *main::bar;
    *main::foo   = $main::{bar};
    $main::{foo} = *main::bar;
    $main::{foo} = $main::{bar};

    $a = *main::foo;
    $a = $main::{foo};

    $c = [ 0, 1, *other::baz ];

    &examine( *other::baz );
    &examine( $other::{'baz'} );

=for comment Are there any ways but the natural one and localizing *foo to create Typeglob objects? Probably no, spell it out here?
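All Typeglob assignments are done I<by reference>. This means that after any assignment of a Typeglob value both the source and the destination share the same Typeglob object. The same holds true if a typeglob is passed into or out of a function.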
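The sharing that results from a typeglob assignment can be sketched like this. It is an illustration only, assuming `strict` and package variables declared with `our`; the variable names are hypothetical.

```perl
use strict;
use warnings;

our ($bar, @bar, $foo, @foo);
$bar = 'bar scalar';
@bar = (1, 2, 3);

*foo = *bar;               # *foo now shares every slot with *bar

$foo = 'set via foo';      # writes the scalar that $bar names too
push @foo, 4;              # likewise for the array slot

my $copy = *bar;           # a typeglob stored in an ordinary scalar
${ *{$copy}{SCALAR} } .= '!';    # still the same underlying scalar

print "$bar (@bar)\n";     # prints: set via foo! (1 2 3 4)
```

Nothing is copied slot by slot here: after the assignment both names lead to the same scalar and the same array.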
If you have an expression that evaluates to a Typeglob value you may access its slots via the following notation:

    ${expr}     # SCALAR
    %{expr}     # HASH
    @{expr}     # ARRAY
    $#{expr}    # ARRAY, access the last element index
    &{expr}     # CODE
    expr        # IO handle

The restrictions on these expressions are the same as those on expressions evaluating to hard references that are dereferenced via S<C<$, %, @, $#, &>,> see L<perlref>.

    # Provided that the 'main::foo' identifier has already been seen by
    # the compiler, *main::foo and $main::{foo} are strictly equivalent
    # (modulo a slight performance difference).
    #
    # That's why the expressions in each of the following pairs are
    # equivalent:
    #
    ${*main::foo}           ${$main::{foo}}           # same as $foo
    %{*main::foo}           %{$main::{foo}}           # same as %foo
    @{*main::foo}           @{$main::{foo}}           # same as @foo
    &{*main::foo}           &{$main::{foo}}           # same as &foo
    readline(*main::foo)    readline($main::{foo})    # readline(foo)

    # an example with a complex data structure
    $main::foo = 'foo';
    @other::bar = (0, 1, 2, 3);
    $outer{OU} = [ 14, { IN => *main::foo }, $other::{bar} ];
    print ${$outer{OU}[1]{IN}}, $#{$outer{OU}[2]};    # prints foo3

    # an example of passing Typeglobs in/out of a function
    our ($foo, $bar) = ('foo', 'bar');
    sub a { (*bar, ${shift()}) }
    my @a = &a(*foo);
    print ${$a[0]}, $a[1];
Another notation related to typeglobs is
    *foo{SCALAR}    # equivalent to \$foo
    *foo{HASH}      # equivalent to \%foo
    *foo{ARRAY}     # equivalent to \@foo
    *foo{CODE}      # equivalent to \&foo
    *foo{IO}        # equivalent to \foo

It lets you obtain hard references to the slots of the C<*main::foo> Typeglob variable. Unlike the previous one, this notation works only with a literal C<*foo> and is not applicable to general expressions evaluating to Typeglobs. You can partially bypass this limitation and obtain references to the slots of a Typeglob by doing:

    # let's assume my $v = *foo;
    \${expr}    # then \$$v gets \$foo
    \%{expr}    # then \%$v gets \%foo
    \@{expr}    # then \@$v gets \@foo
    \&{expr}    # then \&$v gets \&foo

but there's no workaround to obtain a reference to the C\
Moreover, the C<*foo{IO}> notation has issues
covered/to be covered in L\
local *foo;
has the effect of localizing C<*main::foo>, the optimized shorthand
for C<$main::{foo}>. A new temporary Typeglob is
created and assigned to C<*main::foo> (that is, C<$main::{foo}>).
Assignments made earlier are not affected (see
the samples below). Because assignments from
C<*main::foo> (or C<$main::{foo}>) operate on references,
doing something like C<$save=*main::foo> or C<return *main::foo>
will save a reference to the temporary Typeglob. This
keeps it (and its slots) accessible via the saved reference
after the original value of C<*main::foo> is restored,
in the same way that hard references to lexically or dynamically scoped
variables make those variables outlive their scope. If you're
curious you may track the Typeglobs' C\
Returning a temporary Typeglob from a function is the basis of the
infamous C<{local *FH; open FH, 'zzz'; return *FH;}> idiom. (Please refer
to L\
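The idiom of returning a localized typeglob as a filehandle can be sketched as below. This is an illustration rather than a recommendation: `open_string` is a hypothetical helper, and the sketch assumes a perl built with PerlIO so that a scalar reference can be opened as an in-memory file (which avoids needing a file on disk).

```perl
use strict;
use warnings;

# Hypothetical helper: open an in-memory file and hand back the
# localized typeglob as the filehandle.
sub open_string {
    my ($text_ref) = @_;
    local *FH;                                    # fresh temporary typeglob
    open(FH, '<', $text_ref) or die "open: $!";
    return *FH;                                   # the caller keeps it alive
}

my ($first, $second) = ("one\n", "two\n");
my $fh1 = open_string(\$first);
my $fh2 = open_string(\$second);    # does not disturb $fh1

my $line1 = <$fh1>;
my $line2 = <$fh2>;
print $line1, $line2;
```

Each call gets its own temporary typeglob, so the two handles stay independent even though both were opened under the name `FH`. In new code, `open my $fh, ...` achieves the same effect without touching the symbol table.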
C<\*main::foo> is equivalent to C<\$main::{foo}>, for what it's worth, and operates just as one would expect:

    $r = \*foo;
    $${\*foo}    # access $foo
    $$$r         # access $foo
    %$$r         # access %foo
    # compare:
    $rr = \$foo;
    $$rr         # access $foo

Because C<\*main::foo> refers to a value in the C<%main::> cache, it sees the localized version if C<*main::foo> has been localized.
C\<*main::foo> evaluates to C\<'*main::foo'> when being C\
    my $a=*::foo;             # save (the reference to) the
                              # current value of $::{foo}
    $::foo='global';
    print $$a,"\n";           # prints 'global'
    {
        local *foo;
        $::foo='local';
        print $$a,"\n";       # prints 'global'
    }

    $a=\*::foo;               # get a reference to $::{foo}
    $::foo='global';
    print $$$a,"\n";          # prints 'global'
    {
        local *foo;
        $::foo='local';
        print $$$a,"\n";      # prints 'local'
    }

    # watch the Typeglob destruction, prints 0 1 D 2
    package Foo;
    sub new { bless {}, shift; }
    sub DESTROY { print 'D '; }
    package main; our $foo;
    my $a=*foo;       # save (the reference to) the initial
                      # typeglob for foo, now it has two
                      # references to it: $::{foo} and $a
    $foo=Foo->new();
    print '0 ';
    *foo=*boo;        # one reference to that left, $a
    print '1 ';
    undef $a;         # no more references to that typeglob
                      # remain, so now it will be destroyed
                      # together with the object, 'D' is
                      # printed
    print '2 ';

Flags: category=docs severity=low
Site configuration information for perl v5.7.3:
Configured by anthony at Mon Mar 11 18:43:11 2002.
Summary of my perl5 (revision 5 undef) configuration:
  Platform:
    osname=MSWin32, osvers=4.0, archname=MSWin32-x86-multi-thread
    uname=''
    config_args='undef'
    hint=recommended, useposix=true, d_sigaction=undef
    usethreads=undef use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=undef usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cl', ccflags ='-nologo -Gf -W3 -O1 -MD -DNDEBUG -DWIN32 -D_CONSOLE -DNO_STRICT -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -DPERL_MSVCRT_READFIX',
    optimize='-O1 -MD -DNDEBUG',
    cppflags='-DWIN32'
    ccversion='undef', gccversion='', gccosandvers='undef'
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=undef, longlongsize=8, d_longdbl=define, longdblsize=10
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=4
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='link', ldflags ='-nologo -nodefaultlib -release -libpath:"c:\perl15173\lib\CORE" -machine:x86'
    libpth=e:\apps\ds40\VC\lib
    libs= oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib netapi32.lib uuid.lib wsock32.lib mpr.lib winmm.lib version.lib odbc32.lib odbccp32.lib msvcrt.lib
    perllibs=undef
    libc=msvcrt.lib, so=dll, useshrplib=yes, libperl=perl57.lib
  Dynamic Linking:
    dlsrc=dl_win32.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' '
    cccdlflags=' ', lddlflags='-dll -nologo -nodefaultlib -release -libpath:"c:\perl15173\lib\CORE" -machine:x86'
Locally applied patches: DEVEL15172
@INC for perl v5.7.3: c:/perl15173/lib c:/perl15173/site/lib .
Environment for perl v5.7.3: HOME=C:\ LANG (unset) LANGUAGE (unset) LC_ALL=EN_US LD_LIBRARY_PATH (unset) LOGDIR (unset) PATH=e:\apps\ds40\SharedIDE\BIN;e:\apps\ds40\VC\BIN;e:\apps\ds40\VC\BIN\WINNT;E:\apps\ibm\vaj\eab\bin;C:\usr\local\bin\;e:\Program Files\ibm\gsk5\lib;E:\APPS\ROSE\RATION~1\NUTCROOT\bin;E:\APPS\ROSE\RATION~1\NUTCROOT\bin\x11;E:\APPS\ROSE\RATION~1\NUTCROOT\mksnt;e:\java\sun\java131\bin;e:\apps\vbroker\jre\Bin;e:\apps\vbroker\Bin;C:\WINNT\system32;C:\WINNT;c:\util;E:\apps\CacheSys\Bin;C:\Program Files\rksupport;C:\WINNT\ton\bin;E:\apps\rose\common;E:\apps\rose\Rational Test;E:\apps\borland\delphi\Bin;E:\apps\borland\delphi\Projects\Bpl;E:\apps\ibm\IBM\IMNNQ;E:\apps\ibm\db2p\BIN;E:\apps\ibm\db2p\FUNCTION;E:\apps\ibm\db2p\SAMPLES\REPL;E:\apps\ibm\db2p\HELP;e:\apps\ibm\websphere\bin;G:\MSVC50\VC\BIN;G:\MSVC50\VC\BIN\WINNT PERL_BADLANG (unset) SHELL (unset)
In article <10048932891.20020314012525@motor.ru>, Anton Tagunov <tagunov@motor.ru> wrote:
I see a place for this in perlmod.pod just before
=head2 Package Constructors and Destructors
but have no reasons to object against putting it somewhere else. (To a separate perlglobtut.pod?)
I have no opinion on where the best place for this is.
With best regards and warmest wishes, - Anton Tagunov
--------------------------------------------------------
Don't be mislead by C<print(*main::foo)> printing C<'*main::foo'> and C<*main::foo eq 'main::foo'> evaluating to C<1>.
'*main::foo'
It's best to imagine C\<*main::foo> as an instance of special complex datatype. Let us use C\
syntax:
No, please. Seems to me you can describe a record without resorting to another language.

    TYPE Typeglob = RECORD
        SCALAR, ARRAY, HASH, CODE, IO : REFERENCE;
        PACKAGE, NAME : STRING /*immutable*/;
    END;

Thy following almost equivalent expressions serve as variables strictly typed to Typeglob:
Saying Typeglob makes me think it is a special package name (like ref(qr//) eq 'Regexp'). How about all-lowercase typeglob?
*main::foo
$main::{foo}
$main::{'foo'}(The first one is always of the Typeglob type\, and the later may be either L\
or be of the Typeglob type.) Typeglob assignments are always done by reference and never by value.
Not sure what you mean by 'by reference'. Not sure what you mean by 'Typeglob assignments'. Only thing special I know of is assigning a reference into a typeglob (i.e. *FOO = \$x) only assigns into the slot for the reference's type. Otherwise it is a normal scalar assignment, whether there is a typeglob on the left, right, or both, which copies any fields available in the right value (IV/UV, NV, PV, GP, RV, ... ) to the extent possible.
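The slot-only behaviour of assigning a reference into a typeglob can be sketched as follows. This is a small illustration with hypothetical variable names, assuming `strict` and `our` declarations.

```perl
use strict;
use warnings;

our ($x, @x, $alias, @alias);
$x     = 'x scalar';
@x     = ('x array');
@alias = ('alias array');

*alias = \$x;    # a reference on the right fills only the SCALAR slot

print "$alias / @alias\n";    # prints: x scalar / alias array
```

The array slot of `*alias` is left alone, so `@alias` keeps its own contents while `$alias` becomes another name for `$x`.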
As Perl code is compiled and identifiers encourted the C<< %<package-name>:: >> hash is filled with (references to) newly created Typeglobs. The same identifier in the same package never causes a Typeglob to be created twice.
sub foo $foo %foo @foo &foo *foo foo
all trigger Typeglob creation, while
$main::{foo} $main::{'foo'}
don't.
Typeglobs form a separate scalar data type in Perl and are allowed everywhere regular scalars are allowed. They may become values of scalar variables, members of complex data structures, be passed in and out of functions (as parameters and results), be assigned to each other and back to the C<*main::foo>/C<$main::{foo}> expressions (equivalent expressions are grouped together):

    *main::foo   = *main::bar;
    *main::foo   = $main::{bar};
    $main::{foo} = *main::bar;
    $main::{foo} = $main::{bar};

    $a = *main::foo;
    $a = $main::{foo};

    $c = [ 0, 1, *other::baz ];

    &examine( *other::baz );
    &examine( $other::{'baz'} );

=for comment Are there any ways but the natural one and localizing
Do you mean 'of localizing'?
*foo to create Typeglob objects? Probably no, spell it out here?
Symbol::gensym returns a reference to a newly created typeglob that is not connected to any symbol table.
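A short sketch of `Symbol::gensym` in use. It assumes the core `Symbol` module and a perl built with PerlIO, so that the scalar reference passed to `open` works as an in-memory file; the variable names are hypothetical.

```perl
use strict;
use warnings;
use Symbol qw(gensym);

my $sym  = gensym();      # reference to a newly created anonymous typeglob
my $text = "hello\n";

open($sym, '<', \$text) or die "open: $!";    # use it as a filehandle
my $line = <$sym>;
close($sym);

print ref($sym), ": $line";    # prints: GLOB: hello
```

This was the usual way to get an anonymous filehandle before `open my $fh, ...` could create one automatically.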
All Typeglob assignments are done I\
. This means that after any assignment of a Typeglob value both the source and the destination share the same Typeglob object. Same holds true if a typeglob is being passed from/into a function.
Now I know what you meant 'by reference'. You need to clarify 'typeglob assignments' as assigning a typeglob to a scalar (which may or may not be another typeglob).
If you have a expression that evaluates to a Typeglob value you may access its slots via the following notation:
    ${expr}     # SCALAR
    %{expr}     # HASH
    @{expr}     # ARRAY
    $#{expr}    # ARRAY, access the last element index
    &{expr}     # CODE
    expr        # IO handle

That last is a bareword. In the presence of a * prototype it is silently changed to *expr, which is a typeglob. There is no way to access just the IO handle but *FOO{IO}.
The limitations imposed over the expressions are the same as those imposed over expressions evaluating to hard references to be dereferencable via S<C<$, %, @, $#, &>,> see L<prelref|perlref>.
If you mean you can say $x = *FOO; %$x, @$x, $$x, &$x but for a more complex expression in place of $x you need curlies around it, you should just say so. The reader may not be able to distinguish what you are referring to in perlref.
# provided that the 'main::foo' identifier has been seen by the # compiler till the current moment *main::foo and $main::{foo}
s/till the current moment/before/
# are strictly equivalent (modulo a slight the performance)
?? Did you mean 'a slight performance difference'?
    #
    # that's why expressions in each of the following pairs are
    # equivalent
    #
    ${*main::foo}           ${$main::{foo}}           # same as $foo
    %{*main::foo}           %{$main::{foo}}           # same as %foo
    @{*main::foo}           @{$main::{foo}}           # same as @foo
    &{*main::foo}           &{$main::{foo}}           # same as &foo
    readline(*main::foo)    readline($main::{foo})    # readline(foo)

Again, no way to get just the IO handle. That's the whole glob there. I'd omit the readline line.

    # an example with a complex data structure
    $main::foo='foo';
    @other::bar=(0,1,2,3);
    $outer{OU} = [ 14, { IN => *main::foo }, $other::{bar} ];
    print ${$outer{OU}[1]{IN}}, $#{$outer{OU}[2]};    #prints foo3

    #an example of passing Typeglobs in/out of a function
    our ($foo,$bar)=('foo','bar');
    sub a{ (*bar,${shift()}) }
    my @a=&a(*foo);
    print ${$a[0]}, $a[1];

Another notation related to typeglobs is
    *foo{CODE}      # equivalent to \&foo
    *foo{SCALAR}    # equivalent to \$foo
    *foo{HASH}      # equivalent to \%foo
    *foo{ARRAY}     # equivalent to \@foo
    *foo{CODE}      # equivalent to \&foo
    *foo{IO}        # equivalent to \foo

Nope. \foo is a reference to the return value(s) of foo() if a sub foo is defined. Otherwise it is a reference to the string "foo" (with a warning).
it allowes to obtain hard references to slots in C<*main::foo> Typeglob variable. Unlike the previous one this notation does works only with a literal C<*foo> and is not applicable to general expressions evaluating to Typeglobs. You can partially bypass this limitation and obtain references to the slots of a Typeglob by doing:

    # let's assume my $v=\*foo;
    \${expr}    # then \$$v gets \$foo
    \%{expr}    # then \$%v gets \%foo
    \@{expr}    # then \$@v gets \@foo
    \&{expr}    # then \$&v gets \&foo

but there's no workaround to obtain a reference to the C\
slot (C<\foo> in our example).
Nope. You can say *$x{IO} or *{get_a_typeglob()}{IO} or *{$x->{typglb}}{IO}. No workaround needed.
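The forms mentioned in this reply can be sketched as follows. The helper `get_a_typeglob`, the hash `%h` and the variables are hypothetical names modelled on the ones used above; the sketch assumes `strict` and writes to STDOUT only to show that the IO slot is usable.

```perl
use strict;
use warnings;

our $foo = 'scalar';
our @foo = (1, 2, 3);

my $glob = *foo;                        # typeglob held in a scalar
my %h    = (typglb => \*foo);           # reference to a typeglob, in a hash
sub get_a_typeglob { return *foo }      # hypothetical helper

my $sref = *$glob{SCALAR};              # simple scalar: no braces needed
my $aref = *{ $h{typglb} }{ARRAY};      # general expression: wrap in *{ }
my $same = *{ get_a_typeglob() }{SCALAR};

my $out = \*STDOUT;
my $io  = *$out{IO};                    # the IO slot, no workaround needed
print {$io} "$$sref @$aref\n";          # prints: scalar 1 2 3
```

So the `*...{THING}` subscript is not limited to a literal `*foo`; any expression yielding a typeglob, or a reference to one, can appear inside `*{ }`.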
That's as far as I can read right now. I may comment on the rest later. A couple closing comments: you mention the *main::foo <-> $main::{foo} equivalence repeatedly. Why not omit all that and start the document by saying (only much more fleshed out):
Packages are just symbol tables. Symbol tables are hashes. Values in symbol table hashes are called typeglobs. You can access typeglobs either through the symbol table hash (C< $main::{foo} >) or directly, with a '*' prefix (C< *main::foo >). (In the latter case, the entry is created at compile time. In the former, it will not exist unless a $main::foo, sub main::foo, etc. exists.) Typeglobs have multiple slots: SCALAR, CODE, etc. These are normally accessed by $foo, &foo, etc. but if you have a typeglob t you can say *{t}{SCALAR} *{t}{CODE} etc., which is equivalent. (The {} around t are optional if it is a simple scalar.) You can assign typeglobs to scalars and pass them to subs or return them from subs just like any other scalar.
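The summary above can be sketched in a few lines. `Demo`, `$answer` and `@list` are hypothetical names used only for this illustration, which assumes `strict` and variables declared with `our`.

```perl
use strict;
use warnings;

package Demo;                 # hypothetical package for this sketch
our $answer = 42;
our @list   = (1, 2);

package main;

# %Demo:: is the symbol table hash; its values are typeglobs.
my $glob     = $Demo::{answer};                 # through the hash
my $kind     = ref(\$glob);                     # 'GLOB'
my $via_hash = ${ *{$glob}{SCALAR} };           # slot of that typeglob
my $via_star = ${ *Demo::answer{SCALAR} };      # same slot, '*' prefix

my @names = grep { $_ eq 'answer' || $_ eq 'list' } sort keys %Demo::;

print "$kind $via_hash $via_star (@names)\n";
# prints: GLOB 42 42 (answer list)
```

Both routes reach the same typeglob, so the two slot lookups return the same scalar.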
So far I haven't seen anything that mentions the curious fact that you can use typeglobs and refs to typeglobs pretty interchangeably.
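Two places where a typeglob and a reference to it behave alike can be sketched as below. This is an illustration only, with hypothetical variable names; it is not an exhaustive list of where the two forms are interchangeable.

```perl
use strict;
use warnings;

our @items = ('a', 'b');

my $glob = *items;      # the typeglob itself
my $ref  = \*items;     # a reference to it

# Both forms are accepted inside *{ ... }{THING} ...
my $from_glob = *{$glob}{ARRAY};
my $from_ref  = *{$ref}{ARRAY};

# ... and both work as filehandles.
my $out_glob = *STDOUT;
my $out_ref  = \*STDOUT;
print {$out_glob} "via glob: @$from_glob\n";
print {$out_ref}  "via ref:  @$from_ref\n";
```

Both slot lookups return a reference to the same array, `@items`.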
Migrated from rt.perl.org#8839 (status was 'resolved')
Searchable as RT8839$