Closed p5pRT closed 17 years ago
When using <> as glob in scalar context, if the pattern contains variables to be interpolated, the iterator is not reset when the value of the variables changes. See attached test case for an example of this behaviour.
The behaviour is quite similar to m//og in scalar context, but in that case it is documented (and removing the /o modifier allows the programmer to change that behaviour).
If changing the behaviour of the "<> as glob" operator is hard, or confusing, or whatever, at least the documentation should be changed to explicitly define the "problem".
Trying to use "glob" in scalar context, I came across two things which don't seem to do the right thing:
######################
my $file = glob "/usr/src/*"; print "$file\n";
my $file2 = glob "/usr/src/*"; print "$file2\n";
# expected: first 2 files from /usr/src
# got: first file from /usr/src, twice
######################
my $filespec = "/usr/src/*";
for (1..5) {
    my $file = glob $filespec;
    print "$file\n";
    $filespec = "/usr/local/*";
}
# expected: first file from /usr/src, then first 4 files from /usr/local
# got: first 5 files from /usr/src
######################
On Mon, Oct 30, 2006 at 11:43:19AM -0800, doug @ tierra. net wrote:
Thank you for your bug report.
Trying to use "glob" in scalar context, I came across two things which don't seem to do the right thing:
######################
my $file = glob "/usr/src/*"; print "$file\n";
my $file2 = glob "/usr/src/*"; print "$file2\n";
# expected: first 2 files from /usr/src
# got: first file from /usr/src, twice
That is the correct behavior. These are two separate calls to glob(), each generating its own file list.
######################
my $filespec = "/usr/src/*";
for (1..5) {
    my $file = glob $filespec;
    print "$file\n";
    $filespec = "/usr/local/*";
}
# expected: first file from /usr/src, then first 4 files from /usr/local
# got: first 5 files from /usr/src
This is also the expected behavior. It's documented in perlop, in the section on I/O Operators:
A (file)glob evaluates its (embedded) argument only when it is starting a new list. All values must be read before it will start over. In list context, this isn't important because you automatically get them all anyway. However, in scalar context the operator returns the next value each time it's called, or "undef" when the list has run out.
Ronald
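For readers following along, here is a minimal sketch of the two behaviours Ronald describes. It uses scratch directories instead of /usr/src and /usr/local so the result does not depend on the machine. The directory contents and the `touch` helper are assumptions made for illustration, and the temporary paths are assumed to contain no whitespace (which the core glob would split on).

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: create an empty file.
sub touch {
    my ($path) = @_;
    open my $fh, '>', $path or die "cannot create $path: $!";
    close $fh;
}

# Two scratch directories standing in for /usr/src and /usr/local.
my $src   = tempdir(CLEANUP => 1);
my $local = tempdir(CLEANUP => 1);
touch("$src/$_")   for qw(a b c);
touch("$local/$_") for qw(x y z);

# Case 1: two glob operators in the source text. Each one keeps its
# own iterator, so both start a fresh list and return the first match.
my $first  = glob "$src/*";
my $second = glob "$src/*";
print "$first\n$second\n";    # the same path, ending in /a, twice

# Case 2: one glob operator called repeatedly. The pattern is used only
# when a new list starts, so changing $filespec mid-list has no effect.
my $filespec = "$src/*";
my @seen;
for (1 .. 3) {
    my $file = glob $filespec;
    push @seen, $file;
    $filespec = "$local/*";   # ignored until the current list runs out
}
print "$_\n" for @seen;       # three paths, all under $src
```
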
The RT System itself - Status changed from 'new' to 'open'
On Monday 30 October 2006 14:20, Ronald J Kimball wrote:
On Mon, Oct 30, 2006 at 11:43:19AM -0800, doug @ tierra. net wrote:
######################
my $filespec = "/usr/src/*";
for (1..5) {
    my $file = glob $filespec;
    print "$file\n";
    $filespec = "/usr/local/*";
}
# expected: first file from /usr/src, then first 4 files from /usr/local
# got: first 5 files from /usr/src
This is also the expected behavior. It's documented in perlop, in the section on I/O Operators:
A (file)glob evaluates its (embedded) argument only when it is starting a new list. All values must be read before it will start over. In list context, this isn't important because you automatically get them all anyway. However, in scalar context the operator returns the next value each time it's called, or "undef" when the list has run out.
It's not clear to me why the second call to glob() when $filespec contains a different value does not start a new list.
-- c
@rgs - Status changed from 'open' to 'rejected'
On 10/30/06, chromatic <chromatic@wgz.org> wrote:
On Monday 30 October 2006 14:20, Ronald J Kimball wrote:
It's not clear to me why the second call to glob() when $filespec contains a different value does not start a new list.
-- c
What apparently happens is that the glob() function sets up an array which it iterates through, behind the scenes, the first time it is called. Which is what it is supposed to do. The tricky part, to me, would be how exactly to try and explain that succinctly in the documentation. I stared at it for a spell yesterday and gave up. The referral to File::Glob doesn't really help that much either, and I did not find documentation of the fact that the glob() you get by default and the glob() you have after using File::Glob are different functions, as the latter does not do the magic-iteration-in-scalar-context thing.
What apparently happens is that the glob() function sets up an array which it iterates through, behind the scenes, the first time it is called.
that's not exactly correct.
# mkdir BLAH; cd BLAH; touch one; touch two; touch three
# perl -le 'print ~~glob($count++ or "*") while 1' | head
one
three
two
4
6
8
#
draft document patch attached.
another possibility.
On Tue, Oct 31, 2006 at 02:21:45PM -0600, David Nicol wrote:
What apparently happens is that the glob() function sets up an array which it iterates through, behind the scenes, the first time it is called.
that's not exactly correct.
# mkdir BLAH; cd BLAH; touch one; touch two; touch three
# perl -le 'print ~~glob($count++ or "*") while 1' | head
one
three
two
4
6
8
#
draft document patch attached.
+The iterator is loaded only when the list is exhausted, which will
+cause repeated calls to an instance of C<glob("1")> in scalar context
+to function as a kind of flip-flop.
The first time the iterator is loaded, there's no list that has been exhausted. I'm also not sure this example makes the behavior clearer. It assumes that the reader understands what glob("1") does and knows what a flip-flop is.
How does this sound?
Once the glob() has begun, the pattern argument is ignored until after the current list has been exhausted, even if the pattern changes in the meantime.
Possibly with an example:
my $pattern = '*.txt';
while (my $file = glob($pattern)) {
    print "$file\n";
    $pattern = '*.jpg';
}
Even though the pattern changes to '*.jpg' after the first file is returned, the glob() will continue returning files ending in .txt.
Ronald
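One way to sidestep the stale pattern, sketched here as an illustration and not as anything proposed in the thread, is to call glob in list context. In list context the whole list is returned at once and no iterator is left pending for the next call. The `first_match` helper and the file names are assumptions made for this sketch.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory with two .txt and two .jpg files (illustrative names).
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(a.txt b.txt c.jpg d.jpg)) {
    open my $fh, '>', "$dir/$name" or die "cannot create $dir/$name: $!";
    close $fh;
}

# Hypothetical helper: return the first match for a pattern.
# The list assignment puts glob in list context, so every call
# expands the pattern it was actually given.
sub first_match {
    my ($pattern) = @_;
    my ($first) = glob $pattern;
    return $first;
}

my $txt = first_match("$dir/*.txt");
my $jpg = first_match("$dir/*.jpg");
print "$txt\n$jpg\n";    # a.txt first, then c.jpg, not b.txt
```

The cost is that the full list is built on every call, which matters only for very large directories.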
There's also the possibility of making a very small addition to the entry, just one more dot to connect into the reader's picture, such as merely changing
In scalar context, glob iterates through such filename expansions, returning undef when the list is exhausted.
to
In scalar context, glob iterates through such filename expansions, returning undef when the list is exhausted, after which EXPR will be evaluated again on the next call.
Which concisely implies that EXPR is not evaluated while the iterator is loaded.
That doesn't address the issue of glob instances in code, which is the root of the OP's understandable confusion. Revising glob entirely to use a table of active iterators keyed by the string value of (defined(EXPR)?(EXPR):$_) might make it more intuitive but would be a big change to the current semantics compared with one iterator for each appearance of glob in Perl code.
I have been calling that appearance an "instance" but I'm not sure if that is the correct term...
There's also the issue of the iterative semantics going away after
use File::Glob ':glob';
which was also unexpected and could be addressed in the docs.
On Tue, Oct 31, 2006 at 04:21:51PM -0600, David Nicol wrote:
There's also the possibility of making a very small addition to the entry, just one more dot to connect into the reader's picture, such as merely changing
In scalar context, glob iterates through such filename expansions, returning undef when the list is exhausted.
to
In scalar context, glob iterates through such filename expansions, returning undef when the list is exhausted, after which EXPR will be evaluated again on the next call.
Which concisely implies that EXPR is not evaluated while the iterator is loaded.
I don't think it's right to say that EXPR is not evaluated. As your example showed, EXPR is always evaluated. It's just that glob() will ignore the result.
Ronald
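The distinction Ronald draws can be made visible with a counter. This is a small sketch under assumed names (`pattern_for`, `$evaluations`, and the two scratch files are all hypothetical): the argument expression runs on every call, but its result is used only when a new list starts.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory with two files (illustrative names).
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(a b)) {
    open my $fh, '>', "$dir/$name" or die "cannot create $dir/$name: $!";
    close $fh;
}

my $evaluations = 0;

# Hypothetical helper: counts how often the pattern expression is evaluated.
sub pattern_for {
    $evaluations++;
    return "$dir/*";
}

my @got;
for (1 .. 2) {
    my $file = glob(pattern_for());   # scalar context: next match from the list
    push @got, $file;
}

# The helper ran on both calls, although only the first result was used
# to build the list that both matches came from.
print "evaluated $evaluations times\n";
print "$_\n" for @got;
```
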
Ronald J Kimball wrote:
Once the glob() has begun, the pattern argument is ignored until after the current list has been exhausted, even if the pattern changes in the meantime.
I know I'll get a spanking from the backwards compatibility police, but...
Instead of documenting a weird and dangerous behavior, why don't we fix it so it Does The Right Thing? Simply store the file pattern from which the list is generated. If a new pattern is passed in, generate a new list.
The current behavior of glob() seems accidentally inherited from <*.c> where it wasn't possible for the pattern of the op to change. When glob() came into being and variable globs became possible, this behavior was discovered and it was documented rather than fixed.
FWIW I just got bitten by overzealous glob caching (though of a different type) not half an hour ago after writing this in a subroutine...
sub has_tests {
    my $has_tests = glob("t/*.t") ? 1 : 0;

    return $has_tests;
}
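To illustrate why this bites: the `?:` test puts glob in scalar context, so each call to the subroutine consumes one entry from the same iterator and eventually sees undef. The sketch below contrasts that with a list-context version. The `$dir` parameter, the sub names, and the single test file are assumptions added so the example can run in a scratch directory.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory containing exactly one test file (illustrative name).
my $dir = tempdir(CLEANUP => 1);
open my $fh, '>', "$dir/basic.t" or die "cannot create $dir/basic.t: $!";
close $fh;

# Scalar-context version, as in the message above but with a directory argument.
sub has_tests_scalar {
    my ($d) = @_;
    return glob("$d/*.t") ? 1 : 0;
}

# List-context version: the full list is fetched on every call,
# so no iterator state survives between calls.
sub has_tests_list {
    my ($d) = @_;
    my @tests = glob("$d/*.t");
    return @tests ? 1 : 0;
}

my @scalar_results = map { has_tests_scalar($dir) } 1 .. 2;
my @list_results   = map { has_tests_list($dir) } 1 .. 2;

print "scalar: @scalar_results\n";   # second call hits the end of the list
print "list:   @list_results\n";
```
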
On 10/31/06, Ronald J Kimball <rjk-perl-p5p@tamias.net> wrote:
I don't think it's right to say that EXPR is not evaluated. As your example showed, EXPR is always evaluated. It's just that glob() will ignore the result.
Right you are. Maybe I should have stopped after I stared at it for a while and couldn't make headway without increasing its size substantially.
untested:
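package Improved::glob;
use File::Glob ();

my %cache;    # pattern string => matches not yet handed out

sub glob(_) {
    # A list is already pending for this pattern: return its next entry,
    # dropping the cache slot once at most one entry remains.
    exists $cache{$_[0]}
        and return shift @{ (@{ $cache{$_[0]} } > 1) ? $cache{$_[0]} : (delete $cache{$_[0]}) };

    # Otherwise expand the pattern and return the first match.
    $cache{$_[0]} = [File::Glob::bsd_glob($_[0])];
    return shift @{ $cache{$_[0]} };
}

# Install the replacement as glob() in the importing package.
sub import { *{ caller() . '::glob' } = \&glob }

1;
__END__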
Migrated from rt.perl.org#40622 (status was 'rejected')
Searchable as RT40622$