Perl / perl5

šŸŖ The Perl programming language
https://dev.perl.org/perl5/
Other
1.88k stars 531 forks source link

possible invalid syntax error generated #5837

Closed p5pRT closed 12 years ago

p5pRT commented 21 years ago

Migrated from rt.perl.org#16249 (status was 'resolved')

Searchable as RT16249$

p5pRT commented 21 years ago

From chrisallan@uk.ibm.com

Created by chrisallan@uk.ibm.com

When I run the following script I get a syntax error. However\, when I run it without the print statement I don't get a syntax error. As far as I can tell I shouldn't get an error in either case\, but I would have thought that the print wouldn't generate an error.

print ''; eval create_code_to_eval();

I've tested it on Activestate Perl 5.6 on Windows\, Perl 5.6.0 on HP-UX and 5.005_03 on Solaris\, all of which behave in the same way.

Apologies if this isn't a bug and I'm doing something stupid...

If you need to contact me and my email address no longer works\, please write to jjrscott@​uk.ibm.com

Perl Info ``` Flags: category=core severity=medium Site configuration information for perl v5.6.1: Configured by allanc1 at Wed Jan 2 17:16:07 2002. Summary of my perl5 (revision 5 version 6 subversion 1) configuration: Platform: osname=MSWin32, osvers=4.0, archname=MSWin32-x86-multi-thread uname='' config_args='undef' hint=recommended, useposix=true, d_sigaction=undef usethreads=undef use5005threads=undef useithreads=define usemultiplicity=define useperlio=undef d_sfio=undef uselargefiles=undef usesocks=undef use64bitint=undef use64bitall=undef uselongdouble=undef Compiler: cc='cl', ccflags ='-nologo -O1 -MD -DNDEBUG -DWIN32 -D_CONSOLE -DNO_STRICT -DHAVE_DES_FCRYPT -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DPERL_MSVCRT_READFIX', optimize='-O1 -MD -DNDEBUG', cppflags='-DWIN32' ccversion='', gccversion='', gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234 d_longlong=undef, longlongsize=8, d_longdbl=define, longdblsize=10 ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=4 alignbytes=8, usemymalloc=n, prototype=define Linker and Libraries: ld='link', ldflags ='-nologo -nodefaultlib -release -libpath:"C: \Perl\lib\CORE" -machine:x86' libpth="C:\Program Files\Microsoft Visual Studio\VC98\mfc\lib" "C: \Program Files\Microsoft Visual Studio\VC98\lib" "C:\Program Files\IBM\Websphere MQ\Tools\Lib" "C:\Perl\lib\CORE" libs= oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib netapi32.lib uuid.lib wsock32.lib mpr.lib winmm.lib version.lib odbc32.lib odbccp32.lib msvcrt.lib perllibs= oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib netapi32.lib uuid.lib wsock32.lib mpr.lib winmm.lib version.lib odbc32.lib odbccp32.lib msvcrt.lib libc=msvcrt.lib, so=dll, useshrplib=yes, libperl=perl56.lib Dynamic Linking: dlsrc=dl_win32.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' ' cccdlflags=' ', lddlflags='-dll -nologo -nodefaultlib -release -libpath:"C:\Perl\lib\CORE" -machine:x86' Locally applied patches: ACTIVEPERL_LOCAL_PATCHES_ENTRY @INC for perl v5.6.1: C:/Perl/lib C:/Perl/site/lib . Environment for perl v5.6.1: HOME (unset) LANG (unset) LANGUAGE (unset) LD_LIBRARY_PATH (unset) LOGDIR (unset) PATH=C:\texmf\miktex\bin;c:\openssl\bin;C:\Perl\bin\;C:\MQTest\bin;C: \Perl\bin\;C:\WINNT\system32;C:\WINNT;C:\WINNT\System32\Wbem;C: \Utilities;C:\Program Files\IBM\Trace Facility;C:\Program Files\Personal Communications;C:\Notes;C:\Program Files\IBM\MQSeries\bin;C:\Program Files\IBM\MQSeries\tools\c\samples\bin;C:\Program Files\ObjREXX;C:\Program Files\ObjREXX\OODIALOG;i:\cts\cts-L020114_D\ibd\bin;c:\vslick\win;C: \MQTest\bin;C:\Program Files\Microsoft Visual Studio\Common\Tools\WinNT;C: \Program Files\Microsoft Visual Studio\Common\MSDev98\Bin;C:\Program Files\Microsoft Visual Studio\Common\Tools;C:\Program Files\Microsoft Visual Studio\VC98\bin;C:\CMVC\exe;C:\CMVC\exe\bin;C:\perltidy-20010903; PERL_BADLANG (unset) SHELL (unset) Chris Allan Websphere MQ Test IBM Hursley, Mail point: 211 Hursley Park, Winchester, SO21 2JN Tele: 245229 Ext: 01962 815229 chrisallan@uk.ibm.com ```
p5pRT commented 21 years ago

From @schwern

On Fri\, Aug 16\, 2002 at 01​:44​:12PM -0000\, Chris Allan wrote​:

When I run the following script I get a syntax error. However\, when I run it without the print statement I don't get a syntax error. As far as I can tell I shouldn't get an error in either case\, but I would have thought that the print wouldn't generate an error.

print ''; eval create_code_to_eval();

I've tested it on Activestate Perl 5.6 on Windows\, Perl 5.6.0 on HP-UX and 5.005_03 on Solaris\, all of which behave in the same way.

I can't get a syntax error from the code above\, just a run-time error because create_code_to_eval() doesn't exist.

--

Michael G. Schwern \schwern@​pobox\.com http​://www.pobox.com/~schwern/ Perl Quality Assurance \perl\-qa@​perl\.org Kwalitee Is Job One Home of da bomb

p5pRT commented 21 years ago

From chrisallan@uk.ibm.com

Hi\,

I've checked the code I sent you\, and I only get the error when I put a space before the parentheses eg

print ''; eval create_code_to_eval ();

(I hadn't realised this when I sent the original code\, sorry) I've been having problems with this code on 5 or 6 different operating systems so I don't think it's a problem with a particular build or environment. Without the print statement it compiles fine.

Hope this helps.

Chris Allan Websphere MQ Test IBM Hursley\, Mail point​: 211 Hursley Park\, Winchester\, SO21 2JN Tele​: 245229 Ext​: 01962 815229 chrisallan@​uk.ibm.com

---------+----------------------------> Michael G Schwern (via RT) \<perlbug@​perl.org > Sent by​: \<rt@​x1.develooper .com>
08/17/2002 10​:21
AM
Please respond to
perlbug
---------+---------------------------->
  >--------------------------------------------------------------------------------------------------------------
 
  To​: Chris Allan/UK/IBM@​IBMGB
  cc​:
  Subject​: Re​: [perl #16249] possible invalid syntax error generated
 
 
  >--------------------------------------------------------------------------------------------------------------

On Fri\, Aug 16\, 2002 at 01​:44​:12PM -0000\, Chris Allan wrote​:

When I run the following script I get a syntax error. However\, when I run it without the print statement I don't get a syntax error. As far as I can tell I shouldn't get an error in either case\, but I would have thought that the print wouldn't generate an error.

print ''; eval create_code_to_eval();

I've tested it on Activestate Perl 5.6 on Windows\, Perl 5.6.0 on HP-UX and 5.005_03 on Solaris\, all of which behave in the same way.

I can't get a syntax error from the code above\, just a run-time error because create_code_to_eval() doesn't exist.

--

Michael G. Schwern \schwern@&#8203;pobox\.com http​://www.pobox.com/~schwern/ Perl Quality Assurance \perl\-qa@&#8203;perl\.org Kwalitee Is Job One Home of da bomb

p5pRT commented 21 years ago

From @gbarr

On Sat\, Aug 17\, 2002 at 02​:19​:36AM -0700\, Michael G Schwern wrote​:

On Fri\, Aug 16\, 2002 at 01​:44​:12PM -0000\, Chris Allan wrote​:

When I run the following script I get a syntax error. However\, when I run it without the print statement I don't get a syntax error. As far as I can tell I shouldn't get an error in either case\, but I would have thought that the print wouldn't generate an error.

print ''; eval create_code_to_eval();

I've tested it on Activestate Perl 5.6 on Windows\, Perl 5.6.0 on HP-UX and 5.005_03 on Solaris\, all of which behave in the same way.

I can't get a syntax error from the code above\, just a run-time error because create_code_to_eval() doesn't exist.

This is repeatable with 5.8 if you place a space before the ()

[gbarr@​zipper​:/home/gbarr]$ perl
eval create_code_to_eval (); __END__ Undefined subroutine &main​::create_code_to_eval called at - line 1. [gbarr@​zipper​:/home/gbarr]$ perl print ''; eval create_code_to_eval (); __END__ syntax error at - line 2\, near "create_code_to_eval (" Execution of - aborted due to compilation errors.

Graham.

p5pRT commented 21 years ago

@gbarr - Status changed from 'new' to 'open'

p5pRT commented 12 years ago

From @wolfsage

I believe this is happening because toke.c isn't tracking things as well as it should be. The attached patch 'fixes' this\, but I question whether it's right or not.

It fixes the issue by overwriting PL_last_lop_op in UNIBRACK\, just as UNI2() does.

This is necessary because later code in the parser looks at PL_last_lop_op and if it matches certain criteria (which OP_PRINT matches)\, it parses the eval some_thing (); differently than it should​:

  /* See if it's the indirect object for a list operator. */

  if (PL_oldoldbufptr &&   PL_oldoldbufptr \< PL_bufptr &&   (PL_oldoldbufptr == PL_last_lop   || PL_oldoldbufptr == PL_last_uni) &&   /* NO SKIPSPACE BEFORE HERE! */   (PL_expect == XREF ||   ((PL_opargs[PL_last_lop_op] >> OASHIFT)& 7) == OA_FILEREF))   {   bool immediate_paren = *s == '(';

  /* (Now we can afford to cross potential line boundary.) */   s = SKIPSPACE2(s\,nextPL_nextwhite); #ifdef PERL_MAD   PL_nextwhite = nextPL_nextwhite; /* assume no & deception */ #endif

  /* Two barewords in a row may indicate method call. */

  if ((isIDFIRST_lazy_if(s\,UTF) || *s == '$') &&   (tmp = intuit_method(s\, gv\, cv))) {   op_free(rv2cv_op);   if (tmp == METHOD && !PL_lex_allbrackets &&   PL_lex_fakeeof > LEX_FAKEEOF_LOWLOGIC)   PL_lex_fakeeof = LEX_FAKEEOF_LOWLOGIC;   return REPORT(tmp);   }

  /* If not a declared subroutine\, it's an indirect object. */   /* (But it's an indir obj regardless for sort.) */   /* Also\, if "_" follows a filetest operator\, it's a bareword */

  if (   ( !immediate_paren && (PL_last_lop_op == OP_SORT ||   (!cv &&   (PL_last_lop_op != OP_MAPSTART &&   PL_last_lop_op != OP_GREPSTART))))   || (PL_tokenbuf[0] == '_' && PL_tokenbuf[1] == '\0'   && ((PL_opargs[PL_last_lop_op] & OA_CLASS_MASK) == OA_FILESTATOP))   )   {   PL_expect = (PL_last_lop == PL_oldoldbufptr) ? XTERM : XOPERATOR;   goto bareword;   }   }

The reason I'm not sure if this is the right fix or not (or maybe it is and it's unfortunate that it is) is that it seems we rely too much on the coder having to know when to set/clear PL_last_lop and PL_last_lop_op.

These don't appear to be documented well so I'm not absolutely certain of their purpose\, when they are used\, etc etc\, (and it gets even more confusing with oldbufptr and oldoldbufptr...)

If someone could shed some light on these\, that'd be awesome.

Thanks\,

-- Matthew Horsfall (alh)

p5pRT commented 12 years ago

From @wolfsage

0001-For-16249-Overwrite-PL_last_lop_op-when-eval-is-call.patch ```diff From bd3bc669b066183013b2648758f7717fcd30c0a1 Mon Sep 17 00:00:00 2001 From: Matthew Horsfall (alh) Date: Sun, 11 Dec 2011 17:16:47 -0500 Subject: [PATCH] For #16249 - Overwrite PL_last_lop_op when eval() is called. Otherwise, parsing later on down the road may use the previous value, which, if it was OP_PRINT, causes the parser to fail --- t/lib/warnings/toke | 7 +++++++ toke.c | 1 + 2 files changed, 8 insertions(+), 0 deletions(-) diff --git a/t/lib/warnings/toke b/t/lib/warnings/toke index 25d53a0..4df4fcf 100644 --- a/t/lib/warnings/toke +++ b/t/lib/warnings/toke @@ -998,3 +998,10 @@ print "ok\n" if $@ =~ /Can't find string terminator "\xab" anywhere before EOF/; EXPECT ok +######## +# toke.c +# [perl #16249] +print ''; +eval this_method_is_fake (); +EXPECT +Undefined subroutine &main::this_method_is_fake called at - line 4. diff --git a/toke.c b/toke.c index 2c29c58..26b0fa6 100644 --- a/toke.c +++ b/toke.c @@ -294,6 +294,7 @@ static const char* const lex_state_names[] = { pl_yylval.ival = f; \ PL_bufptr = s; \ PL_last_uni = PL_oldbufptr; \ + PL_last_lop_op = f; \ if (*s == '(') \ return REPORT( (int)FUNC1 ); \ s = PEEKSPACE(s); \ -- 1.7.0.4 ```
p5pRT commented 12 years ago

From @cpansprout

On Sun Dec 11 14​:26​:00 2011\, alh wrote​:

I believe this is happening because toke.c isn't tracking things as well as it should be. The attached patch 'fixes' this\, but I question whether it's right or not.

It fixes the issue by overwriting PL_last_lop_op in UNIBRACK\, just as UNI2() does.

This is necessary because later code in the parser looks at PL_last_lop_op and if it matches certain criteria (which OP_PRINT matches)\, it parses the eval some_thing (); differently than it should​:

            /\* See if it's the indirect object for a list operator\. \*/

            if \(PL\_oldoldbufptr &&
                PL\_oldoldbufptr \< PL\_bufptr &&
                \(PL\_oldoldbufptr == PL\_last\_lop
                 || PL\_oldoldbufptr == PL\_last\_uni\) &&
                /\* NO SKIPSPACE BEFORE HERE\! \*/
                \(PL\_expect == XREF ||
                 \(\(PL\_opargs\[PL\_last\_lop\_op\] >> OASHIFT\)& 7\) ==

OA_FILEREF)) { bool immediate_paren = *s == '(';

                /\* \(Now we can afford to cross potential line

boundary.) */ s = SKIPSPACE2(s\,nextPL_nextwhite); #ifdef PERL_MAD PL_nextwhite = nextPL_nextwhite; /* assume no & deception */ #endif

                /\* Two barewords in a row may indicate method call\. \*/

                if \(\(isIDFIRST\_lazy\_if\(s\,UTF\) || \*s == '$'\) &&
                    \(tmp = intuit\_method\(s\, gv\, cv\)\)\) \{
                    op\_free\(rv2cv\_op\);
                    if \(tmp == METHOD && \!PL\_lex\_allbrackets &&
                            PL\_lex\_fakeeof > LEX\_FAKEEOF\_LOWLOGIC\)
                        PL\_lex\_fakeeof = LEX\_FAKEEOF\_LOWLOGIC;
                    return REPORT\(tmp\);
                \}

                /\* If not a declared subroutine\, it's an indirect

object. */ /* (But it's an indir obj regardless for sort.) */ /* Also\, if "_" follows a filetest operator\, it's a bareword */

                if \(
                    \( \!immediate\_paren && \(PL\_last\_lop\_op ==

OP_SORT || (!cv && (PL_last_lop_op != OP_MAPSTART && PL_last_lop_op != OP_GREPSTART)))) || (PL_tokenbuf[0] == '_' && PL_tokenbuf[1] == '\0' && ((PL_opargs[PL_last_lop_op] & OA_CLASS_MASK) == OA_FILESTATOP)) ) { PL_expect = (PL_last_lop == PL_oldoldbufptr) ? XTERM : XOPERATOR; goto bareword; } }

The reason I'm not sure if this is the right fix or not (or maybe it is and it's unfortunate that it is) is that it seems we rely too much on the coder having to know when to set/clear PL_last_lop and PL_last_lop_op.

These don't appear to be documented well so I'm not absolutely certain of their purpose\, when they are used\, etc etc\, (and it gets even more confusing with oldbufptr and oldoldbufptr...)

If someone could shed some light on these\, that'd be awesome.

That looks correct to me\, but I donā€™t think I understand this code any better than you do.

--

Father Chrysostomos

p5pRT commented 12 years ago

From @wolfsage

On Sun\, Dec 11\, 2011 at 7​:23 PM\, Father Chrysostomos via RT \< perlbug-followup@​perl.org> wrote​:

On Sun Dec 11 14​:26​:00 2011\, alh wrote​: [...]

The reason I'm not sure if this is the right fix or not (or maybe it is and it's unfortunate that it is) is that it seems we rely too much on the coder having to know when to set/clear PL_last_lop and PL_last_lop_op.

These don't appear to be documented well so I'm not absolutely certain of their purpose\, when they are used\, etc etc\, (and it gets even more confusing with oldbufptr and oldoldbufptr...)

If someone could shed some light on these\, that'd be awesome.

That looks correct to me\, but I donā€™t think I understand this code any better than you do.

What really bugs me about this piece is​:

  /* See if it's the indirect object for a list operator. */

  if (PL_oldoldbufptr &&   PL_oldoldbufptr \< PL_bufptr &&   (PL_oldoldbufptr == PL_last_lop   || PL_oldoldbufptr == PL_last_uni) &&   /* NO SKIPSPACE BEFORE HERE! */   (PL_expect == XREF ||   ((PL_opargs[PL_last_lop_op] >> OASHIFT)& 7) == OA_FILEREF))

We enter this block if PL_oldoldbufptr == PL_last_lop or PL_last_uni\, but then we still match if PL_opargs[PL_last_lop_op].

That just doesn't seem right...

-- Matthew Horsfall (alh)

p5pRT commented 12 years ago

From @Hugmeir

On Sun Dec 11 16​:29​:57 2011\, alh wrote​:

On Sun\, Dec 11\, 2011 at 7​:23 PM\, Father Chrysostomos via RT \< perlbug-followup@​perl.org> wrote​:

On Sun Dec 11 14​:26​:00 2011\, alh wrote​: [...]

The reason I'm not sure if this is the right fix or not (or maybe it is and it's unfortunate that it is) is that it seems we rely too much on the coder having to know when to set/clear PL_last_lop and PL_last_lop_op.

These don't appear to be documented well so I'm not absolutely certain of their purpose\, when they are used\, etc etc\, (and it gets even more confusing with oldbufptr and oldoldbufptr...)

If someone could shed some light on these\, that'd be awesome.

That looks correct to me\, but I donā€™t think I understand this code any better than you do.

What really bugs me about this piece is​:

            /\* See if it's the indirect object for a list 

operator. */

           if \(PL\_oldoldbufptr &&
               PL\_oldoldbufptr \< PL\_bufptr &&
               \(PL\_oldoldbufptr == PL\_last\_lop
                || PL\_oldoldbufptr == PL\_last\_uni\) &&
               /\* NO SKIPSPACE BEFORE HERE\! \*/
               \(PL\_expect == XREF ||
                \(\(PL\_opargs\[PL\_last\_lop\_op\] >> OASHIFT\)& 7\) ==

OA_FILEREF))

We enter this block if PL_oldoldbufptr == PL_last_lop or PL_last_uni\, but then we still match if PL_opargs[PL_last_lop_op].

That just doesn't seem right...

-- Matthew Horsfall (alh)

Seems like this patch is stuck because no one can quite be sure that it'll work right. So who do we have to bug for a CPAN smoke? : D

--hugmeir

p5pRT commented 12 years ago

From @nwc10

On Wed\, Jun 27\, 2012 at 12​:39​:43PM -0700\, Brian Fraser via RT wrote​:

Seems like this patch is stuck because no one can quite be sure that it'll work right. So who do we have to bug for a CPAN smoke? : D

"If it smokes OK\, apply it" sounds somewhat like admitting defeat.

(In that\, if a CPAN smoker became 100% deterministically reliable\, you could eliminate the human\, and just auto-apply any patches that didn't create regressions. Which clearly isn't right.*)

I think it's stuck because everyone hopes that someone else understands the tokeniser better than them\, and that that other person will look at it first.

I'm also sticking my head above the parapet here because I'm confident that I *don't* know the tokeniser at all well\, and that at least 3 people (whom I don't want to name as that seems unfair) actually do know it better than me (although at least 2 of them might not think so).

This an the instance of a more general problem - the core's code is complex\, due to tight\, sometimes unintuitive and ill-understood interaction between the parts. (Thanks at least partly to >18 years of organic growth. "If I were you\, sir\, I wouldn't start from here")

The result is that no-one is confident that they understand it well\, particularly as the more you get to know it\, the more ways you know it can surprise you. And it doesn't all fit in anyone's head. So anything that requires confidence of understanding is actually quite a hard task. Unfortunately\, that means things like reviewing seemingly innocent patches like this.

Nicholas Clark

* But to be fair\, automating detection of *wrongness* would be a big win.

p5pRT commented 12 years ago

From @cpansprout

I have since learnt the lexer a little better and am convinced this patch is correct\, so I have applied it as 5dc13276b28.

(In fact\, youā€™ll see that Larry fixed UNI the same way in commit a0d0e21ea6​:

@​@​ -144\,6 +105\,7 @​@​ static I32 nexttoke = 0;   expect = XTERM\, \   bufptr = s\, \   last_uni = oldbufptr\, \ + last_lop_op = f\, \   (*s == '(' || (s = skipspace(s)\, *s == '(') ? (int)FUNC1 : (int)UNIOP) )

#define UNIBRACK(f) return(yylval.ival = f\, \ )

On Sun Dec 11 16​:29​:57 2011\, alh wrote​:

On Sun\, Dec 11\, 2011 at 7​:23 PM\, Father Chrysostomos via RT \< perlbug-followup@​perl.org> wrote​:

On Sun Dec 11 14​:26​:00 2011\, alh wrote​: [...]

The reason I'm not sure if this is the right fix or not (or maybe it is and it's unfortunate that it is) is that it seems we rely too much on the coder having to know when to set/clear PL_last_lop and PL_last_lop_op.

Thatā€™s why the macros take care of setting it. It gets clear\, as far as I can tell\, whenever the buffer is overwritten with the next line.

These don't appear to be documented well so I'm not absolutely certain of their purpose\, when they are used\, etc etc\,

PL_last_uniā€™s main purpose in life seems to be the ambiguity warning. PL_last_lop is used in various places to see whether the previous token was a listop.

(and it gets even more confusing with oldbufptr and oldoldbufptr...)

The local variable s is the current cursor\, which is carried from one invocation of yylex to the next via PL_bufptr.

PL_oldbufptr is set to PL_bufptr at the beginning of yylex\, and PL_oldoldbufptr to PL_oldbufptr (but not in that order).

So\, in general PL_oldbufptr points to the beginning of the current token\, and PL_oldoldbufptr the previous token.

If someone could shed some light on these\, that'd be awesome.

That looks correct to me\, but I donā€™t think I understand this code any better than you do.

What really bugs me about this piece is​:

            /\* See if it's the indirect object for a list operator\. \*/

           if \(PL\_oldoldbufptr &&
               PL\_oldoldbufptr \< PL\_bufptr &&
               \(PL\_oldoldbufptr == PL\_last\_lop
                || PL\_oldoldbufptr == PL\_last\_uni\) &&
               /\* NO SKIPSPACE BEFORE HERE\! \*/
               \(PL\_expect == XREF ||
                \(\(PL\_opargs\[PL\_last\_lop\_op\] >> OASHIFT\)& 7\) ==

OA_FILEREF))

We enter this block if PL_oldoldbufptr == PL_last_lop or PL_last_uni\, but then we still match if PL_opargs[PL_last_lop_op].

That just doesn't seem right...

The PL_last_uni bit seems suspicious. But it seems to be necessary to let things like readdir through.

--

Father Chrysostomos

p5pRT commented 12 years ago

@cpansprout - Status changed from 'open' to 'resolved'