Open p5pRT opened 8 years ago
The end marker for a here-document must be the exact string, without trailing spaces. If the heredoc begins with <<END but the closing line is 'END ' rather than 'END', Perl treats that line as part of the quoted string without warning.
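A minimal sketch of the effect, assuming a program that uses the same delimiter for two consecutive heredocs. The source is built as a string and run through eval so that the trailing space is visible in the listing; `run_with_terminator` is a hypothetical helper, and single-quoted `<<'END'` is used only so that the swallowed code line stays literal (the terminator matching is the same as for a bare `<<END`).

```perl
use strict;
use warnings;

# Build a two-heredoc program as a string; $first_end is the line that is
# meant to close the first heredoc.
sub run_with_terminator {
    my ($first_end) = @_;
    my $src = join "\n",
        q{my @chunks;},
        q{push @chunks, <<'END';},
        q{alpha},
        $first_end,
        q{push @chunks, <<'END';},
        q{beta},
        q{END},
        q{\@chunks;},
        q{};
    my $chunks = eval $src;
    die "did not compile: $@" unless defined $chunks;
    return $chunks;
}

my $clean    = run_with_terminator("END");   # exact terminator
my $trailing = run_with_terminator("END ");  # terminator plus one space

printf "clean terminator: %d chunk(s)\n", scalar @$clean;
printf "trailing space:   %d chunk(s)\n", scalar @$trailing;
```

With the exact terminator the program collects two chunks. With the trailing space it still compiles, but collects a single chunk that contains the 'END ' line and the second push statement as text.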
I appreciate that this is done to be consistent with Unix tradition. The Unix shell likewise requires an exact line with the terminator string and no trailing space. However, I suggest that in this day and age it may be worth a small break with tradition in order to catch the mistake of a trailing space after the terminator.
Firstly, many commonly used text editors do not show any visual difference between a line with trailing space and one without, so two programs which look identical end up with different behaviour.
Secondly, we can no longer assume that Perl programmers are familiar with the dark arts of Unix shell scripting, so they will not know about this 'gotcha'.
Thirdly, Perl has already broken with the behaviour of the shell in the case of a carriage-return character. A trailing carriage return after the heredoc terminator is silently ignored, even on platforms where text files use just a newline character for end of line. The Unix shell does not do this (or at least GNU bash doesn't). Given that we have already broken with the classical shell heredoc behaviour for one kind of whitespace character, why not for others?
In my personal opinion the right fix would be to add a deprecation warning when the heredoc terminator is seen followed by whitespace and end-of-line, along the lines of 'In future Perl versions this will be treated as the end of the heredoc.' Then that change can indeed be made after a release cycle or two.
An alternative would be to simply add a warning when the heredoc terminator appears but has whitespace after it. Although I appreciate the comments that heredocs are intended to be unrestricted with 'anything' appearing in them, I feel the case for a warning is even stronger here given the invisibility of the trailing space characters.
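To illustrate the kind of check being asked for, here is a rough sketch of a scan that could be run outside perl itself. It is a regex heuristic and not a real Perl parser, and `find_suspect_terminators` is a hypothetical name, not part of perl or of any existing module.

```perl
use strict;
use warnings;

# Heuristic only: a line-by-line regex scan. It can miss or misreport unusual
# code, for example "<<" used as a shift operator in front of a bareword.
sub find_suspect_terminators {
    my ($source) = @_;
    my (%seen, @suspects);
    my $line_no = 0;
    for my $line (split /\n/, $source, -1) {
        $line_no++;
        # Remember identifier-like terminators introduced by <<TAG, <<"TAG" or <<'TAG'.
        while ($line =~ /<<~?\s*(["']?)([A-Za-z_]\w*)\1/g) {
            $seen{$2} = 1;
        }
        # A known terminator followed only by spaces or tabs is suspect.
        if ($line =~ /^([A-Za-z_]\w*)[ \t]+$/ && $seen{$1}) {
            push @suspects, $line_no;
        }
    }
    return @suspects;
}

my $code = "print <<END;\nhello\nEND \nworld\nEND\n";
print "suspect terminator on line $_\n" for find_suspect_terminators($code);
```

For the sample input this reports line 3, the 'END ' line. The real terminator on line 5 is not reported because nothing follows it.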
On Wed, 06 Jul 2016 13:50:48 GMT, eda@waniasset.com wrote:
This is a bug report for perl from eda@waniasset.com, generated with the help of perlbug 1.40 running under perl 5.22.2.
The end marker for a here-document must be the exact string, without trailing spaces. If the heredoc begins with <<END but instead of 'END' you have 'END ', Perl treats that as part of the quoted string without warning.
I am unsure as to the specific case you are discussing.
In the attachment, the first heredoc is defined with '<<END;'. The second is defined with '<<END ;'. In both cases there is no whitespace after the terminator 'END'. Both "work" in the sense that both compile and print, even though the second definition looks weird.
Is that the case you are concerned about? Or something else?
Thank you very much. -- James E Keenan (jkeenan@cpan.org)
The RT System itself - Status changed from 'new' to 'open'
On Fri, Sep 29, 2017 at 07:27:58PM -0700, James E Keenan via RT wrote:
I am unsure as to the specific case you are discussing.
In the attachment, the first heredoc is defined with '<<END;'. The second is defined with '<<END ;'. In both cases there is no whitespace after the terminator 'END'. Both "work" in the sense that both compile and print, even though the second definition looks weird.
Is that the case you are concerned about? Or something else?
I think the perceived problem is with a trailing space after the terminator *after* the here doc. As in:
    print <<END;
    This is printed
    END 
    This as well, because the previous line has a trailing space.
    END
I'm not sure how big a problem this is in practice. In many cases, a trailing space will lead to a program which doesn't compile. For the remaining cases, where the program both compiles and doesn't behave badly, a linter (Perl::Critic?) ought to do the trick.
Abigail
On Sat, 30 Sep 2017 16:18:58 GMT, abigail@abigail.be wrote:
I think the perceived problem is with a trailing space after the terminator *after* the here doc. As in:
    print <<END;
    This is printed
    END 
    This as well, because the previous line has a trailing space.
    END
I'm not sure how big a problem this is in practice. In many cases, a trailing space will lead to a program which doesn't compile. For the remaining cases, where the program both compiles and doesn't behave badly, a linter (Perl::Critic?) ought to do the trick.
I agree. I think the feature request should be rejected.
Thank you very much. -- James E Keenan (jkeenan@cpan.org)
On 09/30/2017 07:27 PM, James E Keenan via RT wrote:
I agree. I think the feature request should be rejected.
Agreed.
Thanks for your responses. This bug report is based on real-world code; a lot of code (repeated SQL calls, or printing chunks of output) does use heredocs repeatedly, and if those all use the same delimiter (such as END) it is quite easy to silently include a trailing space, which is not shown in most editors, and end up with code that compiles but is wrong.
I filed the bug report because it's quite awkward that a space character (moreover, one at the end of a line, which is the most difficult kind to see) has such a large effect. But forget the suggestion of adding a warning. That gets bogged down in the usual back and forth about perlcritic (which I love, and run over the codebase regularly, but not in every single edit-test cycle). Could I ask instead whether the whitespace-sensitivity here is really needed?
What would happen if heredoc delimiters became like every other language token and weren't affected by space characters after them? They will always be special because they do depend on start of line, but couldn't they be more relaxed about whitespace at the end? Nothing else in the language is as fussy, and we know that with the mixture of different editors in use, trailing spaces tend to creep into code over time. Wouldn't it be a sensible incremental improvement to allow whitespace after these heredoc terminators?
Note that the current behaviour is not exactly consistent: if you put the start of the heredoc as '<<END ' with a trailing space, it doesn't get terminated by the 'END ' you specified. The trailing space on the starting heredoc delimiter is silently stripped. Why not strip on the closing delimiter too?
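A small sketch of the opening-delimiter case, assuming the tag is written as `<<END ;` with a space before the semicolon. The program is built as a string and run with eval only to keep the whitespace visible in the listing.

```perl
use strict;
use warnings;

# The opening tag has a space after END; the body is closed by a plain "END".
my $src = join "\n",
    q{my $text = <<END ;},
    q{body},
    q{END},
    q{$text;},
    q{};

my $text = eval $src;
die "did not compile: $@" if $@;
print "heredoc value: $text";
```

The unquoted tag is read as the identifier END, so the space after it is not part of the terminator and the plain 'END' line closes the heredoc.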
On Sun, Oct 01, 2017 at 12:10:57PM -0700, Ed Avis via RT wrote:
Note that the current behaviour is not exactly consistent: if you put the start of the heredoc as '<<END ' with a trailing space, it doesn't get terminated by the 'END ' you specified. The trailing space on the starting heredoc delimiter is silently stripped. Why not strip on the closing delimiter too?
This does what you expect it to do:
    print << "END ";
    The line below does not end with a space:
    END
    The line below does end with a space:
    END 
Abigail
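The quoted-terminator form can be checked with a short sketch like the following. The trailing space is written as an explicit string so it cannot be lost in transit; the variable names are illustrative only.

```perl
use strict;
use warnings;

# Quoted terminator "END " (with a space): only a line that is exactly
# "END " closes the heredoc; a plain "END" line is ordinary body text.
my $src = join "\n",
    q{my $text = << "END ";},
    q{The line below does not end with a space:},
    q{END},
    q{The line below does end with a space:},
    "END ",
    q{$text;},
    q{};

my $text = eval $src;
die "did not compile: $@" if $@;
print $text;
```

Here the plain 'END' line ends up inside the string, and the heredoc is closed only by the line that carries the space.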
Migrated from rt.perl.org#128557 (status was 'open')
Searchable as RT128557$