Closed shumphrey closed 2 years ago
+1
This issue is affecting Perl::Critic https://github.com/Perl-Critic/Perl-Critic/issues/591
Hi,
are there any news on this topic? I'd implement it if the approach described above is the right one.
For Perl::Critic to not complain when signatures are used, the workaround is still to put
[-Subroutines::ProhibitSubroutinePrototypes]
in .perlcriticrc
.
So, is the approach described above the way to go? If so, I'd go ahead and give it a try.
@glauschwuffel I don't know. @adamkennedy should know or someone involved in the project.
I'll have a think on this over the weekend.
Any progress on this?
I never said which weekend. 😆
That said, we have THUNK on it.
And it's pretty fucked. In the completely context-free case there is no accurate way to determine whether it's a prototype of a signature, due to:
sub meep (xxxx) {
, so any other distinguishing features can be ignored as they'll be edge casesThus as far as the goal of "allow the policy to determine whether to complain about it being an old-style policy" goes, PPI can't actually help.
The primary way you should go about it is try and have the policy check for signatures being imported into the code space, at which point all prototypes ARE signatures, and prototypes look like attributes.
PPI could maybe be extended to answer the question if a prototype smells like a signature, which could be a number of factors like:
It must also be kept in mind though that import analysis is fraught on account of imports being chainable through use()/require() of other modules and even the possibility of function calls that use()/require() modules into the caller space.
So, thoughts on this?
Being that signatures won't be experimental forever, perhaps we could simplify smells and look for non-sigils
If this acceptable I'll look at patching.
"… look for non-sigils…"
I was researching a fix for this issue and I stumbled upon this text here
To avoid ambiguity, when signatures are enabled the special syntax for prototypes is disabled. There is no attempt to guess whether a parenthesised group was intended to be a prototype or a signature.
So, according to that blurb, prototypes have a different syntax when signatures are enabled. I created the following tests:
#!/usr/bin/env perl
use strict;
use warnings;
sub blah($) {
}
blah();
Which outputs:
Not enough arguments for main::blah at ./proto.pl line 9, near "()"
Execution of ./proto.pl aborted due to compilation errors.
#!/usr/bin/env perl
use strict;
use warnings;
use feature qw(signatures);
sub blah($) {
}
blah();
Which outputs:
The signatures feature is experimental at ./signatures.pl line 8.
Too few arguments for subroutine 'main::blah' at ./signatures.pl line 11.
One would have to do a deeper dive for a better confirmation, but this at least shows that enabling the signatures feature changes what message is outputted.
Seems like PPR fixed it. It may be that we can only solve 98% of this. That would be better than nothing at all. Perfect is the enemy of good enough.
I think I would be happy with "good enough" on this issue.
Hi all.
What is needed for motion on this one? This is a continual source of “itch”.
Probably a Pull request?
@oalders submitted this https://github.com/Perl-Critic/PPI/pull/257
That basically just tries to prevent the file from being reformatted when signatures get parsed. It doesn't do the heavy lifting of actually trying to make sense of the signature.
Any update on this?
Because a simple code like my $k = sub ($f = foo()) {};
would turn into this PPI structure where PPI fails to recognise this anonymous subroutine with a signature.
PPI::Document
PPI::Statement::Variable
PPI::Token::Word 'my'
PPI::Token::Whitespace ' '
PPI::Token::Symbol '$k'
PPI::Token::Whitespace ' '
PPI::Token::Operator '='
PPI::Token::Whitespace ' '
PPI::Token::Word 'sub'
PPI::Token::Whitespace ' '
PPI::Token::Prototype '($f = foo()'
PPI::Statement::UnmatchedBrace
PPI::Token::Structure ')'
PPI::Token::Whitespace ' '
PPI::Statement
PPI::Structure::Constructor { ... }
PPI::Token::Structure ';'
We need:
1) A proposal on how to do this so that we can get feedback from @wchristian on whether the approach would be acceptable 2) A volunteer to do the work
@oalders now PPI has been resolved does something need to be done here or are we ok to close this?
@toddr nothing has really changed since my previous comment. PPI
no longer changes the formatting of signatures when saving a document (which is a good step forward), but it still doesn't really know what a signature is.
My understanding is that this would be a non-trivial (but not impossible) task. I don't know what the status of @wchristian is these days but in the absence of input from @wchristian I would say that if someone can come up with a good proposal and a couple of others could review it before implementation, then we could take this on.
With the way things are moving, I think it's going to be an increasing pain point that PPI
can't really deal with this. If there's some company that could donate a developer or sponsor someone to do the work, that might get it moving. (That's not directed at any company in particular).
I would say that if someone can come up with a good proposal and a couple of others could review it before implementation, then we could take this on.
Is this a good proposal:
use v5.36;
is detected, only interpret prototypes as that which is provided to :prototype
.sub foo :prototype(prototype context) { }
sub bar (signature context) { }
sub baz :prototype(prototype context) (signature context) { }
There will likely be a lot of cases where PPI
needs to parse something, but it doesn't have the use
statements, since it can parse fragments of code. I think that has historically been the biggest blocker.
If so, that sounds problematic since a pragma, especially features
and experimental
, can change the parsing semantics of anything. Perhaps that should be a bigger issue that we look at then.
As to the "fragments of code" I don't see this as being relevant. If the fragment includes the use v5.36
the parsing semantics are changed. So even if PPI parses fragments it would seem the fragment must include the use statement which changes semantics? That is to say, a fragment will parse differently depending on the document it's contained in, so I'm not sure how a fragment can be useful (unless it includes the environment).
I pass fragments of code to https://metacpan.org/dist/App-PPI-Dumper/view/script/ppi_dumper pretty regularly. If I want to see how PPI
parses something without including hundreds of lines of code, I can do that with this tool. There's nothing to stop me from passing
sub foo ($one, $two) { ... }
PPI::Document
PPI::Statement::Sub
PPI::Token::Word 'sub'
PPI::Token::Whitespace ' '
PPI::Token::Word 'foo'
PPI::Token::Whitespace ' '
PPI::Token::Prototype '($one, $two)'
PPI::Token::Whitespace ' '
PPI::Structure::Block { ... }
PPI::Token::Whitespace ' '
PPI::Statement
PPI::Token::Operator '...'
PPI::Token::Whitespace ' '
PPI::Token::Whitespace '\n'
That's a valid use case.
PPI cannot depend on context to determine if the signatures feature is enabled. Its only possible job is to successfully parse them as a prototype-or-signature. It's up to perlcritic rules and other higher level code to have context based heuristics to make a further determination.
There is currently no foolproof method for determining the parsing state of dynamic Perl code with static parsing, which is a common issue with PPI based parsing. There's really nothing that can be done about that except come up with something smarter yet still static, which LeoNerd has had thoughts on.
There will likely be a lot of cases where
PPI
needs to parse something, but it doesn't have theuse
statements, since it can parse fragments of code. I think that has historically been the biggest blocker.
and more commonly, the signatures feature can be enabled by any use
statement or BEGIN block. PPI can't possibly approach this problem.
and more commonly, the signatures feature can be enabled by any use statement or BEGIN block. PPI can't possibly approach this problem.
At this point, you may as well just argue against features
, experimental
, and Perl's versioning. In a different world, we'd not have any of this, and a new version of Perl like 5.32 would just break backwards compat with older perls, and the parsing semantic would obviously be such that it must be different between two versions of Perl.
Now we have a model where we don't have to break backwards compat with older perl. It seems like an awkward position to argue from that we should NOT be able to statically analyze Perl under newer semantics because we either,
By any degree, this is another case where Perl in striving to meet the 0.0001% of use cases missing what the users want and moreover what they'd already expect to happen.
I'm not arguing against anything. This is a PPI-specific problem due to Perl's dynamic parser, and it can't be fixed by hoping.
By any degree, this is another case where Perl in striving to meet the 0.0001% of use cases missing what the users want and moreover what they'd already expect to happen.
Right, maybe there's a middle ground somewhere. For my use case (at $work
), I can basically assume that any internal code is either a) using signatures or b) at least not using prototypes. I'd like for some way to let PPI
know about this. I don't know if that's an ENV
var or an argument to the constructor or both, but it would be nice to have the option of forcing PPI
to DTRT, even if it doesn't have the full context of the code it's dealing with.
@EvanCarroll PPI will not, ever, abandon the ability to successfully parse even partial, truncated, corrupted or faulty Perl Documents. The documentation explains why: https://metacpan.org/pod/PPI#DESCRIPTION
If what you want requires abandoning that ability, then you will need to use a different module. Maybe PPR?
Outside of that i am happy to consider proposals and pull requests for how to extend PPI functionality without breaking that ability. Olafs suggestions in the post above strike me as useful, and might even be worth implementing as experimental features.
On the other hand, as @Grinnz correctly indicates, by parsing a sig-or-prot into a relevant ambiguous object, code that actually processes the PPI tree can decide on its own how to treat it. As long as PPI communicates back that a document part could be a signature, the wrapping code can decide that it will indeed treat it as one.
While i'm thinking about it, i think the first useful step would be for PPI to TRY to parse any prototype also as a signature, and if successful, indicate in the prototype object that a signature parse is available.
even partial, truncated, corrupted or faulty Perl Documents
Respectfully, I don't understand what that means. To avoid this discussion which clearly is political, let's assume Perl 5.40 (or whatever) adds Corina. In doing so Perl will see role
, class
, method
elevated to keywords, and a whole new parsing syntax for these keywords. This will almost certainly be hidden behind a feature flag. From a mere fragment of code from Perl 5.40, you will never know whether or not these keywords are provided by Corina or they are made by Depry McDerpFace in a seldom-used Object Constructor module from 1995.
You're again given a few options,
Again, I don't see how this is disputable. I know I'm talking to smart people here. But when feature flags change parsing semantics, abstractly it doesn't make sense to parse a fragment divorced from them.
To the extent that you can tinker with the underlying parsing semantics without using the feature flag, the simple answer there is don't support that. That's a much simpler answer then not supporting the language semantics as they're intended to be used.
This is not a political discussion. We are explaining PPI does not and cannot work with context. There is simply no possibility of acknowledging feature flags with PPI itself.
If the presence of the barewords role
, class
and method
don't make any real sense in a document OTHER than as the new keywords for most documents most of the time, I'd probably go down the assume they are Corina direction.
Although we never had use
statements alter semantics, it's never been completely impossible to do so.
There's really only 3 things that are completely non-negotiable. You can't alter tokenization, so Acme::Bleach and Acme::Pony are out of the question, you can't alter nested structure construction, and you can't alter statement boundaries.
Theoretically, with sufficient work and willingness for some perf hits, from memory I always thought it was possible in principle to type the structures and statements once they had been assembled. But in this case, typing the following doesn't really work.
class {
...
}
That's why the lookalike modules always had to put a semi-colon at the end of the look-alike things, because they were really a function, a block parameter, and a statement terminator.
So it's not completely off limits, but I think in practice you are fucked.
Probably, PPI can accept additional options, like "signatures_enabled => 1"? perl-critic and other tools can work with context and pass context knowledge to PPI.
The main goal of this feature for me is enable critic policy that signatures are required. Not the creating universal abitlity to detect if signatures are enabled or not for any code snippet passed to PPI.
So maybe this problem can be solved for bounded context which makes sense for the majority of users.
@elcamlost Please review what i said here: https://github.com/Perl-Critic/PPI/issues/194#issuecomment-1168044274 :) The more i think about it, honestly, i find that an option like signatures_enabled is at the wrong level of fine-grained-ness, and an object indicating parse ambiguity, leaving it to the consuming code to decide how to handle it, would be the correct decision.
@EvanCarroll I'll be blunt, while i haven't been able to put much time into PPI recently due to various reasons which ARE politically grounded (hello perl foundation), the limitations of PPI that i explained are not political at all, but raw and technical. As the current primary maintainer i have put many painful hours into considering the nature of Perl parsing and how to adjust to changing realities without breaking existing use cases. That said, yes, keywords are a valid question, but also a completely different beast from signatures.
I've created a ticket with a suggestion for that here: #273
I'm closing this, as we now have a proof-of-concept: https://github.com/Perl-Critic/PPI/pull/280
Currently PPI treats perl 5.20 signatures as PPI::Token::Prototype
I had a look through the code to see how this works. PPI::Token::Whitespace returns a Prototype if the previous tokens imply it should be.
What is the best approach to support Signatures? Should there be a PPI::Token::Signature and should Prototype.pm return 'Signature' if it encounters a non prototype character? [^\$\@\%\;]