Perl-Critic / PPI

53 stars 44 forks source link

Support 5.20 signatures #194

Closed shumphrey closed 2 years ago

shumphrey commented 7 years ago

Currently PPI treats perl 5.20 signatures as PPI::Token::Prototype

I had a look through the code to see how this works. PPI::Token::Whitespace returns a Prototype if the previous tokens imply it should be.

What is the best approach to support Signatures? Should there be a PPI::Token::Signature and should Prototype.pm return 'Signature' if it encounters a non prototype character? [^\$\@\%\;]

iynehz commented 7 years ago

+1

This issue is affecting Perl::Critic https://github.com/Perl-Critic/Perl-Critic/issues/591

glauschwuffel commented 6 years ago

Hi,

are there any news on this topic? I'd implement it if the approach described above is the right one.

kberov commented 6 years ago

For Perl::Critic to not complain when signatures are used, the workaround is still to put [-Subroutines::ProhibitSubroutinePrototypes] in .perlcriticrc.

glauschwuffel commented 6 years ago

So, is the approach described above the way to go? If so, I'd go ahead and give it a try.

kberov commented 6 years ago

@glauschwuffel I don't know. @adamkennedy should know or someone involved in the project.

wchristian commented 6 years ago

I'll have a think on this over the weekend.

EvanCarroll commented 5 years ago

Any progress on this?

wchristian commented 5 years ago

I never said which weekend. 😆

That said, we have THUNK on it.

And it's pretty fucked. In the completely context-free case there is no accurate way to determine whether it's a prototype of a signature, due to:

Thus as far as the goal of "allow the policy to determine whether to complain about it being an old-style policy" goes, PPI can't actually help.

The primary way you should go about it is try and have the policy check for signatures being imported into the code space, at which point all prototypes ARE signatures, and prototypes look like attributes.

PPI could maybe be extended to answer the question if a prototype smells like a signature, which could be a number of factors like:

It must also be kept in mind though that import analysis is fraught on account of imports being chainable through use()/require() of other modules and even the possibility of function calls that use()/require() modules into the caller space.

So, thoughts on this?

EvanCarroll commented 5 years ago

Being that signatures won't be experimental forever, perhaps we could simplify smells and look for non-sigils

If this acceptable I'll look at patching.

kberov commented 5 years ago

"… look for non-sigils…"

jtk18 commented 5 years ago

I was researching a fix for this issue and I stumbled upon this text here

To avoid ambiguity, when signatures are enabled the special syntax for prototypes is disabled. There is no attempt to guess whether a parenthesised group was intended to be a prototype or a signature.

So, according to that blurb, prototypes have a different syntax when signatures are enabled. I created the following tests:

#!/usr/bin/env perl

use strict;
use warnings;

sub blah($) {
}

blah();

Which outputs:

Not enough arguments for main::blah at ./proto.pl line 9, near "()"
Execution of ./proto.pl aborted due to compilation errors.
#!/usr/bin/env perl

use strict;
use warnings;

use feature qw(signatures);

sub blah($) {
}

blah();

Which outputs:

The signatures feature is experimental at ./signatures.pl line 8.
Too few arguments for subroutine 'main::blah' at ./signatures.pl line 11.

One would have to do a deeper dive for a better confirmation, but this at least shows that enabling the signatures feature changes what message is outputted.

toddr commented 3 years ago

Seems like PPR fixed it. It may be that we can only solve 98% of this. That would be better than nothing at all. Perfect is the enemy of good enough.

https://rt.cpan.org/Public/Bug/Display.html?id=125616

oalders commented 3 years ago

I think I would be happy with "good enough" on this issue.

FGasper commented 3 years ago

Hi all.

What is needed for motion on this one? This is a continual source of “itch”.

toddr commented 3 years ago

Probably a Pull request?

toddr commented 3 years ago

@oalders submitted this https://github.com/Perl-Critic/PPI/pull/257

oalders commented 3 years ago

That basically just tries to prevent the file from being reformatted when signatures get parsed. It doesn't do the heavy lifting of actually trying to make sense of the signature.

jackdeguest commented 2 years ago

Any update on this? Because a simple code like my $k = sub ($f = foo()) {}; would turn into this PPI structure where PPI fails to recognise this anonymous subroutine with a signature.

PPI::Document
  PPI::Statement::Variable
    PPI::Token::Word    'my'
    PPI::Token::Whitespace      ' '
    PPI::Token::Symbol      '$k'
    PPI::Token::Whitespace      ' '
    PPI::Token::Operator    '='
    PPI::Token::Whitespace      ' '
    PPI::Token::Word    'sub'
    PPI::Token::Whitespace      ' '
    PPI::Token::Prototype   '($f = foo()'
  PPI::Statement::UnmatchedBrace
    PPI::Token::Structure   ')'
  PPI::Token::Whitespace    ' '
  PPI::Statement
    PPI::Structure::Constructor     { ... }
    PPI::Token::Structure   ';'
oalders commented 2 years ago

We need:

1) A proposal on how to do this so that we can get feedback from @wchristian on whether the approach would be acceptable 2) A volunteer to do the work

toddr commented 2 years ago

@oalders now PPI has been resolved does something need to be done here or are we ok to close this?

oalders commented 2 years ago

@toddr nothing has really changed since my previous comment. PPI no longer changes the formatting of signatures when saving a document (which is a good step forward), but it still doesn't really know what a signature is.

My understanding is that this would be a non-trivial (but not impossible) task. I don't know what the status of @wchristian is these days but in the absence of input from @wchristian I would say that if someone can come up with a good proposal and a couple of others could review it before implementation, then we could take this on.

With the way things are moving, I think it's going to be an increasing pain point that PPI can't really deal with this. If there's some company that could donate a developer or sponsor someone to do the work, that might get it moving. (That's not directed at any company in particular).

EvanCarroll commented 2 years ago

I would say that if someone can come up with a good proposal and a couple of others could review it before implementation, then we could take this on.

Is this a good proposal:

Example

sub foo :prototype(prototype context) { }
sub bar (signature context) { }
sub baz :prototype(prototype context) (signature context) { }
oalders commented 2 years ago

There will likely be a lot of cases where PPI needs to parse something, but it doesn't have the use statements, since it can parse fragments of code. I think that has historically been the biggest blocker.

EvanCarroll commented 2 years ago

If so, that sounds problematic since a pragma, especially features and experimental, can change the parsing semantics of anything. Perhaps that should be a bigger issue that we look at then.

As to the "fragments of code" I don't see this as being relevant. If the fragment includes the use v5.36 the parsing semantics are changed. So even if PPI parses fragments it would seem the fragment must include the use statement which changes semantics? That is to say, a fragment will parse differently depending on the document it's contained in, so I'm not sure how a fragment can be useful (unless it includes the environment).

oalders commented 2 years ago

I pass fragments of code to https://metacpan.org/dist/App-PPI-Dumper/view/script/ppi_dumper pretty regularly. If I want to see how PPI parses something without including hundreds of lines of code, I can do that with this tool. There's nothing to stop me from passing

sub foo ($one, $two) { ... }
PPI::Document
  PPI::Statement::Sub
    PPI::Token::Word    'sub'
    PPI::Token::Whitespace      ' '
    PPI::Token::Word    'foo'
    PPI::Token::Whitespace      ' '
    PPI::Token::Prototype       '($one, $two)'
    PPI::Token::Whitespace      ' '
    PPI::Structure::Block       { ... }
      PPI::Token::Whitespace    ' '
      PPI::Statement
        PPI::Token::Operator    '...'
      PPI::Token::Whitespace    ' '
  PPI::Token::Whitespace        '\n'

That's a valid use case.

Grinnz commented 2 years ago

PPI cannot depend on context to determine if the signatures feature is enabled. Its only possible job is to successfully parse them as a prototype-or-signature. It's up to perlcritic rules and other higher level code to have context based heuristics to make a further determination.

Grinnz commented 2 years ago

There is currently no foolproof method for determining the parsing state of dynamic Perl code with static parsing, which is a common issue with PPI based parsing. There's really nothing that can be done about that except come up with something smarter yet still static, which LeoNerd has had thoughts on.

Grinnz commented 2 years ago

There will likely be a lot of cases where PPI needs to parse something, but it doesn't have the use statements, since it can parse fragments of code. I think that has historically been the biggest blocker.

and more commonly, the signatures feature can be enabled by any use statement or BEGIN block. PPI can't possibly approach this problem.

EvanCarroll commented 2 years ago

and more commonly, the signatures feature can be enabled by any use statement or BEGIN block. PPI can't possibly approach this problem.

At this point, you may as well just argue against features, experimental, and Perl's versioning. In a different world, we'd not have any of this, and a new version of Perl like 5.32 would just break backwards compat with older perls, and the parsing semantic would obviously be such that it must be different between two versions of Perl.

Now we have a model where we don't have to break backwards compat with older perl. It seems like an awkward position to argue from that we should NOT be able to statically analyze Perl under newer semantics because we either,

  1. Allow the features to be turned on a different way.
  2. Allow the features to be turned off afterward (in the same package).

By any degree, this is another case where Perl in striving to meet the 0.0001% of use cases missing what the users want and moreover what they'd already expect to happen.

Grinnz commented 2 years ago

I'm not arguing against anything. This is a PPI-specific problem due to Perl's dynamic parser, and it can't be fixed by hoping.

oalders commented 2 years ago

By any degree, this is another case where Perl in striving to meet the 0.0001% of use cases missing what the users want and moreover what they'd already expect to happen.

Right, maybe there's a middle ground somewhere. For my use case (at $work), I can basically assume that any internal code is either a) using signatures or b) at least not using prototypes. I'd like for some way to let PPI know about this. I don't know if that's an ENV var or an argument to the constructor or both, but it would be nice to have the option of forcing PPI to DTRT, even if it doesn't have the full context of the code it's dealing with.

wchristian commented 2 years ago

@EvanCarroll PPI will not, ever, abandon the ability to successfully parse even partial, truncated, corrupted or faulty Perl Documents. The documentation explains why: https://metacpan.org/pod/PPI#DESCRIPTION

If what you want requires abandoning that ability, then you will need to use a different module. Maybe PPR?

Outside of that i am happy to consider proposals and pull requests for how to extend PPI functionality without breaking that ability. Olafs suggestions in the post above strike me as useful, and might even be worth implementing as experimental features.

On the other hand, as @Grinnz correctly indicates, by parsing a sig-or-prot into a relevant ambiguous object, code that actually processes the PPI tree can decide on its own how to treat it. As long as PPI communicates back that a document part could be a signature, the wrapping code can decide that it will indeed treat it as one.

While i'm thinking about it, i think the first useful step would be for PPI to TRY to parse any prototype also as a signature, and if successful, indicate in the prototype object that a signature parse is available.

EvanCarroll commented 2 years ago

even partial, truncated, corrupted or faulty Perl Documents

Respectfully, I don't understand what that means. To avoid this discussion which clearly is political, let's assume Perl 5.40 (or whatever) adds Corina. In doing so Perl will see role, class, method elevated to keywords, and a whole new parsing syntax for these keywords. This will almost certainly be hidden behind a feature flag. From a mere fragment of code from Perl 5.40, you will never know whether or not these keywords are provided by Corina or they are made by Depry McDerpFace in a seldom-used Object Constructor module from 1995.

You're again given a few options,

Again, I don't see how this is disputable. I know I'm talking to smart people here. But when feature flags change parsing semantics, abstractly it doesn't make sense to parse a fragment divorced from them.

To the extent that you can tinker with the underlying parsing semantics without using the feature flag, the simple answer there is don't support that. That's a much simpler answer then not supporting the language semantics as they're intended to be used.

Grinnz commented 2 years ago

This is not a political discussion. We are explaining PPI does not and cannot work with context. There is simply no possibility of acknowledging feature flags with PPI itself.

adamkennedy commented 2 years ago

If the presence of the barewords role, class and method don't make any real sense in a document OTHER than as the new keywords for most documents most of the time, I'd probably go down the assume they are Corina direction.

Although we never had use statements alter semantics, it's never been completely impossible to do so.

There's really only 3 things that are completely non-negotiable. You can't alter tokenization, so Acme::Bleach and Acme::Pony are out of the question, you can't alter nested structure construction, and you can't alter statement boundaries.

Theoretically, with sufficient work and willingness for some perf hits, from memory I always thought it was possible in principle to type the structures and statements once they had been assembled. But in this case, typing the following doesn't really work.

class {
   ...
}

That's why the lookalike modules always had to put a semi-colon at the end of the look-alike things, because they were really a function, a block parameter, and a statement terminator.

So it's not completely off limits, but I think in practice you are fucked.

elcamlost commented 2 years ago

Probably, PPI can accept additional options, like "signatures_enabled => 1"? perl-critic and other tools can work with context and pass context knowledge to PPI.

The main goal of this feature for me is enable critic policy that signatures are required. Not the creating universal abitlity to detect if signatures are enabled or not for any code snippet passed to PPI.

So maybe this problem can be solved for bounded context which makes sense for the majority of users.

wchristian commented 2 years ago

@elcamlost Please review what i said here: https://github.com/Perl-Critic/PPI/issues/194#issuecomment-1168044274 :) The more i think about it, honestly, i find that an option like signatures_enabled is at the wrong level of fine-grained-ness, and an object indicating parse ambiguity, leaving it to the consuming code to decide how to handle it, would be the correct decision.

@EvanCarroll I'll be blunt, while i haven't been able to put much time into PPI recently due to various reasons which ARE politically grounded (hello perl foundation), the limitations of PPI that i explained are not political at all, but raw and technical. As the current primary maintainer i have put many painful hours into considering the nature of Perl parsing and how to adjust to changing realities without breaking existing use cases. That said, yes, keywords are a valid question, but also a completely different beast from signatures.

I've created a ticket with a suggestion for that here: #273

wchristian commented 2 years ago

I'm closing this, as we now have a proof-of-concept: https://github.com/Perl-Critic/PPI/pull/280