Perl / PPCs

This repository is for Requests For Comments - proposals to change the Perl language.
61 stars 22 forks source link

Lexically Require Module #36

Open Ovid opened 1 year ago

Ovid commented 1 year ago

Perl a P5P discussion, this PPC proposes a syntax for lexically requiring modules to avoid fragile transitive dependencies.

Grinnz commented 1 year ago

I find this cognitively difficult, at the implementation and logical levels. I apologize if I'm misunderstanding things or this has been covered in the discussion.

If it's meant to affect only the keyword parser (which could actually be done lexically) then it would not do much of anything because it would not be able to affect code outside of its scope which would just treat Some::Module as a string and defer the lookup to runtime.

If it's meant to affect it at runtime then the effect would be dynamic and not lexical, and this seems incredibly complicated (tearing down the whole package namespace?)

From the user point of view, it seems confusing as it's called "lexically require" yet requiring a module runs arbitrary import methods (and the whole body of the module) which almost always have non-lexical effects.

demerphq commented 1 year ago

If it's meant to affect it at runtime then the effect would be dynamic and not lexical, and this seems incredibly complicated (tearing down the whole package namespace?)

Speaking abstractly and independently from the actual text of ovids proposal I don't see why it would that difficult to maintain a hash of permission data that models "from this package you are allowed to access that package". Every time we did a method lookup we would check the current package, and see if the permission hash allows access to the other package, if it didnt we would throw an exception. Obviously it would slow down such calls, maybe significantly, but perhaps it could be implemented in a way that one need not pay this price in production code.

Grinnz commented 1 year ago

If it's meant to affect it at runtime then the effect would be dynamic and not lexical, and this seems incredibly complicated (tearing down the whole package namespace?)

Speaking abstractly and independently from the actual text of ovids proposal I don't see why it would that difficult to maintain a hash of permission data that models "from this package you are allowed to access that package". Every time we did a method lookup we would check the current package, and see if the permission hash allows access to the other package, if it didnt we would throw an exception.

Performance concerns aside, the fundamental problem here is that the design of this feature is to change behavior everywhere except in its scope, i.e. all existing code would no longer be able to call methods by default. The only way I could see that working is if there was another feature that applied such a restriction in that scope.

demerphq commented 1 year ago

change behavior everywhere except in its scope

Do you mean what I said, or what Ovid wrote? I have only skimmed Ovids text so maybe I missed something. As far as what i wrote I would implement it in such a way that it would only apply if the current package was registered in the top level of the hash. Something like this code:

if (my $allowed = $namespace_permission{$from_pack}) {
  unless ($allowed->{$to_pack}) { 
    die "You cannot call a method in $to_pack from $from_pack without requiring $to_pack explicitly" 
  }
}

So when the user enabled the feature in a given package we would add a subhash under the key $from_pack. When they required a given package we would basically do $namespace_permission{$from_pack}{$to_pack}++.

Am I misunderstanding your point?

Grinnz commented 1 year ago

I'm referring to the feature proposed, which only involves a feature declaration in the scope where the module is required.

demerphq commented 1 year ago

I'm referring to the feature proposed, which only involves a feature declaration in the scope where the module is required.

Which as far as I understand it could be covered by the code I described, and would not impact other namespaces.

demerphq commented 1 year ago

@Grinnz I think i get where you are coming but I think that @Ovid was a touch ambiguous in his wording and worded it such that it might be interpreted as meaning the effect would be broader than i think he intended. If I have package X, which requires Y, and packages Z and W which do not require Y, and Z does not use this feature then it should be allowed to access methods in Y via the Y classname even in W did use this feature and was forbidden from accessing Y because it had not been explicitly required. IOW, the only package affected by this feature should be the package it is used within.

FWIW, I could almost see us saying "this shouldn't be lexically controlled like most features, and we should actually introduce a new keyword like 'namespace' which functioned like package, but had more restrictive semantics. EG, i think it would be awkward if

package Z;
Y->foo(); # legal
use feature `lexical_require`;
Y->foo(); # throws exception

If we had a 'namespace' keyword then it would be clear that this feature affected the entirety of code compiled within the Z namespace. (We might forbid the user of package Z if someone had previously used namespace Z).

wchristian commented 1 year ago

Code outside of that scope cannot use the required module unless it explicitly uses it

I suspect a lot of people would interpret this as follows:

use strict;
use warnings;
package Foo { use feature 'lexical_require'; use Meep 'marp'; marp() }
print $Meep::bar;
Meep::foo();
__END__
Name "Meep::bar" used only once: possible typo at -e line 1.
Use of uninitialized value $Meep::bar in print at -e line 1.
Undefined subroutine &Meep::foo called at -e line 1.

Besides documenting the stance on that (whatever it be), the PPC should explain why the stance is decided as it is.

tonycoz commented 1 year ago

One issue here is that Name->foo() can resolve Name as a file handle, and that is resolved at runtime, so except in the case of no feature "bareword_filehandles" (assuming https://github.com/Perl/perl5/issues/19426 receives a review) any bareword name could be resolved as a handle, making compile-time checking difficult.

The examples all use barewords, what happens with:

my $class = "SomeClass";
$class->somemethod();

if SomeClass isn't visible within the scope.

Of course, $class may have been passed in from elsewhere that has done use SomeClass;.

shadowcat-mst commented 1 year ago

To expand on the case @tonycoz mentions, an exported sub can also be invoked - things like MooseX::Types and Type::Tiny have to provide a certain amount of cleverness to make a class type for DateTime work since e.g. DateTime->new still needs to work when DateTime is a constant sub export that returns a type object.

Note that this extends to colon separated names as well since Foo::Bar->new will call a Foo::Bar() subroutine if one is present, although I don't recall that case being something I've -yet- had to take into account.

shadowcat-mst commented 1 year ago

There's an elephant in the room here - %INC

Testing the contents of %INC is used for more than one purpose, notably:

1) Checking to see if a module is already available (used by e.g. on-demand require() code)

2) Checking to see if a module is being used at all (used by e.g. App::FatPacker and JSON::MaybeXS)

If %INC outside of the lexical-require-using code shows the module then we'll end up thinking a package is available when it isn't and break case 1

If %INC outside of the code -doesn't- show the module then we break case 2

There's also things like type constraints (and other utility code such as is_class_loaded from Class::Load) that test to see if a class name is loaded/valid on behalf of their caller, I can't remember the actual name already on cpan but think

has foo => (is => 'ro', required => 1, isa => AlreadyLoadedClassName);

(and note that variations on this theme will also test $name->can('can') or ->can('isa') so it's not just %INC that we need to worry about for that one)

The fun part here is that for that to work you'd want the loadedness visible dynamically at the very least ... but of course having it dynamically visible in code that invokes user-supplied callbacks would then see the dependency as loaded and maybe accidentally work in the way this PPC is trying to avoid happening, or maybe result in something -thinking- a module is available and recording that in a variable somewhere but then when a different piece of code tries to use that information the program explodes.

demerphq commented 1 year ago

NOTE all of the following comments are written assuming that @Ovid will change the PPC so that this feature only affects code WITHIN the scope it is used. IMO having a feature affect code outside of its scope is a complete non-starter, and we really don't need to discuss all the myriad reasons why. ("Action at a distance is bad" seems a sufficient reason.)

@tonycoz wrote:

One issue here is that Name->foo() can resolve Name as a file handle,

I think this might be one of those "well don't do that then". IOW, under this feature we would assume that any bareword was a class name, even if it wasn't necessarily. Sure that might mean that the code in question couldn't use bareword filehandles, and it couldn't use functions that look like barewords, but IMO who cares? The feature would be opt-in, so if someone wanted to use constructs like that then they would just not use the feature.

@shadowcat-mst wrote:

DateTime->new ... this extends to colon separated names as well since Foo::Bar->new will call a Foo::Bar() subroutine if one is present

Same thing for these scenarios. When using this feature DateTime would be assumed to be a bareword, same thing for Foo::Bar. If you wanted to access DateTime() or Foo::Bar() you would be expected to write the parens. Again, this is an opt-in feature, so it is not like it is going to break existing code unless a developer explicitly asks for the feature.

@shadowcat-mst wrote:

If %INC outside of the lexical-require-using code shows the module then we'll end up thinking a package is available when it isn't and break case 1

Same thing here. The feature is opt-in, if code is messing about with %INC then it simply wouldn't use the feature.

hvds commented 1 year ago

Ovid @.***> wrote: :Perl a P5P discussion, this PPC proposes a syntax for lexically requiring modules to avoid fragile transitive dependencies. :You can view, comment on, or merge this pull request online at: : : https://github.com/Perl/PPCs/pull/36

Without naming it as such, this appears to introduce a new concept of declaring or otherwise registering an interest in a class. While it risks verging on implementation detail, I'd like to understand what statements (or ops) would make a class visible in a given scope.

Currently, for example I can write:

% perl -lE 'sub Foo::bar { print "bar" } Foo->bar; Foo::bar()' bar bar %

Would this break if I had loaded a module that lexically required Foo? What actions other than 'use Foo' make Foo available?

Hugo

Ovid commented 1 year ago

If someone thinks this is interesting and wants to write a better PPC, please go for it. My wallet just got stolen and I have a ton of paperwork to go through. French bureaucracy is performance art. Fortunately, I still have my passport this time.

Edit: In other words, I might not get back to this for a while.

demerphq commented 1 year ago

@ovid sorry to hear that. Ill edit the text as I suggested and you agreed with.

tonycoz commented 1 year ago

To expand on the case @tonycoz mentions, an exported sub can also be invoked - things like MooseX::Types and Type::Tiny have to provide a certain amount of cleverness to make a class type for DateTime work since e.g. DateTime->new still needs to work when DateTime is a constant sub export that returns a type object.

I don't think this is a problem, since that resolution to a sub name is already done at compile-time, depending on the visibility of the name, compare:

$ ./perl -Ilib -E 'sub Foo() { "Test" } package Foo { sub bar { say "InFoo" } } package Test { sub bar { say "InTest" } } Foo->bar()'
InTest
$ ./perl -Ilib -E 'package Foo { sub bar { say "InFoo" } } package Test { sub bar { say "InTest" } } Foo->bar(); sub Foo { "Test" }'
InFoo

So this type of change could be compatible with the current compile-time selection of sub-name vs bareword-as-string, any checks for an imported class name could be done at compile-time after

shadowcat-mst commented 1 year ago

@demerphq the %INC question applies to anything -using- code that uses this feature, so "oh, just don't do that" would make it impossible to use the feature in a CPAN module.

demerphq commented 1 year ago

@shadowcat-mst please explain the problem you are thinking of in more detail. Assuming this is implemented the way I expect it should be I dont see how it could "make it impossible to use the feature in a CPAN module." Yes that could happen if ovids "action at a distance" model was applied, but IMO that is a totally non-starter anyway. As long as this behavior is restricted to code that uses the feature there should be no action at a distance and no affect on code that does not use the feature.

shadowcat-mst commented 1 year ago

@tonycoz the * prototype (and moral equivalents in built-ins) to take a symbol name (which is how bareword filehandles work AFAICR) also involve compile-time triggering, hence my considering the cases to be pretty similar.

I do agree that it's no longer a problem if you're careful to write the implementation in a way that avoids said problem - my goal was to raise that that would be required.

shadowcat-mst commented 1 year ago

@demerphq Since you directly quoted "If %INC outside of the lexical-require-using code" I mistakenly presumed you'd noticed my use of the word "outside" and were responding based on the model that affects things outside.

Obviously, if we go a route that -doesn't- affect anything outside then it isn't going to be an issue to code outside, but if that's what you meant then I'm not sure what you were trying to say.

wchristian commented 1 year ago

Quick question here, since i can't yet go by changed PPC text: When you rewrite the text, @demerphq, will you include the capability to warn/die on My::UnRequired::->meep but without requiring the ::?