mauke / Function-Parameters

Function::Parameters - define functions and methods with parameter lists ("subroutine signatures")
https://metacpan.org/pod/Function::Parameters
18 stars 19 forks source link

Hints hash should not store references #37

Closed atoomic closed 1 year ago

atoomic commented 5 years ago

This was discussed already on RT:

But I think it worth giving this some visibility there. It's a bad idea to store references in the Hash Hint a.k.a. %^H

A quick grep show that both the .pm and .xs code need some fixup

Changes:32:          - rewrite pragma implementation and the way %^H is used
Parameters.xs:2172:            croak("%s: internal error: $^H{'%s'} not a hashref: %"SVf, MY_PKG, HINTK_CONFIG, SVfARG(sv));
Parameters.xs:2182:            croak("%s: internal error: $^H{'%s'}{'%.*s'} not a hashref: %"SVf, MY_PKG, HINTK_CONFIG, (int)kw_len, kw_ptr, SVfARG(sv));
Parameters.xs:2205:        croak("%s: internal error: $^H{'%s'}{'%.*s'}{'%s'} not set", MY_PKG, HINTK_CONFIG, (int)kw_len, kw_ptr, HINTSK_ ## NAME); \
Parameters.xs:2218:            croak("%s: internal error: $^H{'%s'}{'%.*s'}{'%s'} not a coderef: %"SVf, MY_PKG, HINTK_CONFIG, (int)kw_len, kw_ptr, HINTSK_REIFY, SVfARG(sv));
Parameters.xs:2234:                    croak("%s: internal error: $^H{'%s'}{'%.*s'}{'%s'}: expected '$', found '%.*s'", MY_PKG, HINTK_CONFIG, (int)kw_len, kw_ptr, HINTSK_SHIFT, (int)(sv_p_end - p), p);
Parameters.xs:2238:                    croak("%s: internal error: $^H{'%s'}{'%.*s'}{'%s'}: expected idfirst, found '%.*s'", MY_PKG, HINTK_CONFIG, (int)kw_len, kw_ptr, HINTSK_SHIFT, (int)(sv_p_end - p), p);
Parameters.xs:2246:                    croak("%s: internal error: $^H{'%s'}{'%.*s'}{'%s'}: can't use global $_ as a parameter", MY_PKG, HINTK_CONFIG, (int)kw_len, kw_ptr, HINTSK_SHIFT);
Parameters.xs:2252:                            croak("%s: internal error: $^H{'%s'}{'%.*s'}{'%s'}: %"SVf" can't appear twice", MY_PKG, HINTK_CONFIG, (int)kw_len, kw_ptr, HINTSK_SHIFT, SVfARG((*ppspec)->shift.data[i].name));
Parameters.xs:2269:                            croak("%s: internal error: $^H{'%s'}{'%.*s'}{'%s'} not an arrayref: %"SVf, MY_PKG, HINTK_CONFIG, (int)kw_len, kw_ptr, HINTSK_SHIF2, SVfARG(sv));
Parameters.xs:2274:                        croak("%s: internal error: $^H{'%s'}{'%.*s'}{'%s'}: tix [%ld] out of range [%ld]", MY_PKG, HINTK_CONFIG, (int)kw_len, kw_ptr, HINTSK_SHIFT, (long)tix, (long)(av_len(shift_types) + 1));
Parameters.xs:2278:                        croak("%s: internal error: $^H{'%s'}{'%.*s'}{'%s'}: tix [%ld] doesn't exist", MY_PKG, HINTK_CONFIG, (int)kw_len, kw_ptr, HINTSK_SHIFT, (long)tix);
Parameters.xs:2282:                        croak("%s: internal error: $^H{'%s'}{'%.*s'}{'%s'}: tix [%ld] is not an object (%"SVf")", MY_PKG, HINTK_CONFIG, (int)kw_len, kw_ptr, HINTSK_SHIFT, (long)tix, SVfARG(type));
Parameters.xs:2289:                        croak("%s: internal error: $^H{'%s'}{'%.*s'}{'%s'}: expected ' ', found '%.*s'", MY_PKG, HINTK_CONFIG, (int)kw_len, kw_ptr, HINTSK_SHIFT, (int)(sv_p_end - p), p);
lib/Function/Parameters.pm:351:    my %config = %{$^H{+HINTK_CONFIG} // {}};
lib/Function/Parameters.pm:378:    $^H{+HINTK_CONFIG} = \%config;
lib/Function/Parameters.pm:385:        delete $^H{+HINTK_CONFIG};
lib/Function/Parameters.pm:389:    my %config = %{$^H{+HINTK_CONFIG}};
lib/Function/Parameters.pm:391:    $^H{+HINTK_CONFIG} = \%config;

It appears that this is the only CPAN module to store HINTK_CONFIG there so this should be safe, to move it to a better location.

https://grep.metacpan.org/search?size=20&q=HINTK_CONFIG&qd=&qft=

Maybe a GV in Function::Parameters itself?

mauke commented 1 year ago

The problem can be reproduced without Function::Parameters or Regexp::Grammars:

#!/usr/bin/env perl
use strict;
use warnings;

use re 'eval';
use overload ();
BEGIN {
    overload::constant qr => sub { 
        qq{(?{ "\\N{EURO SIGN}" })}
    };
}

use charnames qw(:full);
qr/a/;
__END__

Result:

Undefined subroutine &main::CODE(0x564e9441abe0) called at (eval 5) line 1.

I cribbed the use of %^H from charnames.pm. If Function::Parameters is broken, then so is the charnames pragma. (I still need to understand WTF is going on in this code, though.)

PS: HINTK_CONFIG is an internal constant; you'd have to search for Function::Parameters/config to find real users: https://grep.metacpan.org/search?q=Function%3A%3AParameters%2Fconfig&qd=&qft=

mauke commented 1 year ago

I've reported it as https://github.com/Perl/perl5/issues/20950 for now.

demerphq commented 1 year ago

I dont really get what you want to do here.

BEGIN {
    overload::constant qr => sub { 
        qq{(?{ "\\N{EURO SIGN}" })}
    };
}

Is it deliberate you are using (?{ ... }) and not (??{ ... }) ?

I dont really get what you are trying to do here. You want to change all qr//'s to return a deferred pattern match the euro-sign?

It feels like there are multiple ways to do what you want to do that would work.

mauke commented 1 year ago

@demerphq As mentioned in https://rt.cpan.org/Ticket/Display.html?id=128044, what the user was trying to do was:

use Function::Parameters;
use Regexp::Grammars;
qr{ <grammar:Sympa::Scenario> }x;

Which results in the this error:

Function::Parameters: internal error: $^H{'Function::Parameters/config'} not a hashref: HASH(0x56276feec8c0) at (eval 7) line 1.

As you can see on https://metacpan.org/release/DCONWAY/Regexp-Grammars-1.058/source/lib/Regexp/Grammars.pm, the Regexp::Grammars module is 2700 lines of code, so naturally I tried to reduce the code to something simpler first.

Regexp::Grammars does many different things. Among them:

The code for the latter looks like

        return qq{(?{
            warn "Can't match directly against a pure grammar: <grammar: $grammar_name>\n";
        })(*COMMIT)(?!)};

So at this point, we have an overloaded qr handler that returns an object that has an overloaded "" (stringification) handler that returns a regex fragment that, when executed, dynamically calls warn and always fails to match.

This is what triggers the error inside Function::Parameters. However, despite what Damian claims, you can put references in %^H. In fact, this is documented as part of the public interfaces of the core charnames pragma:

The mechanism of translation of \N{...} escapes is general and not hardwired into charnames.pm. A module can install custom translations (inside the scope which uses the module) with the following magic incantation:

sub import {
    shift;
    $^H{charnames} = \&translator;
}

Function::Parameters uses a similar mechanism. Thus, as a general rule, any code that breaks because Function::Parameters puts references in %^H also breaks when charnames is used.

That's why my reduced code uses charnames instead of Function::Parameters. All other elements were taken from Regexp::Grammars. I later realized that overloading is not necessary to reproduce the issue:

$ perl -e 'use re "eval"; use charnames ":full"; my $x = qq<(?{ "\\N{COMMA}" })>; qr/$x/'
Undefined subroutine &main::CODE(0x56439fd0a6d0) called at (eval 5) line 1.

All you need is dynamically eval'd regex code that tries to read what should be a reference from %^H. This can be the charnames handler (triggered by \N{...} in double-quotish context) or Function::Parameter's config settings for the current lexical scope (triggered by its keyword plugin and any bareword in the code, such as warn ).

mauke commented 1 year ago

I've put a workaround into Function::Parameters. If the %^H entry is found not to be a reference, the former error is now downgraded to a warning (which can be disabled with no warnings 'Function::Parameters') and Function::Parameters disables itself in that scope (but could be re-enabled with an explicit use Function::Parameters).

BTW, the reason why this error even happens for qr{ <grammar:Sympa::Scenario> }x; (which contains no keywords) is a bit convoluted: