rschupp / PAR-Packer

(perl) Generate stand-alone executables, perl scripts and PAR files https://metacpan.org/pod/PAR::Packer
Other
48 stars 13 forks source link

Filter strips __DATA__ section #82

Closed assysttest closed 8 months ago

assysttest commented 8 months ago

Hi, when the script or a module contains a DATA section, the content of DATA is lost after encryption/decryption by Filter::Crypto. It is impossible to use Win32::API::Type with this issue which is quite important for some other stuff. I use pp version 1.057 (PAR version 1.017), and Filter::Crypto 2.10 with Strawberry Perl 5.32.1 on a Windows 10 box.

here a sample 'crypttest.pl':

use strict;
use warnings;

print "Start printing DATA\n";
foreach (<DATA>) {
    print "$_";
}
close DATA;
print "Data ends\n";

exit 0;

__DATA__

Hello
World

Output by use of "pp -f Crypto -F Crypto --module Filter::Crypto::Decrypt crypttest.pl -o crypttest.exe":


D:\Devel>crypttest.exe
Start printing DATA
Data ends

Output by use of "pp crypttest.pl -o crypttest.exe":

D:\Devel>crypttest.exe
Start printing DATA

Hello
World

Data ends
rschupp commented 8 months ago

That is a known limitation, see the note in the docs for pp

Note: Most filters are incompatible with __DATA__ sections in your source.

You can try to use --modfilter Crypto=REGEX with a REGEX that matches the modules names (i.e. Foo/Bar.pm for Foo::Bar) whose source you want to hide, but doesn't match Win32/API/Type.pm

assysttest commented 8 months ago

Hi Roderich, To build an expression to match all but some modules using DATA looks quite complicated. On the other hand: I suppose the normal use of the filter is to hide the sources of own modules. It is not needed to encrypt modules from CPAN. The critical stuff is normally stored in other directories than the CPAN modules. I had a look in the code of PAR::Packer. There is a function __generate_filter to create a filter function "$filtersub". filtersub is called with full path ($ref) and module name. What do you think about extension of conditions in a form:

The hashes in filterlist should be extended by a type. If you think that it could be helpfully give me a message, I would implement it. Bye Torsten

assysttest commented 8 months ago

Closed by mistake.

rschupp commented 8 months ago

To build an expression to match all but some modules using DATA looks quite complicated. On the other hand: I suppose the normal use of the filter is to hide the sources of own modules. It is not needed to encrypt modules from CPAN.

The regular expression is matched against module names as they would appear as keys in %INC, e.g. Foo/Bar.pm for module Foo::Bar [1]. In your case it would have sufficed to exclude all Win32::API:: modules with --modfilter Crypto=^(?!Win32/API/)

[1] The documentation of pp is a bit vague on this, I've added some notes.

assysttest commented 8 months ago

Hi, thanks a lot, this'll solve the problem for Win32::API::Type. I had a look, in vendor/lib of Strawberry Perl I see 17 files with a DATA section, some important modules included. What ever I do it is not really smart and I touch the limits of command line length in Windows. For me it looks better to have a file with a list of expressions for modules to apply the filter and for modules to skip filtering. The result is attached. You may simple replace the _generate_filter function by this code. It is supporting the existing syntax rules and extends the --modfilter option to read filter and deny conditions from a file. Of course you may include it in PAR::Packer


sub _read_filterlist ($$$$$) {
    local $_;
    my ($file,$filter,$filters,$deny,$applied) = @_;
    # read apply rules for a given filter name from file.
    # each line consist of rule and pattern. Supported rules:
    # '=' -> module file match the pattern
    # ':' -> module files must exact match (string eq)
    # '~' -> match with the evaluated pattern
    # '!=', '!:', '!~' -> don't apply the filter for files matching the rules.
    # Examples see below.

    my $fh;
    die "Error opening filter spec '$file' for reading: $^E" unless open $fh,$file;
    my $content = [<$fh>];
    close $fh;
    my $i = 0;
    my $filter_objects = {};

    LINE: foreach my $line (@$content) {
        $i++;
        chomp $line;
        $line =~ s/^\s*//;
        $line =~ s/\s*$//;
        next LINE unless $line =~ /\S/;
        if ($line =~ /^(!*[=:~])(.+)$/) {
            my $type = $1;
            my $pattern = $2;
            my $regex;
            if ($type =~ /=/) {
                $regex = qr/$pattern/;
            } elsif ($type =~ /:/) {
                $regex = $pattern;
            } elsif ($type =~ /~/) {
                $regex = eval $pattern;
                if ($@) {
                    print STDERR "Syntax error in filter spec file '$file', line $i: Error evaluating expression '$pattern': $@\n";
                    next LINE;
                } 
            } else {
                print STDERR "Syntax error in filter spec file '$file', line $i: Unknown type '$type'\n";
                next LINE;
            }

            if ($type =~ /^!(.)$/) {
                $type = $1; 
                push @{$deny->{$filter}}, {
                    regex => ($type =~ /=/) ? qr/$regex/ : $regex,
                    type => $type
                };      
            } else {
                my $filter_object = $filter_objects->{$filter};
                unless (defined $filter_object) {
                    $filter_object = $filter_objects->{$filter} = PAR::Filter->new($filter);
                }
                unless (defined $applied->{$filter}) {
                    $applied->{$filter} = {};
                }
                push @$filters, {
                    regex => ($type =~ /=/) ? qr/$regex/ : $regex,
                    type => $type,
                    filter => $filter_object,
                    filter_name => $filter,
                    deny => $deny->{$filter},
                    applied => $applied->{$filter}
                };
            }
        } else {
            print STDERR "Syntax error in filter spec file '$file', line $i: $_ is not matching qr/^\\s*!*[=:~]/\n";
        }
    }

    return $filters;    

}

sub _generate_filter {
    my $opt = shift; # options hash
    my $key = shift; # F or f? modules or script?

    my $verbatim = ($ENV{PAR_VERBATIM} || 0);

    # list of deny conditions for filters
    my $deny = {
        PatchContent => [],
    }; 

    # store module names to avoid applying of filter multiple times
    # This is aperformance issue (lots of ':' filters) and avoids mistakes.
    my $applied = {
        PatchContent => {},
    }; 

    # List of filters. If the regex is undefined or matches the
    # file name (e.g. Foo/Bar.pm), apply filter to this module.
    my $filters = [
        { 
            regex => undef, 
            filter => PAR::Filter->new('PatchContent'),
            deny => $deny->{PatchContent},
            applied => $applied->{PatchContent}
        },
    ];

    # Example: Use 'Crypto' for all but Win32::API::Type because it contains a __DATA__ section:
    #
    # $deny = {
    #       Crypto => [
    #           {
    #               type => '~', 
    #               regexp => (?^i:Win32/API/Type.pm)  # result of eval qr#Win32/API/Type.pm#i
    #           }
    #       ]
    # } 
    #
    # $filters = [
    #       {
    #           regexp =>  (?^i:\.pm$),
    #           filter =>  PAR::Filter->new('Crypt'),
    #           type => '~',
    #           deny => $deny->{Crypto}
    #           applied => $applied->{Crypto}
    #       }
    # ]
    #
    # The option to get the structure above:
    #
    # --modfilter Crypto:my_file
    #
    # The content of my_file:
    #
    # ~qr/\.pm$/i
    # !~qr#Win32/API/Type.pm#i
    #
    # other possibility for the same effect, case sensitive:
    # 
    # =\.pm$
    # !:Win32/API/Type.pm

    foreach my $option (@{ $opt->{$key} }) {
        my ($filter, $type, $exp);                          
        if ($option =~ /^([^=:]+)([=:])(.*)$/ ) {
            $filter = $1;
            $type = $2;
            $exp = $3;
        } else {
            $filter = $option;
        }

        $deny->{$filter} = [] unless defined $deny->{$filter};

        if (!defined($type) || defined($type) && $type eq '=') {
            $applied->{$filter} = {} unless defined $applied->{$filter};
            push @$filters, {
                regex => (defined $exp ? qr/$exp/ : undef),
                filter => PAR::Filter->new($filter),
                type => '=',
                deny => $deny->{$filter},
                applied => $applied->{$filter},
            };
        } elsif (defined($type) && defined($exp)) {
            _read_filterlist($exp,$filter,$filters,$deny,$applied);
        }
    }

    my $filtersub = sub {
        local $_;
        my $ref = shift;
        my $name = shift;
        my $filtered = 0;
        FILTERSPEC: foreach my $filterspec (@$filters) {

            if ($filterspec->{applied}->{$name}) {
                # Filter is already applied. Take the next.
                next FILTERSPEC;
            }

            my $type = $filterspec->{type};

            my $test = {
                '=' => sub { return $name =~ /$_[0]/ },
                ':' => sub { return $name eq $_[0] },
                '~' => sub { return $name =~ /$_[0]/ },
            };

            my $denied = 0;
            DENY: foreach my $denyspec (@{$filterspec->{deny}}) {
                $denied |= &{$test->{$denyspec->{type}}}($denyspec->{regex});
                # print "Name: $name, Denied: $denied, Reg: $denyspec->{regex}\n";
                last DENY if $denied;
            }

            if ( !$denied 
                    and (not defined($filterspec->{regex}) or &{$test->{$type}}($filterspec->{regex}))
            ) {
                $filtered++;
                $filterspec->{applied}->{$name} = 1;
                $ref = $filterspec->{filter}->apply($ref, $name); 
            }
        }

        # PodStrip by default, overridden by -F or $ENV{PAR_VERBATIM}
        if (!$applied->{PodStrip}->{$name} and $filtered == 1 and not $verbatim) {
            $ref = PAR::Filter::PodStrip->apply($ref, '');
            $applied->{PodStrip}->{$name} = 1;
        }
        return $ref;
    };

    return $filtersub;
}
assysttest commented 7 months ago

Hi Roderich, it would be nice when you make a comment to the proposal, it allowes to specify the modules to filter quite better, lists included and independent of command line length. When you like to use it in PAR::Packer I can adjust my builds. I don't like to use a patched version of PAR::Packer, it would make updates difficult. In this case I use a long regular expression to avoid filtering of modules with DATA section.

rschupp commented 7 months ago

I had a look, in vendor/lib of Strawberry Perl I see 17 files with a DATA section, some important modules included. What ever I do it is not really smart and I touch the limits of command line length in Windows.

You can stuff pp arguments into one or more files using @filename...

Another solution is to create a new filter module derived from PAR::Filter::Crypto that leaves modules that contain __DATA__ unmolested:

package PAR::Filter::MyCrypto;
use parent PAR::Filter::Crypto;

sub apply 
{
    my ($self, $ref, $name) = @_;
    return if $$ref =~ /^__DATA__\r?\n/m;
    $self->SUPER::apply($ref, $name);
}

1;

and then use this with pp -f Crypto -F MyCrypto --module Filter::Crypto::Decrypt ....

As for your patch: thanks, but no thanks. PAR::Packer already has too many options and contains too many hacks. And I definitely don't like your attitude "I don't like to use a patched version of PAR::Packer, it would make updates difficult.", i.e. avoiding work by dumping it on the maintainer.

assysttest commented 7 months ago

Hi Roderich, I think it is a misunderstanding regarding your effort to maintain the "hack". When I make such a proposal I can fix it for the future too. It is quite clear that a fork'll generate more future work than a clean solution. For me it looks like an advatage to specify filters in a file, not only for myself. Anyways, I'll follow yor proposal and write an own filter. Thank you, Bye