Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/
Other
1.91k stars 542 forks source link

EU::ParseXS: ExtUtils/typemap overrides typemap provided by user #19294

Open sisyphus opened 2 years ago

sisyphus commented 2 years ago

Module: ExtUtils::ParseXS

Description

The assertion made in the title is true only when the typemap provided by the user is NOT found by default, but relies on being found via the TYPEMAPS setting in WriteMakefile(). If the typemap provided by the user is found by default, then it overrides ExtUtils/typemap, as expected.

For the purposes of experimentation, the safe and reliable method of ensuring that the typemap is NOT found by default, is to name it anything other than "typemap". It can then be placed anywhere you like, and the only way to have it recognized as a typemap is to use the TYPEMAPS option I just mentioned above. If that user provided typemap (let's call it "my_typemap") is being provided to override a setting specified in ExtUtils/typemap, then you'll be sorely disappointed and left scratching your head as to why that does not happen. You'll examine the compilation output and see that, as expected, there's a -typemap switch specifying the ExtUtils/typemap, followed a by a second -typemap switch specifying "my_typemap" - which suggests that "my_typemap" will override any conflicting setting in ExtUtils/typemap, as specified in the documentation..

What you don't see is that ExtUtils::ParseXS::Utilities::process_typemaps() then secretly appends the list of all defaults to the list of typemap files (which includes "my_typemap") that were already found. As that appendage includes ExtUtils/typemap, the rules of precedence determine that the settings specified in ExtUtils/typemap then override the settings specified in "my_typemap".

My suggested patch to restore DWIMmery is fairly simple:

--- Utilities.pm_orig   2021-12-21 16:02:22 +1100
+++ Utilities.pm        2021-12-22 14:58:53 +1100
@@ -272,7 +272,9 @@
     die "Can't find $typemap in $pwd\n" unless -r $typemap;
   }

-  push @tm, standard_typemap_locations( \@INC );
+  my @s = standard_typemap_locations( \@INC );
+  shift @s if $s[0] eq $tm[0];
+  push @tm, @s;

   require ExtUtils::Typemaps;
   my $typemap = ExtUtils::Typemaps->new;

AFAICS, in practice, the condition $s[0] eq $tm[0] is always true, with both $s[0] and $tm[0] identifying ExtUtils/typemap. But test 1 of t/600-t-compat.t manages to construct a case where it's NOT true, and that test then fails if we unconditionally shift @s

Steps to Reproduce

As a demo, consider the following typemap (named "typemap") which seeks to alter the typing of unsigned char * from T_PV (as specified by ExtUtils/typemap) to T_PTR :

unsigned char *     T_PTR

and, in the same directory as that typemap file, place the following Inline::C script:

use strict;
use warnings;

use Inline C => Config =>
#  TYPEMAPS =>  ['./my_typemap'],
  BUILD_NOISY => 1, # verbose build
  FORCE_BUILD => 1, # re-build whenever the script is run
  CLEAN_AFTER_BUILD => 0, # don't clean up the build directory
;

use Inline C =><<'EOC';

void foo(unsigned char *name) {
  if(name) printf("Ignoring my typemap\n");
  else printf("Honouring my typemap\n");
}

EOC

foo(undef);

Then cd to the directory that contains that script, run it, and after it has gone through the build process, it will output "Honouring my typemap". In this instance, the typemap has (obviously) been found by default, so everything worked as intended.

Then rename "typemap" to "my_typemap" and comment in the TYPEMAPS line in the Inline::C script.

Run it again, and the final output changes to "Ignoring my typemap". Apply the patch to Utilities.pm, run the script again, and see the output return to "Honouring my typemap".

AFAICT, this is not some quirk of the way that Inline::C generates the Makefile.PL and Makefile. Those files look fine to me, and totally in keeping with the Makefile.PL and Makefile that would be found in an XS module build.

Perl configuration

AFAIK, any perl build later than perl-5.8.8 (possibly earlier) will be subject to this.

jkeenan commented 2 years ago

Module: ExtUtils::ParseXS

Description

The assertion made in the title is true only when the typemap provided by the user is NOT found by default, but relies on being found via the TYPEMAPS setting in WriteMakefile(). If the typemap provided by the user is found by default, then it overrides ExtUtils/typemap, as expected.

For the purposes of experimentation, the safe and reliable method of ensuring that the typemap is NOT found by default, is to name it anything other than "typemap". It can then be placed anywhere you like, and the only way to have it recognized as a typemap is to use the TYPEMAPS option I just mentioned above. If that user provided typemap (let's call it "my_typemap") is being provided to override a setting specified in ExtUtils/typemap, then you'll be sorely disappointed and left scratching your head as to why that does not happen. You'll examine the compilation output and see that, as expected, there's a -typemap switch specifying the ExtUtils/typemap, followed a by a second -typemap switch specifying "my_typemap" - which suggests that "my_typemap" will override any conflicting setting in ExtUtils/typemap, as specified in the documentation..

What you don't see is that ExtUtils::ParseXS::Utilities::process_typemaps() then secretly appends the list of all defaults to the list of typemap files (which includes "my_typemap") that were already found. As that appendage includes ExtUtils/typemap, the rules of precedence determine that the settings specified in ExtUtils/typemap then override the settings specified in "my_typemap".

My suggested patch to restore DWIMmery is fairly simple:

--- Utilities.pm_orig   2021-12-21 16:02:22 +1100
+++ Utilities.pm        2021-12-22 14:58:53 +1100
@@ -272,7 +272,9 @@
     die "Can't find $typemap in $pwd\n" unless -r $typemap;
   }

-  push @tm, standard_typemap_locations( \@INC );
+  my @s = standard_typemap_locations( \@INC );
+  shift @s if $s[0] eq $tm[0];
+  push @tm, @s;

   require ExtUtils::Typemaps;
   my $typemap = ExtUtils::Typemaps->new;

AFAICS, in practice, the condition $s[0] eq $tm[0] is always true, with both $s[0] and $tm[0] identifying ExtUtils/typemap. But test 1 of t/600-t-compat.t manages to construct a case where it's NOT true, and that test then fails if we unconditionally shift @s

Steps to Reproduce

As a demo, consider the following typemap (named "typemap") which seeks to alter the typing of unsigned char * from T_PV (as specified by ExtUtils/typemap) to T_PTR :

unsigned char *   T_PTR

and, in the same directory as that typemap file, place the following Inline::C script:

use strict;
use warnings;

use Inline C => Config =>
#  TYPEMAPS =>  ['./my_typemap'],
  BUILD_NOISY => 1, # verbose build
  FORCE_BUILD => 1, # re-build whenever the script is run
  CLEAN_AFTER_BUILD => 0, # don't clean up the build directory
;

use Inline C =><<'EOC';

void foo(unsigned char *name) {
  if(name) printf("Ignoring my typemap\n");
  else printf("Honouring my typemap\n");
}

EOC

foo(undef);

Then cd to the directory that contains that script, run it, and after it has gone through the build process, it will output "Honouring my typemap". In this instance, the typemap has (obviously) been found by default, so everything worked as intended.

Then rename "typemap" to "my_typemap" and comment in the TYPEMAPS line in the Inline::C script.

Run it again, and the final output changes to "Ignoring my typemap". Apply the patch to Utilities.pm, run the script again, and see the output return to "Honouring my typemap".

AFAICT, this is not some quirk of the way that Inline::C generates the Makefile.PL and Makefile. Those files look fine to me, and totally in keeping with the Makefile.PL and Makefile that would be found in an XS module build.

Perl configuration

AFAIK, any perl build later than perl-5.8.8 (possibly earlier) will be subject to this.

@sisyphus, thanks for this analysis. I don't claim to have completely grokked it, as I've never written XS code for myself and last worked on EU::PXS 10 or 11 years ago.

Would it be possible for you to write a test for the core distribution (i.e., not involving Inline::C) that if run against blead would currently FAIL but, if EU::PXS were (subsequent to discussion here) modified, would then PASS?

Thank you very much. Jim Keenan

sisyphus commented 2 years ago

@jkeenan, my detailing of the problem is, I think, of fairly poor quality.

The issue occurs solely with this line of code in the process_typemaps() subroutine in ExtUtils/ParseXS/Utilities.pm (at line 275 of that file with EU-ParseXS-3.44):

  push @tm, standard_typemap_locations( \@INC );

The thing is that the contents of code>@tm</code are determined by the Makefile, whose contents are in turn dependent upon the contents of the Makefile.PL. This makes it a little awkward (though, doubtless, not impossible) to come up with the type of test you're requesting.

So I'll first make an attempt at a better, more succinct, explanation. If the Makefile.PL has specified a (user-supplied) typemap that is not detected by default, then that typemap (fully qualifed path) will, prior to the execution of that push(), already be somewhere within code>@tm</code. Couple that with the facts that the list returned by standard_typemap_locations( \@INC ): 1) will begin with ExtUtils/typemap (fully qualified path); && 2) will not include that user-supplied typemap (because it's not in one of the dafault locations);

and we plainly see that following the execution of the push(), code>@tm</code will always have an occurrence of the ExtUtils/typemap somewhere after that single occurrence of the user-supplied typemap. Therefore, the ExtUtils/typemap will, in the event of a clash, always override the user-supplied typemap - even though it's pretty obvious that the user is intending that his typemap overrides the ExtUtils/typemap.

ASIDE: If the user-supplied typemap happens to be found by default, then there's no issue - because it will then also be found after ExtUtils/typemap in the list returned by standard_typemap_locations( \@INC ), thus ensuring that the user-supplied typemap overrides ExtUtils/typemap.

That's about it for the explanation. One of the nasty things about this bug is that there's nothing to indicate the reason that the ExtUtils/typemap is still taking precedence over the user-supplied typemap. On examination of the make output it will be seen that the user-supplied typemap is listed after the ExtUtils/typemap, and should therefore, according to the documentation, be taking precedence.

There's a number of ways to correct the problem.

Firstly, we could replace that "push" with "unshift". I think that works fine, but requires re-working of 600-t-compat.t

Secondly, we could remove the first element of the list returned by standard_typemap_locations( \@INC ) before pushing it onto code>@tm</code. That should work because, I think, that very same element (ie the ExtUtils/typemap) is already at the start of code>@tm</code. However, that also required a a re-working of 600-t-compat.t.

The patch I suggested requires no re-working of the tests, but could still be problematic if I've misunderstood certain aspects.

I'll work on developing a test for inclusion in the test suite.

BTW, the list returned by standard_typemap_locations( \@INC ) is readily accessible from perl;

perl -MExtUtils::ParseXS::Utilities -wle "@std = ExtUtils::ParseXS::Utilities::standard_typemap_locations( \@INC ); print for @std;"
sisyphus commented 2 years ago

Attached is a test script (107-process_typemaps.txt) that can be placed in the ExtUtils-ParseXS/t folder, the extension changed ".t", and the file executed. If run as perl t/107-process_typemaps.t then test 5 will fail. If this patch is first applied to Utilities.pm, then all tests pass when the test suite is run. (At least that's what I'm seeing on Windows and Linux.) Note that the test script creates a file named override.typemap which it writes into the t/data folder. Ideally, the source distro would instead ship with t/data/override.typemap in place.

--- Utilities.pm_orig   2021-12-27 16:40:49 +1100
+++ Utilities.pm    2021-12-29 17:02:13 +1100
@@ -272,7 +272,12 @@
     die "Can't find $typemap in $pwd\n" unless -r $typemap;
   }

-  push @tm, standard_typemap_locations( \@INC );
+  my @s = standard_typemap_locations( \@INC );
+  {
+  no warnings 'uninitialized';
+  shift @s if $s[0] eq $tm[0]; # Remove ExtUtils/typemap from the start of @s.
+  }
+  push @tm, @s;

   require ExtUtils::Typemaps;
   my $typemap = ExtUtils::Typemaps->new;

The "no warnings" stuff was not part of the patch as originally submitted, but without it, some of the other test files do generate warnings, and one test file even fails because warnings are unexpected and not tolerated. This patch is just something that works without breaking other tests. To fix this particular issue, the only requirement is that ExtUtils/typemap does not appear in the list that gets push()ed onto code>@tm</code, and this patch is just one of a number of ways to achieve that.

I guess one other solution to this issue is to leave the code as it currently stands and simply document that if you want to override ExtUtils/typemap settings you must ensure that your customised typemap is found automatically (ie by default). I don't think that such behaviour DWIMs, but at least it would cover us if it's documented.

107-process_typemaps.txt

jkeenan commented 2 years ago

Module: ExtUtils::ParseXS

Description

The assertion made in the title is true only when the typemap provided by the user is NOT found by default, but relies on being found via the TYPEMAPS setting in WriteMakefile().

@Sisyphus, before anyone plunges into this issue ... I'm confused about one thing. The string TYPEMAPS (all upper-case) is not found in dist/ExtUtils-ParseXS, nor is it found in the current (version 3.35) CPAN distribution. In the core distribution, it's only found in cpan/ExtUtils-MakeMaker.

$ ack -l TYPEMAPS .
cpan/ExtUtils-MakeMaker/t/lib/MakeMaker/Test/Setup/XS.pm
cpan/ExtUtils-MakeMaker/lib/ExtUtils/MM_Unix.pm
cpan/ExtUtils-MakeMaker/lib/ExtUtils/MakeMaker.pm

In the core distribution, dist/ExtUtil-ParseXS/Makefile.PL is generated during the build.

$ head dist/ExtUtils-ParseXS/Makefile.PL
#-*- buffer-read-only: t -*-

# This Makefile.PL was written by make_ext.pl.
# It will be deleted automatically by make realclean

And that implies that dist/ExtUtil-ParseXS/Makefile is similarly generated during the build. In the CPAN distribution of ExtUtils-ParseXS, Makefile.PL is hand-rolled (as are META.json, META.yml, etc.).

So I'm unsure as to whether your initial description of the problem applies to dist/ExtUtils-ParseXS in blead, to the CPAN distro, or in some way to both.

Leont commented 2 years ago

Firstly, we could replace that "push" with "unshift".

Yeah, that's also what I was thinking.

Secondly, we could remove the first element of the list returned by standard_typemap_locations( \@INC ) before pushing it onto @tm. That should work because, I think, that very same element (ie the ExtUtils/typemap) is already at the start of @tm.

I don't think we should remove just the first element, but any element that is already user-supplied.

However, that also required a a re-working of 600-t-compat.t.

I think that's entirely acceptable in this particular situation. Frankly this override never seems to have worked usably.

sisyphus commented 2 years ago

OK ... ignoring all duplicates:

--- Utilities.pm_orig   2021-12-27 16:40:49 +1100
+++ Utilities.pm        2022-01-07 00:28:33 +1100
@@ -274,9 +274,16 @@

   push @tm, standard_typemap_locations( \@INC );

+  # Ensure that all paths are absolute so that we can
+  # detect (and ignore) duplicate entries.
+  @tm = map {File::Spec->rel2abs($_)} @tm;
+
   require ExtUtils::Typemaps;
   my $typemap = ExtUtils::Typemaps->new;
+  my %seen;
   foreach my $typemap_loc (@tm) {
+    # ignore duplicates
+    next if $seen{$typemap_loc}++;
     next unless -f $typemap_loc;
     # skip directories, binary files etc.
     warn("Warning: ignoring non-text typemap file '$typemap_loc'\n"), next

Note that code>@tm</code (a reference to which is passed to process_typemaps() as the first argument) can already contain duplicates upon its arrival. That doesn't really matter, but it's probably easier to just remove the duplicate, rather than trying to explain why that removal is unnecessary.

Also, IME with Inline::C and with building XS modules using EU::MM and 'make", code>@tm</code will always contain all of the requisite typemaps (and in the required order) prior to the push(). In those cases, this patch push()es stuff onto code>@tm</code and then does absolutely nothing with that stuff - it either gets ignored because it's a duplicate, or ignored because it specifies a non-existent file.

However, I don't feel confident that my range of experiences is complete. Perhaps, in practice (and as with one of the tests in 600-t-compat.t), there is an actual need to perform that push.

Anyway, that patch tests fine for me on EU::ParseXS-3.44, and seems to be behaving as intended.

What think ye ?

sisyphus commented 2 years ago

On Thu, Jan 6, 2022 at 10:22 AM James E Keenan @.***> wrote:

So I'm unsure as to whether your initial description of the problem applies to dist/ExtUtils-ParseXS in blead, to the CPAN distro, or in some way to both.

It applies solely to recent versions of EU-ParseXS, including the version currently found in blead,

The "Makefile.PL" I was alluding to was the (any) Makefile.PL that ships with the source of a perl extension (ie a module containing C source requiring compilation) that one wants to build, test and install. When the Makefile.PL is executed, the Makefile generated by EU::MM will contain a space-delimited list (XSUBPPDEPS) that consists of (in order): 1) perl's standard ExtUtils/typemap 2) Any and all files specified in the Makefile.PL via the "TYPEMAPS" option 3) Any and all typemap files (except ExtUtils/typemap) that were automatically detected 4) xsubpp

During the 'make' phase of the building of the perl extension, that list of files (minus "xsubpp") gets passed as an array reference to Utilities.pm's process_typemaps() subroutine. The problem lies with what process_typemaps() then does with that array reference. In a nutshell, because of the push() performed inside process_typemaps(), any typemap that was specified by the TYPEMAPS option && was not also automatically detected is prevented from overriding ExtUtils/typemap.