conda-forge / perl-feedstock

A conda-smithy repository for perl.
BSD 3-Clause "New" or "Revised" License
3 stars 31 forks source link

WIP: native Windows build/unifying directory structure #68

Open jvolkening opened 9 months ago

jvolkening commented 9 months ago

Checklist

This is a WIP toward a native Windows build from source (rather than using pre-built Strawberry Perl) with the same directory structure as the *nix builds to enable use of noarch module packages (see e.g. issues #58, #48). The current commit succeeds with conda build on my local Windows box, and the installed perl seems to behave as expected. However, further testing is needed and I'm sure there are additional tweaks needed (assuming it even passes CI when I submit this). Some implementation notes:

  1. Changing the install paths was relatively straightforward; the Configure script does not run on win64 so I patched the GNUMakefile and config.gc scripts directly with the necessary settings.

  2. By far the biggest PITA here was setting @INC correctly due to a seeming perfect storm of incompatibilities:

    1. Perl hardcodes the default search paths in @INC into the shared library (in this version perl532.dll). This appears unavoidable. If these values aren't set correctly to point to at least Config.pm and Config_heavy.pl, nothing except for the simplest calls like perl --version is going to run.
    2. There is a userelocatableinc compile-time option, used by the *nix builds here, that is meant to address this to make portable installs easier. However, nothing I tried was able to get this feature working on Windows. I never figured out why, but the ... special token was never resolved correctly, and was simply stripped to end up with paths relative to the CWD rather than the perl installl location.
    3. conda build deals with these situations by using placeholder paths and doing a string replace with the actual path during installs. Here, it is doing this for the text files (.pl, .pm, .bat) but not binary files. If I explicitly listed the DLL in the recipe under binary_has_prefix_files, it was recognized and listed in info/has_prefix but not actually patched. As far as I can tell from the conda-build docs and source code, this patching is simply not done on Windows binary files.
    4. With the Makefile and config changes above, perl was still not adding the new-style paths (<prefix>/lib/perl5/core_perl, etc) to @INC automatically. I made use of the otherlibdirs compile-time option to add them, but again these needed to be absolute paths; the userelocatableinc stuff never worked here either.
    5. In the end, what worked for me was to implement a conda-build-esque placeholder approach manually, adding a long placeholder string into @INC at compile time, and using a post-link script to replace this in the DLL with the actual paths at install time. NULL-padding the string didn't work as I'd hoped, but eventually I realized that since the string was a semicolon-delimited (on Windows) list of paths, I just padded with semicolons. This appears to worked as I'd hoped, with the additional semicolons simply ignored.
    6. Perl appears to always add one default relative path into @INC automatically, which is the lib directory located relative to the Perl binary at i.e. bin/../lib. I took advantage of this to temporarily copy over the several most important library files during install in order to do the actual DLL patching, and then clean them up afterwards.
    7. This is not exactly an ideal solution, and I would love if someone could find a better way, e.g. actually get userelocatableinc to work on Windows. I've tried a lot of different configurations, though, and this was the only solution that worked.
    8. Actually, there is one other possible solution I contemplated, and that would be to set the environment variable PERL5LIB every time a conda environment with perl installed is activated. This is cleaner in some ways than patching the DLL. In the end, I think it is also more prone to accidental breakage, since a user can easily overwrite this variable in their working environment and then not be able to figure out why perl is competely broken. Really, I don't think this variable should be relied upon for the core perl functionality.

Some other things:

  1. I copied over as many of the patches/tweaks from the *nix build.sh as seemed potentially applicable. One notable exception was the dynamic_config patch and the associated sed replacements. As implemented, they break things on Windows because part of the patch replaces a single-quoted long block of configuration text with double quotes in order to enable interpolation of the $compilerroot variable, but on Windows there are other special characters in this block of text that should not be interpolated. For now I disabled this patch; it's not clear to me whether this will break other things for dependent package builds or if it is not needed on Windows. I tested compilation of some randomly selected perl modules using the built and installed native perl and everything seemed in order, including modules that required compilation. Possibly issues will still arise from this exclusion, though, and a fix to get the patch working will be needed.

  2. At this point, assuming the recipe builds okay on CI, more testing is needed to ensure that dependent recipes can still be built using this new perl build, and that noarch recipes can be installed and will work correctly (i.e. that the directory structures between the *nix and Windows builds are actually unified). Any suggestions on how best to go about this would be welcome. As I mentioned, I've done some basic tests by installing the local package in a conda env and then trying to build some random modules from CPAN, which works, but doesn't test the actual conda build process.

conda-forge-webservices[bot] commented 9 months ago

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

jvolkening commented 9 months ago

Why is CI still building on Linux and OSX, even though I added skip: True # [not win] to the recipe? I thought this would temporarily save some CI resources while wrinkles in the Windows build were ironed out.

EDIT: nevermind, I see that they run but are essentially a no-op

jvolkening commented 9 months ago

One other thing I forgot to mention in my long-winded intro comment: I see that exclusion of . from @INC that was made default in perl v5.26 for security reasons was disabled in the *nix builds, and I copied this behavior into the Windows build. However, maybe this decision is worthy of reconsideration? There was a lot of discussion in the core community on the relative pros and cons of lessening the potential attack surface vs breaking a lot of things on CPAN that rely on this behavior, but eventually it was made default. In conda, maybe it is also better to fix the dependencies which break due to the change. There is a very easy fix of setting PERL_USE_UNSAFE_INC to true during building only, if the only use of the . in @INC is in the test suite (commonly the case).

jvolkening commented 9 months ago

Another note:

I had to relax the Windows tests a bit because PREFIX didn't seem to be set as expected during the test phase. At one point I added a test line that simply printed out all of the environment variables, and PREFIX was listed as PREFIX: %PREFIX%, i.e. the literal value "%PREFIX%" rather than the actual path. I don't know if this was something weird related to my local environment, but I was expecting this to be set correctly by conda-build during testing similarly to a normal install. As it is, I had to relax the tests only to match against the last part of the path. In an actual install, the paths are correct according to perl -V (and the fact that everything seems to work and packages are installed under the expected paths e.g. $CONDA_PREFIX/lib/perl5/vendor_perl/<module>).

jvolkening commented 9 months ago

After some initial testing with building perl-* packages depending on this new perl build, there are some interactions between the version of make used and the build parameters used that need to be understood. Specifically, there are two different versions of GNU make for Windows available in conda-forge, make and m2w64-make. Both appear to be built on top of MinGW-w64. I used m2w64-make in this PR to build perl.

When building Perl modules using this perl, the standard perl Makefile.PL command used by the majority of Perl modules results in a Makefile with either DIRFILESEP = \ or DIRFILESEP = / depending on the make version and whether MAKE= is specified as an extra parameter, e.g. perl Makefile.PL MAKE=make. Only Makefiles where DIRFILESEP is a forward slash result in a successful build, despite the fact that this is Windows. I have observed the following results: make installed? m2w64-make installed? MAKE= DIRFILESEP set to:
N Y make \
N Y mingw32-make /
N Y none /
Y Y make /
Y Y mingw32-make /
Y Y none /
Y N make /
Y N mingw32-make \
Y N none \

In all cases, setting MAKE= to the installed version results in the correct separator set. If no MAKE= is set, mingw32-make must be installed to get the correct behavior, possibly because that is what was used to compile perl itself. If make alone is installed, then MAKE=make must be specified to get the correct behavior.

It would be preferable to be able to use make without specifying MAKE=make, since that is how many/most existing perl-* recipes are written. If I can modify this PR to use make to build perl, that may solve the issue.

EDIT: Just pushed a new commit using plain make and it seems to work just fine.

jvolkening commented 9 months ago

I've started testing cross-compatibility of noarch recipes by building perl-* packages on Linux and installing them on Windows (using this perl build). After addressing the different makes above, the next issue I ran into was with filenames. On Linux there are manpages built for the modules that can contain colons in the filename (e.g. man/man3/YAML::Tiny.3). On Windows, the colon is a reserved character and the colons in the filename have been converted to underscores. As a result, I get errors similar to the following during install:

CondaVerificationError: The package for perl-yaml-tiny located at
C:\Users\jdvol\miniforge3\pkgs\perl-yaml-tiny-1.74-pl0 appears to be corrupted.
The path 'man/man3/YAML::Tiny.3' specified in the package manifest cannot be found.

I confirmed that there is a file man/man3/YAML__Tiny.3. Any ideas on how to deal with this within conda that won't involve changing all of the existing recipes?

jvolkening commented 9 months ago

One possible solution to the specific file naming issue above would be to disable generation of manpages globally by patching ExtUtils::MakeMaker in the core source to remove the manifypod target. It wouldn't surprise me if this was the only place where the colons-in-filenames issue arose. This would also address the question raised in #52 of whether to skip manpage generation for the sake of file space. After all, in Perl packages manpages are just translations of the native POD documentation, which is always available using perldoc.

jvolkening commented 9 months ago

I disabled building of manpages on nix by changing man1dir and man3dir to empty values in build.sh. They aren't built on Windows anyway. After rebuilding locally for both Linux and Windows, I am able to build `perl-`noarch packages on Linux, install them directly on Windows, and run the module's unit tests successfully within the freshly created environment. I have done this for about a dozen perl-* recipes with no failures, and am currently batch testing the full noarch set of recipes currently in conda-forge.

I would say, pending review, that this is nearing or arrived at the desire state of having Linux, OSX, and Windows builds that share a common directory structure. Any advice on how to further test is welcome.

Disabling manpages breaks recipes which use e.g. man/man1/perlartistic.1 as the license file; these recipes would need to be changed. I question this practice of using these troff/groff files as license text anyway, since they are not really human-readable.

jvolkening commented 9 months ago

The ppc64le build failed when I re-enabled the unit tests. It looks like only one test failed (utime.t in Time::HiRes). I have no way to debug this...I guess these tests can be disabled again in a future commit.