Open KJ7LNW opened 2 years ago
Please review and send it through CI. Its a big change but thats what is needed for broad compatibility. Suggestions welcome...
-Eric
Thank you for what is obviously a lot of work! I believe there are some issues (as shown below), but they can be fixed (probably by reducing the scope of this change).
As you can see, the CI has already run. If you add a configure dep, you need to add it to the CI as noted on #15. If you have minor improvements (like updating the needed version of Devel::CheckLib (DCL) for argc
) that should be in a separate, more-easily-reviewable PR, to reduce the scope of this one. I will need persuading that going through the libs flags list is a good idea.
Regarding this one, the fact that you are adding 400 lines is a red flag. Copy-pasting code out of DCL is a sign that a mistake has been made. I am not yet convinced that the partitioning of the lib lines is needed at all. The way you are doing produces (copied from your message above):
-Llibgfortran.a [...]
which is another red flag, because that is nonsensical. Also, I do not consider it valuable to allow control for this module to pick its LAPACK implementation. I'm also not at all a fan of loud configuration, so that needs removing. These days most users won't even see the output of the configuration as they'll use e.g. cpanm
. The env var to debug is a great idea, and should be in the actual docs, probably with an early =head1 MODULE CONFIGURATION
.
Using a regex to detect whether something is an absolute path is an error and instead File::Spec should be used (however, I have yet to be convinced that the approach that requires this is even a good idea).
Finally, you've shown results for Debian derivatives, but not BSD, nor RedHat derivatives - that will need rectifying (VirtualBox or VMware can help). http://matrix.cpantesters.org/?dist=PDL-LinearAlgebra+0.35 shows pass:fail of 60:44 for the latest, and 59:46 for 0.34. That ratio is shown on https://www.cpantesters.org/distro/P/PDL-LinearAlgebra.html?oncpan=1&distmat=1&version=0.35&grade=2 (on the left) to have been there for a number of versions, so it's not a catastrophe, but I am sure we can do better.
At this point I need to be done working on this to focus on other projects.
However, I'll comment to help others who might pick it up, and I can push minor changes based on your findings if someone else can drive additional testing:
Thank you for what is obviously a lot of work!
You're welcome :)
I believe there are some issues (as shown below), but they can be fixed (probably by reducing the scope of this change).
As you'll see below, there isn't much that can be trimmed unless you want to loose support for something. This was quite the interesting and unexpectedly complex investigation:
As you can see, the CI has already run. If you add a configure dep, you need to add it to the CI as noted on #15. If you have minor improvements (like updating the needed version of Devel::CheckLib (DCL) for
argc
) that should be in a separate, more-easily-reviewable PR, to reduce the scope of this one. I will need persuading that going through the libs flags list is a good idea.
Thanks for the note. Pushed. Looks like OSX still has a problem and I would need to see the output of PDL_LA_DEBUG=1 perl Makefile.PL
to debug it, but I don't see how to do that in the CI output. If you can tell me how then I'll push another CI if the fix is minor.
Regarding this one, the fact that you are adding 400 lines is a red flag.
Maybe. Its only as big as I found that it needed to be. Just be glad I didn't use Term::ANSIColor !
Copy-pasting code out of DCL is a sign that a mistake has been made.
Hmm? I didn't copy-paste anything from another package! How un-perl that would be, indeed. This is original code... Devel::CheckLib is used as-is, I just made a wrapper to work around its decade-old bugs. See the comment above new_check_lib()
and also RT#60176, RT#75803, and RT#109028.
I am not yet convinced that the partitioning of the lib lines is needed at all.
Ultimately it is because of RT#60176 and RT#75803 (and debian library naming issues). If those were fixed then maybe we wouldn't need to. See comments in the patch.
The way you are doing produces (copied from your message above):
-Llibgfortran.a [...]
which is another red flag, because that is nonsensical.
I thought that too, but shrugged because it was out of my control. It is produced by ExtUtils::F77
as you can see here:
]$ perl -MExtUtils::F77 -le 'print ExtUtils::F77->runtime'
Loaded ExtUtils::F77 version 1.26
"-Llibgfortran.a" -L/usr/lib -lgfortran -lm
Also, I do not consider it valuable to allow control for this module to pick its LAPACK implementation.
Well, humbly, and for the benefit of PDL's crown jewel, I disagree. Users wanting P::LA are probably interested in performance. If they are, then choice is valuable for a number of reasons that I learned while adding LAPACK to xnec2c: OpenBLAS, ATLAS, and Intel MKL each have different threading models and distributions package them very differently. In addition, different threading models perform differently on different workloads---and that is a per-user decision. (We were able to get all threading models in all 3 of those LAPACK implementations working for xnec2c on Ubuntu, Debian, and Red Hat.)
Here is what I learned:
There are wide differences between threading models even in the same LAPACK implementation across distributions. For example, Debian/Ubuntu uses alternatives
whereas RHEL splits out threading models by name: openblaso for OpenMP, openblasp for pthreads, etc, but defaults to just serial "openblas" which may not be ideal from a performance point of view. The only way to get OpenBLAS+OpenMP in RHEL to select it somehow. (Intel MKL is even more weird, don't get me started on that! ... but Intel MKL is probably the fastest LAPACK implementation out there!)
These days most users won't even see the output of the configuration as they'll use e.g.
cpanm
. The env var to debug is a great idea, and should be in the actual docs, probably with an early=head1 MODULE CONFIGURATION
.
Good ideas.
I'm also not at all a fan of loud configuration, so that needs removing.
As you wish. Though it might be useful for debugging different deployments. I was hoping to see that output in CI, but not sure where it went?
Using a regex to detect whether something is an absolute path is an error and instead File::Spec should be used
Maybe so.
(however, I have yet to be convinced that the approach that requires this is even a good idea).
As above, splitting out -l<lib>
and -L<dir>
and /other/stuff
was necesary to work around Devel::CheckLib bugs. In addition, Ubuntu/Debian don't always create a <lib>.so
file in path so -l<lib>
doesn't even work. Thus the -L<dir>
's need to be searched for <lib>.so.<ver>
and the /path/to/<lib>.so.<version>
is placed in LDFLAGS so it will build.
Most of the time was spent fighting with Devel::CheckLib. Believe me, if there is a better way that actually works with a broad set of distributions, then I'm all for it---but after what I've found here I doubt it.
Finally, you've shown results for Debian derivatives,
yep
but not BSD,
True, and I won't. Don't have the means, time, nor interest to poke at BSD. However, if others want to provide build logs then I'll make minor changes to get it in working order.
nor RedHat derivatives
Actually that is not so. We're an almost 100% redhat shop and use Oracle/CentOS for pretty much everything: See the Oracle Linux 8 section under the validation section. Its the same as RHEL 8 and has the best wide-spread LAPACK support. Every LAPACK implementation and threading model is detected and selectable, and all of them pass make test
.
- that will need rectifying (VirtualBox or VMware can help).
I used nspawn for the ubu/debian tests above
http://matrix.cpantesters.org/?dist=PDL-LinearAlgebra+0.35 shows pass:fail of 60:44 for the latest, and 59:46 for 0.34. That ratio is shown on https://www.cpantesters.org/distro/P/PDL-LinearAlgebra.html?oncpan=1&distmat=1&version=0.35&grade=2 (on the left) to have been there for a number of versions, so it's not a catastrophe, but I am sure we can do better.
Maybe so. #15 is a serious bug that will take some noteworthy engineering to get it working across so many distributions.
There is a possibility that adding libnvblas.so
and libnvblas10.so
to the library search in this PR could enable NVBLAS GPU support. However, I don't have the hardware to test that...would be interesting if someone has nVidia hardware to test NVBLAS in PDL::LA.
...also, if someone can get me PDL_LA_DEBUG=1 perl Makefile.PL
output for OSX then maybe that can be fixed.
Unified @pkgs list into @libsets in find_libs() to group libraries by dependency.
Compatibility with the 0.33 release is important across multiple UNIX deployments. The first section discusses how the new code is equivalent to the old code in terms of functionality, while extending the detection to make actual FORTRAN calls to test that the libraries work. In addition, libraries with multiple dependencies are tested together instead and independently, instead of only independently as previously designed.
This commit is documented in sections
ORIGINAL IMPLEMENTATION
The previous code was as follows. The indented lines are original code, unindented lines are commentary. There are 3 major parts to the original library detection mechanism:
So we can distill the above code down to two items:
pkg-config
cannot find them.NEW DESIGN
The new design intends to satisfy the following:
-l<lib>
and-L<dir>
linker options are treated the same. While there may be special cases for libraries (openblas vs atlas), there is no special-case for UNIX-like operating systems.BUGS IN Devel::CheckLib
(They are commented in the code as well, so pasted here for good visibility)
In addition, Devel::CheckLib fails to provide
argc
to main() before about v1.11, so a Devel::CheckLib version requirement was added.IMPLEMENTATION
We have written a wrapper
new_check_lib()
around Devel::CheckLib::check_lib() to work around the issues above. This is the comment in the code:Configurability with environment variables:
PDL_LA_LDFLAGS='-L<dir> -l<lib>...':
The user can set this to specify their own linker flags (for example, to link with the Intel MKL library).PDL_LA_DEBUG=1: Pass
debug=>1
to Devel::CheckLib::check_lib()PDL_LA_LIB_INDEX=<n>: Select a LAPACK/BLAS library.
When Makefile.PL runs, it emits output like the following. The user can specify
PDL_LA_LIB_INDEX=<n>
to choose the implementation that succeeded. This defaults to 0 if not specified:Safety:
-l<lib>
nor-L<dir>
are passed verbatim.shell_quote
andshellwords
are used to properly parse and escape the input and output.The library detection code at the top of
Makefile.PL
is now a single line:The list of available library combinations are in
@libsets
atop offind_libs()
:Notable changes:
FORTRAN check function:
dgebal_
existenceNotes: The windows library detection was not modified in this patch. The 'shellwords' work-around was not changed, but the diff looks that way
VALIDATION
Oracle Linux 8
The following library combinations pass
make test
on my Oracle Linux 8 deployment:Ubuntu / Debian
In addition, the following Ubuntu/Debian distributions have been tested, all you need to install is
libopenblas-dev
to build LAPACK+BLAS successfully: