GaetanLepage opened this issue 11 months ago
I can't reproduce locally, and our CI seems to be working. Can you check the output of `python -c "import pytensor; print(pytensor.config)"` and maybe also the one you were getting with the previous version? (The diff should be enough.)
Here is what I get:
How does it compare with before it started failing?
Sorry for the delay! Here is the diff:
diff --git a/old.txt b/new.txt
index 624402e..325c0df 100644
--- a/old.txt
+++ b/new.txt
@@ -1,3 +1,4 @@
+WARNING (pytensor.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
floatX ({'float16', 'float32', 'float64'})
Doc: Default floating-point precision for python casts.
@@ -8,7 +9,7 @@ warn_float64 ({'pdb', 'ignore', 'warn', 'raise'})
Doc: Do an action when a tensor variable with float64 dtype is created.
Value: ignore
-pickle_test_value (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6f8e3290>>)
+pickle_test_value (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff747bc290>>)
Doc: Dump test values while pickling model. If True, test values will be dumped with model.
Value: True
@@ -24,15 +25,15 @@ device (cpu)
Doc: Default device for computations. only cpu is supported for now
Value: cpu
-force_device (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffff73e9150>>)
+force_device (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff73dcfb90>>)
Doc: Raise an error if we can't use the specified device
Value: False
-conv__assert_shape (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6f701090>>)
+conv__assert_shape (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff7394e650>>)
Doc: If True, AbstractConv* ops will verify that user-provided shapes match the runtime shapes (debugging option, may slow down compilation)
Value: False
-print_global_stats (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6e8fed90>>)
+print_global_stats (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff749d56d0>>)
Doc: Print some global statistics (time spent) at the end
Value: False
@@ -40,23 +41,23 @@ assert_no_cpu_op ({'pdb', 'ignore', 'warn', 'raise'})
Doc: Raise an error/warning if there is a CPU op in the computational graph.
Value: ignore
-unpickle_function (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6e8fee50>>)
+unpickle_function (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff7394e850>>)
Doc: Replace unpickled PyTensor functions with None. This is useful to unpickle old graphs that pickled them when it shouldn't
Value: True
-<pytensor.configparser.ConfigParam object at 0x7ffe6e8fef50>
+<pytensor.configparser.ConfigParam object at 0x7fff7394e8d0>
Doc: Default compilation mode
Value: Mode
cxx (<class 'str'>)
Doc: The C++ compiler to use. Currently only g++ is supported, but supporting additional compilers should not be too difficult. If it is empty, no C++ code is compiled.
- Value: /nix/store/zlzz2z48s7ry0hkl55xiqp5a73b4mzrg-gcc-wrapper-12.3.0/bin/g++
+ Value: /nix/store/90h6k8ylkgn81k10190v5c9ldyjpzgl9-gcc-wrapper-12.3.0/bin/g++
linker ({'c|py', 'c|py_nogc', 'cvm_nogc', 'c', 'vm', 'cvm', 'py', 'vm_nogc'})
Doc: Default linker used if the pytensor flags mode is Mode
Value: cvm
-allow_gc (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6e8ff250>>)
+allow_gc (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff7394e990>>)
Doc: Do we default to delete intermediate results during PyTensor function calls? Doing so lowers the memory requirement, but asks that we reallocate memory at the next function call. This is implemented for the default linker, but may not work for all linkers.
Value: True
@@ -64,7 +65,7 @@ optimizer ({'o3', 'fast_run', 'fast_compile', 'unsafe', 'o4', 'o2', 'merge', 'No
Doc: Default optimizer. If not None, will use this optimizer with the Mode
Value: o4
-optimizer_verbose (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6edb2f10>>)
+optimizer_verbose (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff7456ffd0>>)
Doc: If True, we print all optimization being applied
Value: False
@@ -72,7 +73,7 @@ on_opt_error ({'raise', 'warn', 'ignore', 'pdb'})
Doc: What to do when an optimization crashes: warn and skip it, raise the exception, or fall into the pdb debugger.
Value: warn
-nocleanup (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6e8ff290>>)
+nocleanup (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff7394e9d0>>)
Doc: Suppress the deletion of code files that did not compile cleanly
Value: False
@@ -84,19 +85,19 @@ gcc__cxxflags (<class 'str'>)
Doc: Extra compiler flags for gcc
Value:
-cmodule__warn_no_version (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6e8ff390>>)
+cmodule__warn_no_version (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff7394eed0>>)
Doc: If True, will print a warning when compiling one or more Op with C code that can't be cached because there is no c_code_cache_version() function associated to at least one of those Ops.
Value: False
-cmodule__remove_gxx_opt (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6e8ff4d0>>)
+cmodule__remove_gxx_opt (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff7394ee90>>)
Doc: If True, will remove the -O* parameter passed to g++.This is useful to debug in gdb modules compiled by PyTensor.The parameter -g is passed by default to g++
Value: False
-cmodule__compilation_warning (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6e8ff590>>)
+cmodule__compilation_warning (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff73dcfe90>>)
Doc: If True, will print compilation warnings.
Value: False
-cmodule__preload_cache (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffff71c3190>>)
+cmodule__preload_cache (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff7394f090>>)
Doc: If set to True, will preload the C module cache at import time
Value: False
@@ -104,7 +105,7 @@ cmodule__age_thresh_use (<class 'int'>)
Doc: In seconds. The time after which PyTensor won't reuse a compile c module.
Value: 2073600
-cmodule__debug (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6f6db850>>)
+cmodule__debug (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff7394f1d0>>)
Doc: If True, define a DEBUG macro (if not exists) for any compiled C code.
Value: False
@@ -128,7 +129,7 @@ tensor__cmp_sloppy (<class 'int'>)
Doc: Relax pytensor.tensor.math._allclose (0) not at all, (1) a bit, (2) more
Value: 0
-lib__amblibm (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6e8ff790>>)
+lib__amblibm (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff7479ef90>>)
Doc: Use amd's amdlibm numerical library
Value: False
@@ -155,7 +156,7 @@ exception_verbosity ({'high', 'low'})
C. log_likelihood_h
Value: low
-print_test_value (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6e8ffcd0>>)
+print_test_value (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff7394f610>>)
Doc: If 'True', the __eval__ of an PyTensor variable will return its test_value when this is available. This has the practical consequence that, e.g., in debugging `my_var` will print the same as `my_var.tag.test_value` when a test value is defined.
Value: False
@@ -167,19 +168,19 @@ compute_test_value_opt ({'off', 'ignore', 'warn', 'raise', 'pdb'})
Doc: For debugging PyTensor optimization only. Same as compute_test_value, but is used during PyTensor optimization
Value: off
-check_input (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6e8ffdd0>>)
+check_input (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff7394f750>>)
Doc: Specify if types should check their input in their C code. It can be used to speed up compilation, reduce overhead (particularly for scalars) and reduce the number of generated C files.
Value: True
-NanGuardMode__nan_is_error (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6e8fff10>>)
+NanGuardMode__nan_is_error (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff7394f850>>)
Doc: Default value for nan_is_error
Value: True
-NanGuardMode__inf_is_error (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6eae4090>>)
+NanGuardMode__inf_is_error (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff7394f990>>)
Doc: Default value for inf_is_error
Value: True
-NanGuardMode__big_is_error (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6eae4110>>)
+NanGuardMode__big_is_error (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff7394fa10>>)
Doc: Default value for big_is_error
Value: True
@@ -191,15 +192,15 @@ DebugMode__patience (<class 'int'>)
Doc: Optimize graph this many times to detect inconsistency
Value: 10
-DebugMode__check_c (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6f75b0d0>>)
+DebugMode__check_c (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff7394fc50>>)
Doc: Run C implementations where possible
Value: True
-DebugMode__check_py (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6eae4350>>)
+DebugMode__check_py (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff74857ed0>>)
Doc: Run Python implementations where possible
Value: True
-DebugMode__check_finite (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffff76139d0>>)
+DebugMode__check_finite (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff7394fb10>>)
Doc: True -> complain about NaN/Inf results
Value: True
@@ -207,7 +208,7 @@ DebugMode__check_strides (<class 'int'>)
Doc: Check that Python- and C-produced ndarrays have same strides. On difference: (0) - ignore, (1) warn, or (2) raise error
Value: 0
-DebugMode__warn_input_not_reused (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6f75b990>>)
+DebugMode__warn_input_not_reused (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff7394fdd0>>)
Doc: Generate a warning when destroy_map or view_map says that an op works inplace, but the op did not reuse the input for its output.
Value: True
@@ -219,7 +220,7 @@ DebugMode__check_preallocated_output_ndim (<class 'int'>)
Doc: When testing with "strided" preallocated output memory, test all combinations of strides over that number of (inner-most) dimensions. You may want to reduce that number to reduce memory or time usage, but it is advised to keep a minimum of 2.
Value: 4
-profiling__time_thunks (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6eae4450>>)
+profiling__time_thunks (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff7394fed0>>)
Doc: Time individual thunks when profiling
Value: True
@@ -240,7 +241,7 @@ profiling__min_memory_size (<class 'int'>)
of their outputs (in bytes) is lower than this threshold
Value: 1024
-profiling__min_peak_memory (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6eae4610>>)
+profiling__min_peak_memory (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff7394ff10>>)
Doc: The min peak memory usage of the order
Value: False
@@ -248,11 +249,11 @@ profiling__destination (<class 'str'>)
Doc: File destination of the profiling output
Value: stderr
-profiling__debugprint (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6f8e1c90>>)
+profiling__debugprint (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff7394fd50>>)
Doc: Do a debugprint of the profiled functions
Value: False
-profiling__ignore_first_call (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6eae47d0>>)
+profiling__ignore_first_call (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff7393c050>>)
Doc: Do we ignore the first call of an PyTensor function.
Value: False
@@ -260,7 +261,7 @@ on_shape_error ({'raise', 'warn'})
Doc: warn: print a warning and use the default value. raise: raise an error
Value: warn
-openmp (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffff73bb910>>)
+openmp (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff7479d0d0>>)
Doc: Allow (or not) parallel computation on the CPU with OpenMP. This is the default value used when creating an Op that supports OpenMP parallelization. It is preferable to define it via the PyTensor configuration file ~/.pytensorrc or with the environment variable PYTENSOR_FLAGS. Parallelization is only done for some operations that implement it, and even for operations that implement parallelism, each operation is free to respect this flag or not. You can control the number of threads used with the environment variable OMP_NUM_THREADS. If it is set to 1, we disable openmp in PyTensor by default.
Value: False
@@ -312,23 +313,23 @@ unittests__rseed (<class 'str'>)
Doc: Seed to use for randomized unit tests. Special value 'random' means using a seed of None.
Value: 666
-warn__round (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6eae4bd0>>)
+warn__round (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff7393c450>>)
Doc: Warn when using `tensor.round` with the default mode. Round changed its default from `half_away_from_zero` to `half_to_even` to have the same default as NumPy.
Value: False
-profile (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6eae4c50>>)
+profile (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff7393c590>>)
Doc: If VM should collect profile information
Value: False
-profile_optimizer (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6eae4c90>>)
+profile_optimizer (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff7393c390>>)
Doc: If VM should collect optimizer profile information
Value: False
-profile_memory (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6eda86d0>>)
+profile_memory (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff7393c5d0>>)
Doc: If VM should collect memory profile information and print it
Value: False
-<pytensor.configparser.ConfigParam object at 0x7ffe6eae4dd0>
+<pytensor.configparser.ConfigParam object at 0x7fff73e070d0>
Doc: Useful only for the VM Linkers. When lazy is None, auto detect if lazy evaluation is needed and use the appropriate version. If the C loop isn't being used and lazy is True, use the Stack VM; otherwise, use the Loop VM.
Value: None
@@ -336,11 +337,11 @@ numba__vectorize_target ({'cuda', 'parallel', 'cpu'})
Doc: Default target for numba.vectorize.
Value: cpu
-numba__fastmath (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6f8e23d0>>)
+numba__fastmath (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff7393c790>>)
Doc: If True, use Numba's fastmath mode.
Value: True
-numba__cache (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6eae4ed0>>)
+numba__cache (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff7393c6d0>>)
Doc: If True, use Numba's file based caching.
Value: True
@@ -353,27 +354,27 @@ Defaults to compiledir_%(short_platform)s-%(processor)s-
%(python_version)s-%(python_bitwidth)s.
Value: compiledir_%(short_platform)s-%(processor)s-%(python_version)s-%(python_bitwidth)s
-<pytensor.configparser.ConfigParam object at 0x7ffe6f9b6850>
+<pytensor.configparser.ConfigParam object at 0x7fff74af4690>
Doc: platform-independent root directory for compiled modules
- Value: /build/tmp.41UnBoqk62/.pytensor
+ Value: /build/tmp.gHcmWT784l/.pytensor
-<pytensor.configparser.ConfigParam object at 0x7ffe6f94a390>
+<pytensor.configparser.ConfigParam object at 0x7fff7456fb90>
Doc: platform-dependent cache directory for compiled modules
- Value: /build/tmp.41UnBoqk62/.pytensor/compiledir_Linux-6.5--generic-x86_64-with-glibc2.38--3.11.5-64
+ Value: /build/tmp.gHcmWT784l/.pytensor/compiledir_Linux-6.1.62-x86_64-with-glibc2.38--3.11.6-64
blas__ldflags (<class 'str'>)
Doc: lib[s] to include for [Fortran] level-3 blas implementation
- Value: -L/nix/store/fcfbp2iphh271h19m3g0hi5q0x20l8vv-lapack-3/lib -L/nix/store/29jz2fshxr7lf0ny4hb3q3z0jqpqmfz7-blas-3/lib -llapack -llapacke -lblas -lcblas -llapack -llapacke -lblas -lcblas
+ Value:
-blas__check_openmp (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6e700e90>>)
+blas__check_openmp (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7fff73751090>>)
Doc: Check for openmp library conflict.
WARNING: Setting this to False leaves you open to wrong results in blas-related operations.
Value: True
-scan__allow_gc (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6b0058d0>>)
+scan__allow_gc (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffeee3d38d0>>)
Doc: Allow/disallow gc inside of Scan (default: False)
Value: False
-scan__allow_output_prealloc (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffe6b3d9610>>)
+scan__allow_output_prealloc (<bound method BoolParam._apply of <pytensor.configparser.BoolParam object at 0x7ffeee975590>>)
Doc: Allow/disallow memory preallocation for outputs inside of scan (default: True)
Value: True
`new.txt` is the failing 2.18.0 build. `old.txt` is the 2.17.0 working one.
Thanks! The diff is a bit verbose because of the memory locations, but the critical change seems to be here:
+WARNING (pytensor.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
...
blas__ldflags (<class 'str'>)
Doc: lib[s] to include for [Fortran] level-3 blas implementation
- Value: -L/nix/store/fcfbp2iphh271h19m3g0hi5q0x20l8vv-lapack-3/lib -L/nix/store/29jz2fshxr7lf0ny4hb3q3z0jqpqmfz7-blas-3/lib -llapack -llapacke -lblas -lcblas -llapack -llapacke -lblas -lcblas
+ Value:
This is most likely caused by https://github.com/pymc-devs/pytensor/pull/444
CC @lucianopaz
Oh, thanks for pointing this out. Do you recommend that we change our build flags, or is this something you plan to "fix"?
I think we should fix it. We didn't want the behavior to change for users
> I think we should fix it. We didn't want the behavior to change for users
Ok, understood. We will then wait for the next release to update. Thanks for your help :)
I'm confused about the two failing tests. The first says that we expect a C implementation of GeMV but we get a Python implementation instead? Or the other way around? Does this mean that the lapack link flags are wrong? If that's the case, should we remove the lapack blas flags from the check or should we fix lapack? The second failure I don't know about.
> The first says that we expect a C implementation of GeMV but we get a python implementation instead
The test was expecting a CGEMV but gets a GEMV, because PyTensor is not finding the blas/lapack (I don't know the difference) flags in the newer version, so the rewrite that introduces the C version doesn't get triggered. @GaetanLepage showed that indeed the new flags are empty, and it now gets the usual "Using NumPy C-API based implementation" warning.

I think the second test failure comes from the same source: it is only supposed to work if some BLAS Ops get inserted, but they are not.
This might mean that the lapack blas link flags are wrong. These are an addition brought in by #444. The quickest patch would be to comment out the lapack condition from the default blas flags function.
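To make the failure mode concrete, here is a minimal sketch (my own illustration, not the failing test itself) of how to check whether the C rewrite fires: if `blas__ldflags` ends up empty, the optimized graph should keep the plain Gemv Op instead of CGemv.

```python
# Minimal sketch (not the actual failing test): compile y + A.dot(x) and see
# whether the BLAS rewrite inserted the C Op (CGemv) or left the Gemv Op,
# which is what happens when blas__ldflags ends up empty.
import pytensor
import pytensor.tensor as pt

A = pt.matrix("A")
x = pt.vector("x")
y = pt.vector("y")

f = pytensor.function([A, x, y], y + pt.dot(A, x))

print(pytensor.config.blas__ldflags)
print([type(node.op).__name__ for node in f.maker.fgraph.apply_nodes])
# Expect something like ['CGemv'] with working BLAS flags, ['Gemv'] otherwise.
```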
@GaetanLepage, does it also break if you set the blas flags to `-l blas`?
> @GaetanLepage, does it also break if you set the blas flags to `-l blas`?
When I install pytensor, you mean?
@ferrine, do you have access to a NixOS machine to try to delve into this issue?
@GaetanLepage, I've got a patch that seems to be working over at #517. It should fix the two failing tests that you reported. It turns out that they were both caused by empty `blas__ldflags`, but in reality these tests should be able to run regardless of the blas flags.
Your problem seems to run a bit deeper though. I had understood that in the working version you used to have an empty `blas__ldflags` value, but it looks like I misread the diff. Your diff says that in the old version you had

blas__ldflags = -L/nix/store/fcfbp2iphh271h19m3g0hi5q0x20l8vv-lapack-3/lib -L/nix/store/29jz2fshxr7lf0ny4hb3q3z0jqpqmfz7-blas-3/lib -llapack -llapacke -lblas -lcblas -llapack -llapacke -lblas -lcblas

but the new pytensor version has it empty, right? The crux of the matter is to learn how these blas flags come about. To help me reason about this, I wanted to ask you some questions.
Are the BLAS and Lapack installation directories listed among the `libraries` returned by `/nix/store/zlzz2z48s7ry0hkl55xiqp5a73b4mzrg-gcc-wrapper-12.3.0/bin/g++ -print-search-dirs`? If they aren't, then the compiler doesn't know about them unless it is explicitly told about them.

@GaetanLepage, we fixed the tests that were failing as a side effect of having empty blas flags, so the NixOS build should work now. The problem is that, when pytensor is imported, it will try to find the blas libraries in the default search directories of the compiler or of the Python library directory. The flags that you were getting before were provided by numpy as a side effect of saying that you wanted to compile it with lapack and blas, and the fact that numpy used to store that build information in a `numpy.distutils` property. That property was removed in Python 3.12, and we changed the logic for blas detection.

I'm not familiar with NixOS, so I'll try to ask a few questions to see if we can improve the user experience for pytensor there. Neither BLAS nor Lapack are build-time dependencies for pytensor. They are only used at runtime, when an Op needs to compile to C code and then get linked to one of those libraries. Once nix installs BLAS and Lapack, can we know where to look for them in the file-system at runtime? If yes, we could apply a patch similar to #517, but keep it NixOS specific, so that BLAS and Lapack are searched in some nix default place. That way, nix users would get to use pytensor with some blas flags without having to do any kind of manual override of the `.pytensorrc` or environment variables.
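As a stopgap until such a nix-specific default exists, a manual override along these lines should work. This is only a sketch: the store paths are placeholders, and it assumes the `PYTENSOR_FLAGS` environment variable is parsed when pytensor is first imported.

```python
# Sketch of a manual override (placeholder store paths, not real hashes):
# PYTENSOR_FLAGS is read when pytensor is first imported, so it must be set
# before the import. The same values could live in ~/.pytensorrc instead.
import os

os.environ["PYTENSOR_FLAGS"] = (
    "blas__ldflags="
    "-L/nix/store/<blas>/lib -L/nix/store/<lapack>/lib "
    "-lblas -lcblas -llapack -llapacke"
)

import pytensor

print(pytensor.config.blas__ldflags)  # should echo the flags set above
```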
@lucianopaz thank you for taking the time to fix this! I can confirm that everything runs fine (for building, at least) with the latest release: https://github.com/NixOS/nixpkgs/pull/267030. I know that we can add runtime paths to the derivations we write. As for how we should handle this specific case, I am afraid that I lack the experience to answer. @SomeoneSerge or @mweinelt will probably know better.
Hi! Is my understanding correct that,

* The "C Ops" offer a JIT compilation functionality and are part of pytensor's public interface,
* One of the runtime dependencies for pytensor (or the Ops part) is a working toolchain for the host platform, also configured to know how to locate BLAS and Lapack?

In that case we could just pass pytensor one. Is the toolchain a required or an optional dependency?
I see that `config.cxx` is determined at runtime based on `PATH`: https://github.com/pymc-devs/pytensor/blob/7ecb9f8c6b6a2eac940947bac955a10785240667/pytensor/configdefaults.py#L397-L454. We could prepare a wrapped g++ for pytensor already at build time, such that the compiler used by pytensor wouldn't leak into the users' `PATH`s (ugly baseline: we can patch `configdefaults.py`). Do you think that'd make sense?
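To make the "wrapped g++ prepared at build time" idea a bit more concrete, a patched default could look roughly like the sketch below. The store path and helper name are hypothetical; this is not pytensor's actual `configdefaults.py` code.

```python
# Purely illustrative sketch of pinning the compiler instead of searching
# PATH (the store path and helper name below are hypothetical).
import shutil

PINNED_GXX = "/nix/store/<gcc-wrapper>/bin/g++"  # placeholder, substituted at build time


def default_cxx() -> str:
    """Prefer the pinned g++; fall back to PATH, else disable C code with ''."""
    if shutil.which(PINNED_GXX):
        return PINNED_GXX
    return shutil.which("g++") or ""
```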
RE: using `-print-search-dirs` to configure BLAS: https://github.com/pymc-devs/pytensor/blob/7ecb9f8c6b6a2eac940947bac955a10785240667/pytensor/link/c/cmodule.py#L2724-L2726

I think this works for us too, but what do you think about using e.g. `pkg-config`? That's a more public interface, and it's also more "override-friendly" (users and distributions can adjust the generated flags as appropriate to their environments). For instance, the whole `-L... -l...` line can be generated by running `pkg-config --libs blas`:
❯ nix-shell -p blas -p lapack -p pkg-config
❯ pkg-config --libs blas lapack
-L/nix/store/29jz2fshxr7lf0ny4hb3q3z0jqpqmfz7-blas-3/lib -L/nix/store/fcfbp2iphh271h19m3g0hi5q0x20l8vv-lapack-3/lib -lblas -llapack
And it prints out sensible errors:
❯ nix-shell -p pkg-config
❯ pkg-config --libs blas
Package blas was not found in the pkg-config search path.
Perhaps you should add the directory containing `blas.pc'
to the PKG_CONFIG_PATH environment variable
No package 'blas' found
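In Python terms, the proposal could look something like the sketch below (an illustration only, not existing pytensor code; the helper name is made up):

```python
# Sketch of the pkg-config proposal (not existing pytensor code): ask
# pkg-config for the BLAS/LAPACK link line and use it as blas__ldflags.
import shutil
import subprocess


def ldflags_from_pkg_config(packages=("blas", "lapack")) -> str:
    """Return the pkg-config link line, or '' when pkg-config/packages are missing."""
    if shutil.which("pkg-config") is None:
        return ""
    try:
        result = subprocess.run(
            ["pkg-config", "--libs", *packages],
            check=True,
            capture_output=True,
            text=True,
        )
    except subprocess.CalledProcessError:
        # e.g. "Package blas was not found in the pkg-config search path."
        return ""
    return result.stdout.strip()


print(ldflags_from_pkg_config())
# e.g. -L/nix/store/...-blas-3/lib -L/nix/store/...-lapack-3/lib -lblas -llapack
```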
> Once nix installs BLAS and Lapack, can we know where to look for them in the file-system at runtime?

Sure, you can even predict that location before the build happens :)
Thanks!
Hi @SomeoneSerge. Thanks so much for your detailed reply! I'm sorry that it took me so long to answer.
> Hi! Is my understanding correct that,
>
> * The "C Ops" offer a JIT compilation functionality and are part of pytensor's public interface,
> * One of the runtime dependencies for pytensor (or the Ops part) is a working toolchain for the host platform, also configured to know how to locate BLAS and Lapack?
Yes, the Ops are simply symbolic computations. pytensor does some rewrites or optimizations on the computational graph and then transpiles the operations into some backend. C is one of these "backends". The final executables are produced on the host platform by compiling the C extensions using some C compiler. At that time, the host also must have the libraries that are needed to successfully link the extensions (e.g. blas, mkl, lapack). If pytensor can't find these libraries, it won't attempt to use them, at the expense of potential performance degradation.
> In that case we could just pass pytensor one. Is the toolchain a required or an optional dependency?
>
> I see that `config.cxx` is determined at runtime based on `PATH`: https://github.com/pymc-devs/pytensor/blob/7ecb9f8c6b6a2eac940947bac955a10785240667/pytensor/configdefaults.py#L397-L454. We could prepare a wrapped g++ for pytensor already at build time, such that the compiler used by pytensor wouldn't leak into the users' `PATH`s (ugly baseline: we can patch `configdefaults.py`). Do you think that'd make sense?
I don't really know nix, so I'm a bit lost with how much of this work should go into the nix build and how much of it should go into pytensor refactors. Are you suggesting that we make some changes to the `configdefaults.py` script in order to be able to configure it at build time? If that's what you're suggesting, I think that's a very interesting solution, but I'll need to investigate how it could be implemented.
> RE: using `-print-search-dirs` to configure BLAS
>
> I think this works for us too, but what do you think about using e.g. `pkg-config`? That's a more public interface, and it's also more "override-friendly" (users and distributions can adjust the generated flags as appropriate to their environments). For instance, the whole `-L... -l...` line can be generated by running `pkg-config --libs blas`:
>
> ❯ nix-shell -p blas -p lapack -p pkg-config
> ❯ pkg-config --libs blas lapack
> -L/nix/store/29jz2fshxr7lf0ny4hb3q3z0jqpqmfz7-blas-3/lib -L/nix/store/fcfbp2iphh271h19m3g0hi5q0x20l8vv-lapack-3/lib -lblas -llapack
>
> And it prints out sensible errors:
>
> ❯ nix-shell -p pkg-config
> ❯ pkg-config --libs blas
> Package blas was not found in the pkg-config search path.
> Perhaps you should add the directory containing `blas.pc'
> to the PKG_CONFIG_PATH environment variable
> No package 'blas' found
This seems related to the build-time configuration that I was asking about above. I think that it would be awesome to have, but it looks like that PEP is still a draft. Again, I'm a bit lost with how much of this should happen at the pytensor level and how much should happen on nix.
Just to be sure that something actually needs to be changed in pytensor, I wanted to let you know about the mechanism that is already in place to configure pytensor. Config values like `cxx` and `blas__ldflags` can be read from a `.pytensorrc` file that by default is searched for in the home directory. One could potentially add such a file when installing the package, with build-time configuration values such as `cxx`, `blas__ldflags`, and any other platform-specific thing. I'm not sure if that is what you actually want to do when installing pytensor on nix, but it would be the simplest way to get it to work out of the box.
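For reference, a build-time generated config could look roughly like the sketch below; the store paths are placeholders, and it assumes `blas__ldflags` maps to a `[blas]` section with an `ldflags` option, as with the old `.theanorc` layout.

```python
# Sketch: write a ~/.pytensorrc at install time (placeholder paths; the
# section layout assumes blas__ldflags maps to [blas]/ldflags as in .theanorc).
from pathlib import Path
from textwrap import dedent

rc_contents = dedent(
    """\
    [global]
    cxx = /nix/store/<gcc-wrapper>/bin/g++

    [blas]
    ldflags = -L/nix/store/<blas>/lib -L/nix/store/<lapack>/lib -lblas -llapack
    """
)

Path.home().joinpath(".pytensorrc").write_text(rc_contents)
```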
Once again, thanks a lot for all of the information in your comment. Let me know what your thoughts are on the pkg-config stuff I mentioned above.
Describe the issue:

The following tests fail with `pytensor==2.18.0`:

Reproducible code example:

Error message:

PyTensor version information:

Version 2.18.0

Context for the issue:

I am working on updating pytensor from 2.17.3 to 2.18.0 on nixpkgs.