supranational / blst

Multilingual BLS12-381 signature library
Apache License 2.0
467 stars 177 forks source link

Integration to `conda-forge` - issue with mac build #231

Closed MementoRC closed 1 month ago

MementoRC commented 1 month ago

HI, I am finalizing the recipe to integrate blst to the conda framework via conda-forge. I am encountering an odd issue when testing the python binding on macos/x86_64:

testing...
Traceback (most recent call last):
  File "/Users/runner/Miniforge3/conda-bld/blst-split_1725117054481/work/bindings/python/./run.me", line 73, in <module>
    import blst
  File "/Users/runner/Miniforge3/conda-bld/blst-split_1725117054481/work/bindings/python/blst.py", line 12, in <module>
    import _blst
ImportError: dlopen(/Users/runner/Miniforge3/conda-bld/blst-split_1725117054481/work/bindings/python/_blst.cpython-39-darwin.so, 0x0002): symbol not found in flat namespace (_BLS12_381_G1)

Incidentally, I read a bit of the discussion on the better build system. The reason for needing blst within conda is that it is used by another project ckzg, which vendor blst as a statically linked library. It is typically preferred to separate the packages within conda, thus my the "public-purposing" of blst. We actually only need the dynamic lib, but for sake of completeness, I endeavored to provide your complete project

If you wish to keep an eye on how it is provided, let me know so I can add you as a maintainer

dot-asm commented 1 month ago

I fail to reconcile the parts of the report. The failure suggests that you're executing python/run.me, but it doesn't generate decorated _blst.-<os/arch>.so, just plain _blst.so. The way Python loads shared modules is to list *.so and choose decorated ones over non-decorated ones. Which raises the question, where does decorated name come from? Is it possible that it's an artefact of some failed experimentation on your part? In which case just removing it should mitigate the failure. But if you want run.me generate decorated .so, then test

--- a/bindings/python/run.me
+++ b/bindings/python/run.me
@@ -33,13 +33,15 @@ if newer("../blst.swg", "../blst.h", "../blst.hpp", "blst_wrap.cpp"):
                                    "-o", "blst_wrap.cpp", "-outdir", ".",
                                    "../blst.swg"])

-if newer("blst_wrap.cpp", "../libblst.a", "_blst.so"):
+blst_so = "_blst"+sysconfig.get_config_var('EXT_SUFFIX')
+
+if newer("blst_wrap.cpp", "../libblst.a", blst_so):
     print("compiling _blst.so...") or sys.stdout.flush()
     if sysconfig.get_config_var('MACHDEP') == 'darwin':
         cmd = ["c++", "-bundle", "-undefined", "dynamic_lookup"]
     else:
         cmd = ["c++", "-shared", "-Wl,-Bsymbolic"]
-    cmd.extend(["-o", "_blst.so", "-fPIC", "-fvisibility=hidden",
+    cmd.extend(["-o", blst_so, "-fPIC", "-fvisibility=hidden",
                 "-I"+sysconfig.get_config_var('INCLUDEPY'), "-I..",
                 "-O", "-Wall", "blst_wrap.cpp", "../libblst.a"])
     try:

Secondly, I fail to draw the connection between what ckzg does and the issue. By this I mean following. _blst.so generated by python/run.me is not a blst library, it's Python interface. Most notably, if you consider symbols exported by it, you'll spot just one, PyInit__blst. In other words it's self-contained and private enough to not cause any problems with anything. What is the concern more specifically?

MementoRC commented 1 month ago

Thank you for the reply. I should have closed this issue, there was a typo in the script for the extension of the dynamic library, which incidentally was not loaded and thus missed the symbol. It took me quite a while to find how silly the mistake was. The recipe does use the patch you suggested above.

Correct, there is no issue with blst c-lib used by ckzg, but given that we created the recipe for it, we felt that it would be useful to also provide a recipe for the blst python binding that is part of this repo.

The recipes are now all building correctly, but are pending review by the conda-core team.

Are you willing to be a co-maintainer of the recipe? If so, I will add you and ping you for comment on each PRs

MementoRC commented 1 month ago

FYI: https://github.com/conda-forge/staged-recipes/pull/27476, https://github.com/conda-forge/staged-recipes/pull/27317

Some notes:

dot-asm commented 1 month ago
  • windows build uses gcc, but there were some issues integrating the MASM (we added NASM to refresh.sh, but are not using them yet)

If you use gcc (or clang) on Windows, you don't need MASM, just call gcc (or clang) to compile build/assembly.S.

MementoRC commented 1 month ago

Right, that's the current recipe. If there's not improvement with using NASM, then we'll remove it. Thank you

dot-asm commented 1 month ago

FYI: conda-forge/staged-recipes#27476, conda-forge/staged-recipes#27317

Why bother with shared blst? Are you concerned with duplicating ~150KB? In these times? Just build Python interfaces with embedded blst, it's way more manageable and sustainable.

MementoRC commented 1 month ago

conda guidelines is to provide shared libraries. A recipe will build, publish and maintain historical versions whenever there is a repo tagged release, so it can quickly add-up (that is my guess for why)

dot-asm commented 1 month ago

conda guidelines is to provide shared libraries.

Well, if the goal is to maximize the maintenance costs, then all right :-) But on a more serious note, if you have a point of binary distribution independent of the OS vendor, it is more cumbersome for all involved parties to maximize the amount of shared libraries. This is not to say that there is no place for extra non-Python shared libraries, but there ought to be a line. And assertion is that everybody would be better off embedding blst into a Python shared library. It's sufficiently small and tightly tied to the corresponding bindinds anyway. In other words you have to ask the question how likely is it that you'll find yourself recompiling blst but not its Python bindings. Or vice versa. And the answer is highly unlikely.

MementoRC commented 1 month ago

Disclaimer: You know more than me on the subject. You could discuss this with Uwe (reviewer on conda-forge/staged-recipes#27476) for a more fleshed out reasoning

In a sense, I am following the guideline and in this particular case, I had to provide blst c-lib to 2 different packages blst-py and ckzg. The latter being the primary reason, as well as the fact that I actually need ckzg as opposed to blst-py.. I saw some of your discussions related to blst not being a public use library, but ckzg uses it as such and maybe more will ... do not sell blst short (joking) ckzg: https://github.com/conda-forge/staged-recipes/pull/27490

My first conda package was coincurve, which used secp256k1 statically linked and it was quite an arduous process to meet the guidelines after my few trials were rejected, so, I created a separate libsecp256k1 and worked with coincurve owner to refactor his package

I think maybe the disconnect is that, conda packages typically refrain from vendoring other packages, and although your offering is a blst python, it vendors your blst c-lib, which in a conda sense are 2 different packages, for example one is dependent on python ABI, while the other is not, so blst python requires at least 15 packages (3 os x 5 python versions, more for pypy and x-compile), while blst c-lib only 3 (linux, win, osx) - conda boast 27K packages, just here we're talking 15x150K vs 3x 150K - cross-compile adds another 4 OSes, but that goes for both

The goal IS to minimize maintenance cost, which is why it would be great if you would consider the needs of the recipes and provide some flexibility upstream, rather than rely upon the many patches that I added

MementoRC commented 1 month ago

Also note that I am not a conda expert, so all my reference to conda are in fact, my understanding of it, which is likely poor

dot-asm commented 1 month ago

For reference, d82c533 makes it possible to build fat binaries on MacOS and cf75400 actually builds fat shared object (in addition to working with MinGW).

MementoRC commented 1 month ago

Cool. Currently, the conda-forge reviews have slowed down, so depending on when you release/tag this update, I may update the recipe prior to having it approved and go live. Thks

dot-asm commented 1 month ago

First of all, the following remark shouldn't be viewed as a change of my position on shared builds, I still argue that embedding blst into a specific application or another shared module such as Python plugin is more appropriate. But I can't control other people.

it would be great if you would consider the needs of the recipes and provide some flexibility upstream, rather than rely upon the many patches that I added

Looking at conda-forge/staged-recipes#27476. I see no reason for patching blst.sh [or blst.def]. For starters, since you are copying the resulting files to target location you can choose that opportunity to choose the name of the shared library to your liking. As opposed to trying to modify build.sh to generate specific names that is. As for Linux. It's as important to pass -soname to linker, this can be achieved by passing -Wl,-soname=myfavoritename.so.N as an additional command line argument to build.sh. As for Windows, I'd suggest to generate the import library with dlltool -D myfavorite.dll -d build/win64/blst.def -l import.lib [as a] post build [step].

As for conda-forge/staged-recipes#27317. Well, run.me wasn't actually meant to be a build tool, just a proof-of-concept for somebody to figure out and contribute a more Python-friendly procedure... This is why it makes overly specific assumptions such as compiler availability... But if a build script is actually the best one can do, then it would be more sensible to split it into build.py and run.me (as in say Java)...

MementoRC commented 1 month ago

Thank you for the feedback on the recipes. blst-py is not actually needed, if it is a proof-of-concept, it may indeed bee wiser to not attempt a botched packaging

dot-asm commented 1 month ago

run.me script is a proof-of-concept, the bindings themselves are not. I thought that conda is a Python package manager...

MementoRC commented 1 month ago

My understanding of conda is that it was focused on python at inception, but it is now a more inclusive virtual environment manager. It includes packages written in many different languages, interpreted or compiled