crystal-lang / crystal

The Crystal Programming Language
https://crystal-lang.org
Apache License 2.0

Shipping libevent.a and libpcre.a with the linux binary packages #9285

Closed: lh3 closed this issue 3 years ago

lh3 commented 4 years ago

It seems that the crystal compiler requires libevent and pcre. If either library is absent, the linker will throw an error like

/usr/bin/ld: cannot find -levent (this usually means you need to install the development package for libevent)

Unfortunately, libevent and pcre are often missing from a system and installing/compiling them without root/sudo is non-trivial for inexperienced users.

A simple solution to this problem is to compile the two libraries as static libevent.a and libpcre.a on an older Linux system and put them in crystal-0.34.0-1/lib/crystal/lib along with libgc.a. This way, we can compile without the system libevent and pcre. Furthermore, the resulting binary is not dynamically linked to libevent.so or libpcre.so, so it is more portable.

PCRE and libevent don't have additional dependencies. They can be easily compiled as static libraries with ./configure --disable-shared. On my system, the compiled libraries can be found in .libs/lib{event,pcre}.a. I have tried this approach. It is working well for me.
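The steps described above could be sketched roughly as follows. This is only a minimal sketch, not the project's actual build process: the helper names `run` and `stage_static_lib` and the `DRY_RUN` switch are made up for illustration, the source directory is assumed to be an already unpacked libevent or pcre source tree, and `.libs/` is simply where the archives ended up on the system described above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Crystal or its distribution scripts).

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Build one library statically and copy its .a files next to libgc.a.
# Usage: stage_static_lib <source-dir> <crystal-lib-dir>
stage_static_lib() {
    src_dir=$1
    dest_dir=$2
    (
        cd "$src_dir" || exit 1
        run ./configure --disable-shared || exit 1
        run make || exit 1
    ) || return 1
    # Assumption: the static archives land in .libs/, as reported above;
    # other versions or build systems may place them elsewhere.
    copied=0
    for archive in "$src_dir"/.libs/*.a; do
        [ -e "$archive" ] || continue
        run cp "$archive" "$dest_dir"/ || return 1
        copied=$((copied + 1))
    done
    [ "$copied" -gt 0 ] || { echo "no .a files found in $src_dir/.libs" >&2; return 1; }
}

# Example (directory names are placeholders):
#   stage_static_lib ./libevent-src crystal-0.34.0-1/lib/crystal/lib
#   stage_static_lib ./pcre-src     crystal-0.34.0-1/lib/crystal/lib
```

Setting `DRY_RUN=1` first prints the commands without running them, which may be useful for checking paths before touching the unpacked Crystal directory.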

refi64 commented 4 years ago

This would potentially be quite a mess once you try to move it across distros...

(In reply by email to Heng Li's original post of Wed, May 13, 2020, 12:10 AM.)

asterite commented 4 years ago

@lh3 how are you installing Crystal?

jhass commented 4 years ago

This seems like an issue for https://github.com/crystal-lang/distribution-scripts

I hope with 1.0 more distros will accept us into their repositories and this will become less of an issue.

I'm happy to be on a distro where there's an official package already, so I don't have to rely on the big workaround that the above linked scripts essentially are :D

RX14 commented 4 years ago

Installing many things without root access is tricky. I don't think crystal is special in this regard - or even has particularly unusual dependencies.

I'd rather use system libraries where at all appropriate, since it vastly reduces incompatibilities on systems with versioned glibc symbols.

lh3 commented 4 years ago

@asterite I am using a distributed binary. More precisely, the file crystal-0.34.0-1-linux-x86_64.tar.gz from the release page.

@jhass I am not familiar with distribution-scripts. My end goal is to be able to download a self-contained binary package such that my users and I don't have to install additional libraries to use crystal. >90% of my users don't have root privilege on the machines where they run tools, and they don't know how to install libevent or pcre by themselves.

@RX14 and @refi64 I understand glibc is tricky. A common practice is to compile binaries on CentOS6, which is one of the oldest but still maintained distros. Then the compiled binary will work on recent systems. I haven't seen exceptions so far, though maybe I haven't used enough distros. Building on CentOS6 is essentially how anaconda ships portable binaries.

You can find precompiled static libraries for libevent and pcre on my FTP site: ftp://ftp.dfci.harvard.edu/pub/hli/crystal-libs/. You can download the two files, copy them to crystal-0.34.0-1/lib/crystal/lib and see if they work on your system. If there are no compile errors, you can inspect the resulting binary with ldd (on Linux). You should see that it is no longer dynamically linked to libevent and libpcre.
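The ldd check mentioned above could look something like this. It is a rough sketch under assumptions: the function names are hypothetical, and the pattern simply looks for shared-object names starting with libevent or libpcre in ldd's output (so it would also flag libpcre2).

```shell
#!/bin/sh
# Hypothetical helpers for checking a compiled binary with ldd.

# Reads ldd-style output on stdin and prints any lines that mention a
# shared libevent or libpcre; prints nothing if there are none.
dynamic_event_pcre() {
    grep -E 'lib(event|pcre)[^ ]*\.so' || true
}

# Usage: check_binary ./hello
check_binary() {
    hits=$(ldd "$1" 2>/dev/null | dynamic_event_pcre)
    if [ -n "$hits" ]; then
        echo "still dynamically linked:"
        echo "$hits"
        return 1
    fi
    echo "no dynamic libevent/libpcre dependency found"
}
```

Running `check_binary` on a binary compiled before and after copying the two .a files should show the difference, assuming the linker picks up the static archives.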

asterite commented 4 years ago

@lh3 the releases page is just there for package managers and others to be able to build distributions. To install crystal you should use your OS distribution system. For example in mac if you use homebrew everything works out of the box.

lh3 commented 4 years ago

@asterite On mac, everything is simple. The problem is linux. Most of my users don't use linuxbrew. I can't force them to install linuxbrew just for crystal.

Blacksmoke16 commented 4 years ago

He means based on their actual OS. See https://crystal-lang.org/install/, which should cover the majority.

lh3 commented 4 years ago

@Blacksmoke16 Could you elaborate? I said my users don't have root privilege (PS: I don't have sudo permission, either) – this is common among computing environments in academia. They are unable to install packages by themselves.

Blacksmoke16 commented 4 years ago

To install crystal you should use your OS distribution system. For example in mac if you use homebrew everything works out of the box.

More so meant this part. I.e. if on Mac you would use homebrew. Arch or Manjaro would use pacman, etc.

this is common among computing environments in academia

I imagine they would need an admin to install it then? or maybe build from source (although would prob run into the same problem when trying to install llvm for example :/).

lh3 commented 4 years ago

@Blacksmoke16 In an academic setting, admins are often unresponsive. That is why conda is becoming popular. I have already given a solution above, which conda uses to some extent. I have experience dealing with this issue. I am reasonably certain it will work. You can try that.

although would prob run into the same problem when trying to install llvm for example

I can install crystal using the binary package from the release page. If someone can add the two *.a files to the binary package, my problem is solved, and I think it will work for others.

waj commented 4 years ago

We used to release crystal binaries using omnibus. Those packages contained libevent, libpcre and other libraries that are linked by some part of the standard library. @lh3 maybe you need just those two because they are required by default, but libyaml must be added if the YAML API is used, for example.

Maybe someone can inform the conversation by providing references to where the decision to change was made, so the reasons don't have to be laid out here all over again. I agree with that change anyway, but there is probably still room for omnibus-generated packages for some users. The package didn't contain the actual linker and system libraries, so a build environment is still required.

lh3 commented 4 years ago

@waj Thank you for the information. I appreciate it. I will add one minor note. The binary package on the release page ships libgc.a because it is essential to crystal. libevent.a and libpcre.a are actually also essential in that we can't compile a "hello world" program without them.

jhass commented 4 years ago

Maybe educating your users to use a non-root package manager/distribution form such as linuxbrew, nix, stow, toast, junest, 0install, pkgbrew, zpkg, AppImage, Flatpak etc. could prove useful outside this single instance, solving this problem more generically?

Rather than reintroducing Omnibus (it was quite the pain to maintain I think), I would look into making Flatpak and/or AppImage official distribution methods.

waj commented 4 years ago

Julia? 🤔🙂

libgc.a is shipped because we added some patches for MT that are not yet released. Also, some distributions ship older versions we couldn't use with Crystal.

lh3 commented 4 years ago

@jhass I understand your concerns. If it is indeed tricky to add these two files, I will probably provide my users with my own binary crystal package with the files. Another less-ideal (to me) solution is to provide a crystal package via conda. It is much more popular than linuxbrew and other package managers in my field.

@waj Oh, sorry. It is a typo. I am playing around with several newer languages to see what to recommend to non-CS researchers (I can't tolerate the performance of python). So far crystal is at the top of my list. I think it has a better and cleaner design than Nim and Julia. Libevent.a/libpcre.a is the last bit that has been stopping me.

jhass commented 4 years ago

Ah, I didn't realize Anaconda is essentially a package manager; its front page tells me nothing about what it really does :D

To me that sounds like a good solution, we should just have Crystal packages for everything :) But of course that's impossible to do by the core team and we need the help of respective communities to maintain such packages. I see our responsibility as making that possible and easy. Maintaining a Crystal package in the AUR was essentially one of my first contributions to Crystal and it led to Arch Linux being one of the first (if not the first) distributions shipping Crystal in its repositories :)

lh3 commented 4 years ago

Ok. I will explore how to add a crystal package to conda. I need to seek help from the conda community as I am not that familiar with the conda build system, either. This will take time.

Meanwhile, I still hope you will consider adding the necessary static libraries to the x64-linux tar.gz. Doing this will benefit more users. I really appreciate that you build a statically linked crystal binary, which makes things much easier. Shipping a few more static libraries would be perfect. For the time being, I will provide my own x64-linux crystal package by adding libevent.a and libpcre.a.

bcardiff commented 4 years ago

@RX14 do you recall the rationale behind dropping those libraries in distribution-scripts, compared with what was done in omnibus originally?

RX14 commented 4 years ago

Because gc isn't packaged - or new enough - in every distro. We got actual errors from users when they were using the distro libgc package. The same was not true with pcre or libevent - people installed the distro package and it just worked. Now, the reason is that we ship a patched libgc for MT.

I agree the solution is conda or similar. If people don't have root, they need a user-mode package manager. conda, nix and linuxbrew fill that role.

lh3 commented 4 years ago

@RX14 Thanks for the explanation. I respect your decision. Just curious: did libevent.a and libpcre.a cause any trouble when they were shipped with the binary tar.gz?

straight-shoota commented 4 years ago

I don't think the reason they were dropped is that they caused trouble, but they're simply non-essential. It's better not to drag that baggage in the core project when it's not necessary.

asterite commented 4 years ago

But they are not baggage. They are essential to every crystal program. That's the reason why we distributed them statically. I didn't know that changed.

Sure, libyaml is required if you use yaml, same goes for libgmp for BigInt and such, and maybe we should include those.

RX14 commented 4 years ago

They are baggage because they're more things we need to maintain.

There's a fairly rare usecase (not having root access) where this is a problem, so I wouldn't rush to change this.

straight-shoota commented 4 years ago

@asterite I don't mean the libraries themselves are baggage, but maintaining their distribution with crystal requires a lot of additional work. Of course, if we can get a full batteries-included distribution package without much effort, I'm all in.

Even not having root access is not really a problem. You can get the libraries without root. There are specific non-root package managers. Or you can just download it.

asterite commented 4 years ago

But aren't these libraries already available for download? We would just need one step in the build process which is to download them and put them in the tar.gz

lh3 commented 4 years ago

There's a fairly rare usecase (not having root access)

Probably >90% users in the field of bioinformatics don't have root access. I am reasonably certain that quite a few other fields in academia have the same problem.

maintaining their distribution with crystal requires a lot of additional work.

You can compile the two libraries once and copy them to future binary packages. You may only upgrade them occasionally.

Or you can just download it.

The purpose of providing statically linked crystal binary is to make it run on every linux. Not shipping essential libraries with crystal has defeated the goal.

RX14 commented 4 years ago

You may only upgrade them occasionally.

And we have to track them for security issues and make sure we keep them fairly well up to date. libpcre especially is a CVE quagmire as it JITs code.

If someone else in the core team wants to maintain this then it's on them, but I think the risk is fairly high and the benefits are low. So I'll duck out of this conversation.

lh3 commented 4 years ago

And we have to track them for security issues and make sure we keep them fairly well up to date.

The crystal compiler is mostly written in crystal. When you compile the static crystal binary, you need to keep track of the security issues in crystal dependencies anyway.

I am trying to promote crystal in my field. I just think the suggested change would make it much easier for me to recommend crystal. Anyway, thank you for your time. I appreciate it.

lh3 commented 4 years ago

Due to the way crystal is distributed, I don't think I have a chance to persuade devs in my field to use crystal, so I am providing portable binaries through my repository for now.

I will talk to people who are familiar with conda. It will take time. Also, distribution through conda is not ideal, either, because all compiled executables will be dynamically linked to libevent and libpcre. This will complicate binary distributions of developers' tools. I can't tell users: "you have to install crystal, libevent and pcre to use my tool". This will push 90% of my users to other tools.

Again, for users, the best solution is to include static libraries in the tar.gz. You need to keep these libraries updated anyway to compile the statically linked crystal binary. I don't see why this adds much burden. Anyway, I am closing the issue as you don't seem interested.

j8r commented 4 years ago

@lh3 The problem is the same with any package on your system: how did conda get there in the first place? Usually, on systems where users are non-root, users ask the admins to install X on the system.

A good option is using non-root containers, like podman. Users will be free to do whatever they want without compromising the system.

Some libraries will always risk being missing: yaml, gmp, and more if users want to link to external C libraries.

lh3 commented 4 years ago

The problem is the same with any package on your system: how did conda get there in the first place? Usually, on systems where users are non-root, users ask the admins to install X on the system.

With conda, you download a miniconda installation script (web page), launch it as a normal user and then install other packages as a normal user. Conda is popular because end users don't need root permission at any step. I will not go into the mechanism of conda here as it is not quite relevant. You have already done a great favor by providing a static binary.

A good option is using non-root containers, like podman.

Unfortunately, my users are biologists. They are often not tech-savvy and won't want to learn a new container.

Some libraries will always risk being missing: yaml, gmp, and more if users want to link to external C libraries.

Libyaml, libgmp and others are not essential to crystal. Some of them are probably not needed to compile the crystal compiler. I personally don't mind if you leave them out.

By the way, I said I wanted to promote crystal. Here is a blog post in that direction. The post is not focusing on crystal, but it basically says crystal is faster and more pleasant to use than Julia and Nim.

bcardiff commented 4 years ago

I think a self-contained tar.gz (or tar.gz variation) is a very valid use case. Although pros and cons have already been laid out in this thread, I don't think they are definitive reasons to discard the idea.

I personally want Crystal to be suitable for the academic and research environments, so I like this kind of feedback @lh3. Thanks!

straight-shoota commented 4 years ago

Could we just copy the system-provided static libraries to the tar.gz in distribution-scripts?

lh3 commented 4 years ago

Could we just copy the system-provided static libraries to the tar.gz in distribution-scripts?

That's what I believe, though I don't know for sure. I am only sure that static libraries compiled on CentOS6 will work on essentially all actively maintained Linux distros. Here are libevent.a and libpcre.a (with UTF8 support) I am using: ftp://ftp.dfci.harvard.edu/pub/hli/crystal-libs/.

samuell commented 4 years ago

I personally want Crystal to be suitable for the academic and research environments

Happy to hear this @bcardiff and would be awesome if it could be resolved.

Guys, if you get @lh3 to push for Crystal in his (and to some extent my) field, that is a big deal indeed.

I'm not close to the proficiency of Heng (who built some of the most well known tools in bioinformatics), but have some experience in Go concurrency, which I used to build a workflow manager aimed at bioinformatics users (https://scipipe.org), where Go's concurrency features made it possible to make the code 3 times smaller than comparable python tools, and even completely free of dependencies.

I see enormous potential for Crystal in this area too as it supports the same concurrency features and would solve the two main problems with my Go tool: Too verbose and complicated syntax (and requiring lots of imports) for most bio-people, and the lack of generics for channel operations.

Having a portable binary distribution of crystal is crucial to making a pipeline library relevant as well. Probably a majority of bioinfo pipelines would run on HPC centers with, as mentioned, no root access. Home-folder installation needs to be a no-brainer.

lh3 commented 3 years ago

Congrats on the v1.0 release! I know you were busy with the release. When you get time, it would be good to revisit this issue. Thank you.

winni2k commented 3 years ago

I just want to wade in here on @lh3's side with some on-the-ground anecdotes. Heng's blog comparing several languages for high-performance and efficient computational biology has made a splash in our field and been mentioned in a feature in one of our leading publications.

I am a researcher in computational biology, but I am also the de facto maintainer of our lab server. This is a common situation in many biology labs. I expect all the users in my lab to use conda because it allows easy reproducible research on our laptops (mac or windows, without sudo access), the lab server, and our local compute cluster (various flavors of Linux). Furthermore, using conda minimizes the effort I need to spend on installing and updating packages for my users and on troubleshooting why rarely run scripts break, sometimes long after a system library update.

Due to a reproducibility crisis in science, in my field a lot of effort is being spent on educating researchers on reproducible research, which includes using conda to keep track of dependencies. If you can't install a tool via conda or simple download, then uptake will be greatly reduced.

RX14 commented 3 years ago

Since the usecases for this have been mostly from the hpc/data science side, would this issue better be solved in conda?

My personal opinion is that this shouldn't be the default, but there's absolutely a usecase which needs to be solved here.

winni2k commented 3 years ago

Good point. I've actually done a bit of conda packaging over the years. If no one else has looked into it deeply yet, I could give it a shot and report back.

straight-shoota commented 3 years ago

That would be terrific @winni2k. As someone working in the field of this use case, you probably have a better angle than most other contributors. Let me know if you need any assistance, knowledge or act. The community chat rooms are also a good resource to get help on really anything.

lh3 commented 3 years ago

The conda solution is not ideal for two reasons. First, not everyone is using conda. Second, binaries compiled with dynamic libevent and libpcre require end users to have the two libraries installed. This hurts binary distribution.

The best solution is still on the crystal end: put libevent.a and libpcre.a in the binary distribution package. You already have the two libraries compiled, as you need them to link the crystal executable statically. You only need to copy the two files to crystal-1.0.0-1/lib/crystal/lib.

winni2k commented 3 years ago

For what it's worth, I tried building a conda package from both the 1.0.0 and 0.32.0 tar files on a CentOS/7 VM that does not have pcre installed. In both cases, I got segmentation faults when the test suite of conda build . tried to compile a hello world example file. Here is a gist of the configuration files that I used. I'm not sure how to troubleshoot further, but perhaps someone else reading this issue might find it helpful.

Update: however, there is a hacky workaround that I can use to install the crystal compiler into an arbitrary conda environment by manually moving the libpcre.a and libevent.a libraries into the right place as @lh3 suggested, just within the conda environment. I have created a bash script to automate setting up the environment. I think this means that building a conda env for crystal should be possible in principle.

lh3 commented 3 years ago

I just noticed that crystal provides pcre and libevent as static libraries on mac. If you can do the same for the Linux binary package, it would be great. You already have the mechanism.

straight-shoota commented 3 years ago

The distribution workflows for macOS and linux are completely different, so there's not really something to re-use. And the macOS ecosystem is simpler.

Basically we already have the static libraries in the linux build process because they're necessary for linking the compiler. But we're already shipping libgc.a and libcrystal.a and they're built separately on Debian, while the compiler itself is built and linked on Alpine. So this relates to the question whether the generic linux distribution package should or could be truly portable across different linux systems.

lh3 commented 3 years ago

Thanks for the explanation. This makes sense.

So this relates to the question whether the generic linux distribution package should or could be truly portable across different linux systems.

I am not an expert, but I believe the answer is yes. libevent.a and libpcre.a are as essential as libgc.a to crystal. Lacking the two libraries makes the linux binary package non-functional.

RX14 commented 3 years ago

To provide insight: The historical reason the packages have included libgc.a and not libevent and pcre, is that the latter two are a lot more common on linux systems - especially pcre, and are always available on package managers. For OSX this hasn't been as easy without homebrew.

One reason I have been reluctant to support packaging pcre and libevent is that they have a lot of CVEs (pcre 48, libevent 5 to bdwgc's 1). Dynamically linking to distribution versions of pcre especially would make a lot of sense because keeping PCRE up to date after CVEs were published would be as simple as keeping up to date with security updates on your distro, instead of downloading the latest crystal version, and rebuilding your binaries.

I understand that in data science, untrusted inputs aren't as much of a problem, and I do want to work to find a solution, but we have to be careful not to degrade security of those using the linux tarball to compile other types of applications.

straight-shoota commented 3 years ago

IIRC the only reason we distribute libgc.a is because there hasn't been a release of bdwgc in two years and the patch we need for multithreading support is not available in system packages.

straight-shoota commented 3 years ago

A good solution might be to provide two different packages. A minimal one with the bare minimum of libraries (libgc and libcrystal as long as we need them), and a batteries-included package with all necessary libraries, at least for the core lib. I would expect that the big package could easily build on the small one, just adding in a few more lib files.

lh3 commented 3 years ago

the latter two are a lot more common on linux systems

On CentOS, libevent and pcre are essential packages and widely available. However, compiling crystal programs requires libevent-devel and pcre-devel, the development packages. Looking at the rpm content of pcre-devel, I see it just adds a symbolic link libpcre.so to the right version, but we need the development packages anyway.

A good solution might be to provide two different packages.

This sounds good to me.