NixOS / nixpkgs

Nix Packages collection & NixOS
MIT License
17.21k stars 13.49k forks source link

Make a libpq package #61580

Open nh2 opened 5 years ago

nh2 commented 5 years ago

Right now nixpkgs only has postgresql, which contains much more than libpq.

By offering a libpq package like other distros, the closure size of many packages could be reduced significantly.

Even more so because postgresql currently depends on systemd. And that also means that no application or library using libpq can be built against musl, as systemd relies on glibc-only features and thus does not build with musl.

Tasks:

nh2 commented 5 years ago

CC @matthewbauer @peti for Haskell.

CC @FRidh for Python, would Python benefit from this?

matthewbauer commented 5 years ago

I thought the postgresql.lib output would do this for us?

7c6f434c commented 5 years ago

@matthewbauer yes for runtime closure size, no for systemd

thoughtpolice commented 5 years ago

FWIW, regarding closure size, I considered splitting this up before when I did significant Postgres work last year but ultimately went against it because multi-output .lib is supported for all Postgres expressions -- which does reduce the size a bit and worked well enough for dependent expressions.

That said something containing "only" libpq.so (for example, via an override to create a new expression/derivation) would be much smaller, certainly, but it's not clear that's the best use of work or time -- right now .lib for postgresql.lib (9.6) is barely 6mb and that's mostly due to it containing most of the default server extensions that are shipped with the build system (libpq.so is barely 200KB). Moving these out would probably reduce this quite a lot -- for example, they could just be part of .bin instead.

As for dependencies, well, they're pretty slim too:

a@link> nix path-info -rS ./result-lib
/nix/store/57i8nqp2vljg5h61gvk4p1ar9vzhxayj-libossp-uuid-1.6.2             29565976
/nix/store/5p8qcz6i13ymgf80kimhx2g9pb5r5nia-bash-4.4-p23                   29463504
/nix/store/lhngda6cn951z68v5rlh4dni6bb5sjw1-zlib-1.2.11                    28279616
/nix/store/s5sgkvhy2wdgn7fmsvq47srxjhfp9ld9-libxml2-2.9.9                  29985080
/nix/store/w7d4sfgxjpalphg3h9bink47yzhf4f96-postgresql-9.6.12-lib          41456208
/nix/store/xqs95fqkjb1kd102yjv5h5q57gcsafb3-glibc-2.27                     28147008
/nix/store/yglqxz2as4ygai26d2hbfxqpf0shdh4l-openssl-1.0.2r                 31880056

But why do we need them?

a@link> nix path-info -r ./result-lib | grep -v postgresql | while read l; do nix why-depends ./result-lib "$l"; done

/nix/store/w7d4sfgxjpalphg3h9bink47yzhf4f96-postgresql-9.6.12-lib
╚═══lib/uuid-ossp.so: …LIBC_2.2.5.GLIBC_2.4./nix/store/57i8nqp2vljg5h61gvk4p1ar9vzhxayj-libossp-uuid-1.6.2/lib:/nix/sto…
    => /nix/store/57i8nqp2vljg5h61gvk4p1ar9vzhxayj-libossp-uuid-1.6.2
/nix/store/w7d4sfgxjpalphg3h9bink47yzhf4f96-postgresql-9.6.12-lib
╚═══lib/uuid-ossp.so: …LIBC_2.2.5.GLIBC_2.4./nix/store/57i8nqp2vljg5h61gvk4p1ar9vzhxayj-libossp-uuid-1.6.2/lib:/nix/sto…
    => /nix/store/57i8nqp2vljg5h61gvk4p1ar9vzhxayj-libossp-uuid-1.6.2
    ╚═══bin/uuid-config: …#!/nix/store/5p8qcz6i13ymgf80kimhx2g9pb5r5nia-bash-4.4-p23/bin/sh.##.##  OSSP…
        => /nix/store/5p8qcz6i13ymgf80kimhx2g9pb5r5nia-bash-4.4-p23
/nix/store/w7d4sfgxjpalphg3h9bink47yzhf4f96-postgresql-9.6.12-lib
╚═══lib/pgcrypto.so: …LIBC_2.4.GLIBC_2.2.5./nix/store/lhngda6cn951z68v5rlh4dni6bb5sjw1-zlib-1.2.11/lib:/nix/store/yglq…
    => /nix/store/lhngda6cn951z68v5rlh4dni6bb5sjw1-zlib-1.2.11
/nix/store/w7d4sfgxjpalphg3h9bink47yzhf4f96-postgresql-9.6.12-lib
╚═══lib/pgxml.so: …2.2.5.LIBXML2_2.4.30./nix/store/s5sgkvhy2wdgn7fmsvq47srxjhfp9ld9-libxml2-2.9.9/lib:/nix/store/xq…
    => /nix/store/s5sgkvhy2wdgn7fmsvq47srxjhfp9ld9-libxml2-2.9.9
/nix/store/w7d4sfgxjpalphg3h9bink47yzhf4f96-postgresql-9.6.12-lib
╚═══lib/_int.so: …LIBC_2.4.GLIBC_2.2.5./nix/store/xqs95fqkjb1kd102yjv5h5q57gcsafb3-glibc-2.27/lib.XXXXXXXXXXXXXXXX…
    => /nix/store/xqs95fqkjb1kd102yjv5h5q57gcsafb3-glibc-2.27
/nix/store/w7d4sfgxjpalphg3h9bink47yzhf4f96-postgresql-9.6.12-lib
╚═══lib/libpq.so.5.9: …LIBC_2.3.4.GLIBC_2.3./nix/store/yglqxz2as4ygai26d2hbfxqpf0shdh4l-openssl-1.0.2r/lib:/nix/store/x…
    => /nix/store/yglqxz2as4ygai26d2hbfxqpf0shdh4l-openssl-1.0.2r

bash, libuuid, zlib, and libxml are needed for auxiliary extensions. So we could actually achieve significant reductions here by just moving the server extensions out of the .lib output, which would yield nothing except glibc and openssl, I believe -- but these dependencies should be considered "practically free" since you always need them for anything at all. As a result you're looking at a realistic closure size of < 1mb with that change.

Therefore, I think a much better use of time would be to just slim down postgresql.lib and call it a day. And if that doesn't fix the itch, we can always alias libpq = postgresql.lib somewhere in the top-level and say "we have libpq"

7c6f434c commented 5 years ago

@nh2 Does overriding systemd argument to null resolve the first point?

nh2 commented 5 years ago

@nh2 Does overriding systemd argument to null resolve the first point?

I don't know, but I'd rather make this an explicit flag as done in #61581 than using null.

nh2 commented 5 years ago

@thoughtpolice Thanks for your analysis, that is good to know.

I think a much better use of time would be to just slim down postgresql.lib and call it a day.

That is not enough for musl support though because with the current setup where we build all of postgresql with --with-systemd, and then throw away everything that's not lib, it it fails to compile already before we can throw away anything.

So in my proposal, a libpq package would be built with enableSystemd = false.

It seems a good idea to me to give that "postgres without systemd build dependency" thing a name, and I found libpq the easiest solution because it solves both problems (buldability and client apps not depending on any postgresql server bits) in one go.

thoughtpolice commented 5 years ago

I don't follow the logic here. You want a systemd-less version of postgresql so you can use Musl for your stdenv and built packages. libpq actually does not need systemd, only the .bin outputs do, but they do not work with Musl. So if this was turned off or libpq was recompiled separately, you could avoid it. Therefore, libpq would override enableSystemd = false at all times and everything would link against it, but:

In other words: why should upstream glibc users, Hydra, Borg, etc always compile a second postgresql derivation with this feature disabled, and a substantial amount of changes to all postgres-dependent extensions? When the musl use case requires you recompiling it anyway.

Put another way -- what use cases does this particular scheme make easier/possible that aren't currently possible, and why is this the way it must be done?

Keep in mind that third parties outside use postgresql.lib as a buildInput as well, so this effectively creates a schism between upstream and any third party uses (now you have to override them awkwardly if you want upstream/a third party to behave the same, etc). Unless you just completely remove postgresql.lib and break downstreams, I guess. (It would actually be worse than that, I think -- and would result in confusing build-time errors, because mkDerivation doesn't consider the lack of a .lib to be a problem, it just assumes the files must be in .out instead, AKA it's not multi-output -- but those files won't be there!)


I'm not convinced this is a good plan yet. IMO it would be strictly better to simply disable systemd support unilaterally if stdenv.hostPlatforms.isMusl == true and simply put that in the default generic expression, and also clean up .lib as I explained earlier. Then:

I may be misunderstanding something, but this seems somewhat fishy as an approach, to me. Also, the whole phrasing of "introduce libpq so I can use musl" seems like a classic X-is-Y problem... The actual problem is not that libpq must be separate -- it's that postgresql currently has dependent inputs that are incompatible for you. That's a much narrower problem to solve with a fix that is substantially easier, and much less invasive.

nh2 commented 5 years ago

IMO it would be strictly better to simply disable systemd support unilaterally if stdenv.hostPlatforms.isMusl == true and simply put that in the default generic expression, and also clean up .lib as I explained earlier.

@thoughtpolice What you propose sounds like a good plan to me.

One question though:

Are there situations where some programs may want to use only the postgresql server libraries?

E.g. should there be postgresql.lib (being essentially libpq, as you suggest), and e.g. postgresql.server-lib (for programs that use server libs but don't need the binaries)? Or is that an uncommon use case that can be handled later, if somebody needs it in the future?

thoughtpolice commented 5 years ago

I am not sure about that in the case of first-party things shipped with PostgreSQL. I do not think there's any reasonable way for someone to interface with/link against those meaningfully. They can only be used by CREATE EXTENSION in the server process.

For certain third party extensions, I know they do ship libraries and headers that can be consumed by other extensions. But that case is handled fine: because those expressions will act normally, have .lib and .dev outputs, and be part of your buildInputs if you need them.

I don't think there are any realistic use cases where someone would want all the server libs, but not the binaries in .bin. And if you did -- you'd probably want the include files anyway, but there's another catch here which is that they are inside .bin too, not in a .dev output. So having a .server-libs output is a net loss: you still need .bin anyway for the header files.


So, if we cleaned up things as I mentioned earlier -- by putting server extensions in .bin -- and someone did want them and did want to use them in some theoretical scenario, we can just say "use the .bin output in your buildInputs and it will work". No need for a separate .server-libs.

But I think that's a particular, rare-or-impossible enough bridge that we can cross that one when (if) we get to it.

matthewbauer commented 5 years ago

I don't think this is a good idea. systemd.lib is only 1.8M and seems like a reasonable thing to have a runtime dependency on:

$ du -sch $(nix-store -qR /nix/store/558zw2lrdiqwgy0i5vq779j700iph882-postgresql-9.6.12-lib)
152K    /nix/store/r4bbg2bjmsvyachy2mn97xri6bcib4lj-zlib-1.2.11
160K    /nix/store/kq8jb3g4xzfbc3d7m818ri3293j2nsy3-libossp-uuid-1.6.2
1.4M    /nix/store/2ch3pwfv40ls6qhdmzwsz9wky9pq6cyl-bash-4.4-p23
1.7M    /nix/store/hsh8afbalh1blsj2s4n8g2dvaj2j5aai-libxml2-2.9.9
3.6M    /nix/store/2wj5dw9sslnqxkznyzp8jhic8ywybs72-openssl-1.0.2r
6.2M    /nix/store/558zw2lrdiqwgy0i5vq779j700iph882-postgresql-9.6.12-lib
30M /nix/store/siks2gcfwx6qwh27m7c5r5lixcr621bd-glibc-2.27
43M total

Splitting packages into multiple parts makes things much more confusing to reason about. Just because other distros do it doesn't mean it's a good idea. Multiple outputs are there to avoid this mess. We should use package splitting as a last result!

On the other hand, disabling systemd support when you are building with Musl seems reasonable. But it would be much better to just fix musl support in systemd.

thoughtpolice commented 5 years ago

@matthewbauer FWIW, see #61581 for my proposed changes to @nh2's proposed changes -- which effectively do what you ask (systemd is always available, unless stdenv.hostPlatform.isMusl == true || stdenv.isDarwin).

I also do not think systemd upstream has any interest in building with musl; some googling leads to various patches which seem to be rather invasive, and I think it would be better to avoid cluttering our own systemd patchset with a ~dozen more. It's possible to get a subset of systemd's library code working with musl (a lot of the most useful stuff like socket activation helper code, sd_log etc are all in just a single header file), but anything that truly needs libudev or libsystemd is going to be tricky.

danbst commented 5 years ago

I can think of another good reason of splitting postgresql and postgresql_lib. Currently we have postgres = postgresql96, and we can't bump it because it will bump server version for those, who use postgresql as server. Well... we can bump it, but that will be breaking change for many (last time I've checked there were tens of direct pg server usages through postgresql)...

If we could have postgresql_lib as separate top-level attribute, we could bump it's major version every year, as libpq has backwards compat. Is there any need for libpq9.6 when we already have libpq11?

danbst commented 5 years ago

I wrote "splitting" but in fact I just want a separate top-level attribute for libpq. Which should be libpq = postgresql_latest.lib;. So I'm all for multiple outputs, just libpq deserves a separate reference.

stale[bot] commented 4 years ago

Thank you for your contributions.

This has been automatically marked as stale because it has had no activity for 180 days.

If this is still important to you, we ask that you leave a comment below. Your comment can be as simple as "still important to me". This lets people see that at least one person still cares about this. Someone will have to do this at most twice a year if there is no other activity.

Here are suggestions that might help resolve this more quickly:

  1. Search for maintainers and people that previously touched the related code and @ mention them in a comment.
  2. Ask on the NixOS Discourse.
  3. Ask on the #nixos channel on irc.freenode.net.
nh2 commented 4 years ago

still important to me

stale[bot] commented 3 years ago

I marked this as stale due to inactivity. → More info

nh2 commented 3 years ago

still

teburd commented 3 years ago

I really don't like that libpq is tied to the entirety of the postgresql database. You may not need the server or client, while software you use may need libpq to connect to postgres. This shouldn't require installing the entirety of postgresq for one, nor should building software that need postgres require systemd be built for cross compile targets like armv7 musl.

teburd commented 3 years ago

I've sort of hacked up a nix file for this, its not perfect though by any means, but I'll submit it for PR if it looks decent to others

Can safely ignore things outside of stdenv.mkDerivation below for the PR. I was unable to get static linking against libpq working like I wanted however, which was a bit disappointing.

{ pkgs ? import ../nixpkgs.nix {} }:

# based on the conan recipe for libpq

pkgs.callPackage({ stdenv, lib, fetchgit, openssl, zlib, perl, bison, flex, tzdata }:
  stdenv.mkDerivation rec {
    pname = "libpq";
    version = "12.5";

    src = fetchgit {
      url = "https://git.postgresql.org/git/postgresql.git";
      rev = "6bb1b38fa5388a4aa39ed9e56ef477f618fb28e1"; # REL_12_5
      sha256 = "0vcjm2yvdqnsxmj675rfbsh5awzb7bi3rm93ydy0230rg4zr7jc8";
    };

    configureFlags = [
      "--with-openssl"
      "--without-readline"
      "--with-system-tzdata=${tzdata}/share/zoneinfo"
    ];
    nativeBuildInputs = [ perl bison flex tzdata ];
    buildInputs = [ openssl zlib ];
    buildPhase = ''
      runHook preBuild

      make -C src/backend generated-headers
      make -C src/common
      make -C src/port
      make -C src/fe_utils
      make -C src/include
      make -C src/interfaces/libpq
      make -C src/bin/pg_config

      runHook postBuild
    '';

    installPhase = ''
      runHook preInstall

      make -C src/common install
      make -C src/port install
      make -C src/fe_utils install

      make -C src/include install
      make -C src/interfaces/libpq install
      make -C src/bin/pg_config install
      mkdir -p $dev/{bin,lib}
      mv $out/bin/pg_config $dev/bin/pg_config
      rm -rf $out/lib/*.a
      #mv $out/lib/*.a $dev/lib/
      rm -rf $out/include/postgresql/server
      rm -rf $out/bin
      rm -rf $out/share

      runHook postInstall
    '';
    # TODO copy include/catalog
    # TODO rmdir include/postgresql/server
    # TODO rmdir share

    outputs = ["out" "dev"];
  }
) {}
stale[bot] commented 3 years ago

I marked this as stale due to inactivity. → More info

vmsp commented 3 years ago

I'd also like this.

stale[bot] commented 2 years ago

I marked this as stale due to inactivity. → More info

teburd commented 2 years ago

Not stale, still care

coderfromhere commented 2 years ago

Does postgresql.lib include psql bin util at the moment? In most dev environments I worked with, libpq almost always assumed psql for debuggin purposes. But overall I'd like to see client utilities be separate from the server and lib derivation.

alexvorobiev commented 2 years ago

@teburd Are you planning a PR with your approach? I am also very interested in a separate derivation for libpq. Another thing I am looking into is symbol versioning for libpq that other distributions (RH) use which makes nipkgs libpq incompatible with patchelf-ed binaries built for those distributions.

harendra-kumar commented 1 year ago

I am interested in this for a static libpq build.

YoshiRulz commented 1 year ago

I think the first task in the OP can be checked off?

szlend commented 1 year ago

The postgresql.lib output also doesn't include pkg-config definitions which can screw up builds from non-sandboxed environments like the devshell.

szlend commented 1 year ago

In my opinion, a separate libpq derivation still makes the most sense. Here's why:

I opened a Draft MR based on @teburd's work to hopefully move something forward: https://github.com/NixOS/nixpkgs/pull/234470

rupurt commented 1 year ago

Well said @szlend. Couldn't agree more