tweag / rules_haskell

Haskell rules for Bazel.
https://haskell.build
Apache License 2.0

Build fails when using ghc_plugin (undefined symbol) #1658

Closed aveltras closed 2 years ago

aveltras commented 2 years ago

Hi,

We are currently trying to integrate https://github.com/hetchr/pulsar-hs into our main repo, but we are facing build issues that seem related to the combination of ghc_plugin and the fact that the library uses FFI.

I've set up a minimal repository to illustrate the issue: https://github.com/aveltras/pulsar-hs-repro

When trying to build the example with pulsar-hs as a dependency AND a plugin enabled, it fails with the following error, which indicates an undefined symbol.

bazel run :example
DEBUG: /home/romain/.cache/bazel/_bazel_romain/ef9ab8a8b80e23dc12e1983db0dd2853/external/rules_haskell/haskell/private/versions.bzl:60:10: WARNING: bazel version is too old. Supported versions range from 4.0.0 to 4.2.1, but found: 3.7.2- (@non-git)
INFO: Analyzed target //:example (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
ERROR: /home/romain/Code/pulsar-hs-repro/BUILD.bazel:19:15: HaskellBuildBinary //:example failed (Exit 1): ghc_wrapper failed: error executing command bazel-out/host/bin/external/rules_haskell/haskell/ghc_wrapper bazel-out/k8-fastbuild/bin/compile_flags_example__HaskellBuildBinary bazel-out/k8-fastbuild/bin/extra_args_example__HaskellBuildBinary

Use --sandbox_debug to see verbose messages from the sandbox ghc_wrapper failed: error executing command bazel-out/host/bin/external/rules_haskell/haskell/ghc_wrapper bazel-out/k8-fastbuild/bin/compile_flags_example__HaskellBuildBinary bazel-out/k8-fastbuild/bin/extra_args_example__HaskellBuildBinary

Use --sandbox_debug to see verbose messages from the sandbox
<command line>: /nix/store/mdh6w3b6v6hv79zpxm6zryn59drv63qr-pulsar-client-hs-1.0.0/lib/ghc-8.10.7/x86_64-linux-ghc-8.10.7/libHSpulsar-client-hs-0.1.0.0-K3rKHHtOBPtGKrJXfaFY8Z-ghc8.10.7.so: undefined symbol: pulsar_message_set_schema_version
Target //:example failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 0.711s, Critical Path: 0.61s
INFO: 2 processes: 2 internal.
FAILED: Build did NOT complete successfully
FAILED: Build did NOT complete successfully

Removing polysemy-plugin gets the build to succeed. I've also tried another plugin, so this doesn't seem specific to polysemy-plugin but rather related to ghc_plugin.

Not sure where the problem lies here, but any help would be very welcome.

Edit: Seems the build also fails without using Bazel (also when activating any plugin in ghc-options)

Thanks

aherrmann commented 2 years ago

Edit: Seems the build also fails without using Bazel (also when activating any plugin in ghc-options)

IIUC it's also marked broken in nixpkgs. Perhaps it's an upstream issue? It would be good to have a working baseline to compare to (perhaps without Nix entirely).

Side note: Is there any particular reason why you're importing these packages in Nix instead of using stack_snapshot?

aveltras commented 2 years ago

To give a bit of context, the problem was uncovered while trying to integrate our in-progress library https://github.com/hetchr/pulsar-hs with our main repo, which is built with Bazel, and it seems to occur only when there is a GHC plugin among the packages.

IIUC it's also marked broken in nixpkgs. Perhaps it's an upstream issue? It would be good to have a working baseline to compare to (perhaps without Nix entirely).

I also tried another plugin, ghc-typelits-natnormalise, and the problem remained the same. Having also tested it without Bazel in the meantime, I saw that it's more likely a problem with the Nix packaging of our library, or maybe a bug in GHC (I don't think so, but I can't be sure), since it also happens with Nix + cabal (as seen in the minimal repo).

We should be able to finalize the integration of this library into our main repo, maybe this week, at which point I'll be able to see whether a workaround for the described bug works. I'll report back here.

You may close this issue as it doesn't seem to be specifically tied to Bazel in the end.

Side note: Is there any particular reason why you're importing these packages in Nix instead of using stack_snapshot?

Not really, we just followed https://rules-haskell.readthedocs.io/en/latest/haskell-use-cases.html#building-cabal-packages-using-nix when we moved from Stack to Bazel for building our main repo. Should stack_snapshot be preferred?

aherrmann commented 2 years ago

We should be able to finalize the integration of this library into our main repo, maybe this week, at which point I'll be able to see whether a workaround for the described bug works. I'll report back here.

You may close this issue as it doesn't seem to be specifically tied to Bazel in the end.

Thanks for the update! Yes, please let us know how it goes. I'll close for now. Feel free to reopen if it turns out that there is an issue in rules_haskell as well.

Should stack_snapshot be preferred?

An advantage of stack_snapshot over importing from Nix is that it lets Bazel track packages individually. If you change, add, or remove a package with the Nix approach, you'll have a completely separate GHC toolchain as far as Bazel is concerned, and you'll have to rebuild all your Haskell targets. With stack_snapshot, you'll only have to rebuild those targets whose dependencies changed due to the update.
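
For illustration, a minimal WORKSPACE sketch of the stack_snapshot approach might look like the following. The repository name `stackage`, the snapshot version, and the package list are assumptions chosen for this example and are not taken from the repro repository; the snapshot would need to match the GHC version of your toolchain.

```starlark
# WORKSPACE (sketch, assuming rules_haskell is already set up as @rules_haskell)
load("@rules_haskell//haskell:cabal.bzl", "stack_snapshot")

stack_snapshot(
    # Hypothetical repository name; targets are then referenced as
    # "@stackage//:<package>".
    name = "stackage",
    # Assumed package list for illustration only.
    packages = [
        "base",
        "polysemy",
        "polysemy-plugin",
    ],
    # Assumed snapshot; pick one that matches your GHC version.
    snapshot = "lts-18.28",
)
```

With this setup each package becomes its own Bazel target, so adding or bumping one entry in `packages` should only invalidate the targets that depend on it rather than the whole toolchain.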

aveltras commented 2 years ago

To work around this issue, we had to remove the use of GHC plugins on all the targets above the ones that use the FFI-based library. That's a bit cumbersome, but at least it seems to work for now.

aherrmann commented 2 years ago

Thanks for the update! Yes, it sounds cumbersome. Perhaps it's worth a ticket on the plugin project, or perhaps on GHC? I'm afraid I still don't have a good understanding of where the issue lies in this case.