Closed everesio closed 5 months ago
this needs a bit more attention. wasted too much time on this. 🥲
Just try making the following changes
FROM --platform=linux/$TARGETARCH golang:1.20-alpine3.17 as builder
ARG TARGETARCH
RUN echo $TARGETARCH
RUN apk update && apk add bash ca-certificates git gcc g++ libc-dev librdkafka-dev pkgconf
WORKDIR "/code"
ADD . "/code"
RUN go build -tags musl -o main .
This approach doesn't work with librdkafka-dev v2.3.0, but was working with v2.2.0
The root cause appears to be that librdkafka now requires Cyrus SASL, but the confluent-kafka-go wrappers don't spell out a link dependency to it.
All the workarounds above seem to avoid solving this problem by instead installing a system librdkafka-dev, which requires -tags dynamic per https://github.com/confluentinc/confluent-kafka-go/#librdkafka (not sure why earlier posted workaround examples work without it; we saw linker errors still).
To fix what I understand to be the root cause, we can:
- Ensure cyrus-sasl-dev (for Alpine, see librdkafka sasl docs for other platforms) is installed in the build and run environment
- Tell cgo to explicitly link libsasl2.so
I adapted the repro case from the original report for go1.21 + alpine3.18 with the requisite flags:
FROM --platform=linux/$TARGETARCH golang:1.21.4-alpine3.18
ARG TARGETARCH
RUN echo $TARGETARCH
RUN apk update
RUN apk add \
gcc \
musl-dev \
# explicitly install SASL package
cyrus-sasl-dev
WORKDIR "/code"
ADD . "/code"
RUN CGO_ENABLED=1 \
GO111MODULE=on \
GOOS=linux \
GOARCH=$TARGETARCH \
# explicitly link to libsasl2 installed as part of cyrus-sasl-dev
CGO_LDFLAGS="-lsasl2" \
go build -mod=vendor -o consumer_example -tags musl -ldflags "-w -s" .
This works on my arm64/M1 Mac for TARGETARCH of both arm64 and amd64.
As far as fixing the root cause bug goes, I'm not sure why there's now a hard link dependency on libsasl2.so. But I see that the Darwin cgo LDFLAGS have -lsasl2 as part of the distribution: https://github.com/confluentinc/confluent-kafka-go/blob/master/kafka/build_darwin_arm64.go#L9. There are probably reasons why this can't work on Linux in general, but it might be a thread to start pulling on.
Kim, this is very helpful! Thanks for the research. One may wonder why my minimalist example is not part of the Confluent CI/CD pipeline, since it could catch breaking changes like this one.
It turns out the docs at https://github.com/confluentinc/librdkafka/wiki/Using-SASL-with-librdkafka#4-install-sasl-modules-on-client-host say:
Note: librdkafka must be built with SASL support (which is enabled by default if libsasl2-dev is installed at buildtime)
So I think what happened is that @emasab who built librdkafka for 2.3.0 happens to have Cyrus SASL/libsasl2 installed in their environment, and thereby confluent-kafka-go got an indirect dependency on the Cyrus SASL distribution.
I don't know anything about SASL, but it looks like librdkafka has minimal built-in support, so presumably earlier releases happened to build without the Cyrus dependency and only got the base support.
Followup: we actually ran into a problem with the proposed workaround: CGO_LDFLAGS are injected before the cgo LDFLAGS, and gcc -l switches are sensitive to order (beautifully described here: https://eli.thegreenplace.net/2013/07/09/library-order-in-static-linking).
There's a supremely hacky way to work around this too, using a dangling -Wl,--start-group before -lsasl2:
CGO_LDFLAGS="-Wl,--start-group -lsasl2"
gcc/ld complains with
bin/ld: missing --end-group; added as last command line option
but essentially fixes the unclosed group for you.
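The order sensitivity can be illustrated with a toy model of a traditional single-pass linker, in the spirit of the article linked above. This is only a sketch under simplifying assumptions, not GNU ld's actual algorithm: it ignores shared libraries and --as-needed, it models libsasl2 as a static archive, and the object names and symbol sets are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Obj:
    name: str
    defines: frozenset
    needs: frozenset = frozenset()


@dataclass(frozen=True)
class Archive:
    name: str
    members: tuple  # tuple of Obj


def _add_object(obj, defined, undefined):
    # Record the object's definitions, then whatever it still needs.
    defined.update(obj.defines)
    undefined.update(obj.needs)
    undefined.difference_update(defined)


def _scan_archive(archive, defined, undefined, pulled):
    # Pull in members that define a currently undefined symbol.
    # Returns True if at least one member was pulled.
    progress = False
    changed = True
    while changed:
        changed = False
        for member in archive.members:
            key = (archive.name, member.name)
            if key not in pulled and member.defines & undefined:
                pulled.add(key)
                _add_object(member, defined, undefined)
                changed = progress = True
    return progress


def unresolved_symbols(inputs):
    """Process inputs left to right; a nested list models a --start-group."""
    defined, undefined, pulled = set(), set(), set()
    for item in inputs:
        if isinstance(item, Obj):
            _add_object(item, defined, undefined)
        elif isinstance(item, Archive):
            _scan_archive(item, defined, undefined, pulled)  # visited once
        else:
            # Group: rescan all archives until a full round pulls nothing.
            while any([_scan_archive(a, defined, undefined, pulled) for a in item]):
                pass
    return undefined


# Illustrative inputs (assumed for this sketch, not taken from a real link line).
main_o = Obj("main.o", frozenset({"main"}), frozenset({"rd_kafka_new"}))
librdkafka = Archive("librdkafka.a", (
    Obj("rdkafka.o", frozenset({"rd_kafka_new"}), frozenset({"sasl_client_init"})),
))
libsasl2 = Archive("libsasl2.a", (
    Obj("client.o", frozenset({"sasl_client_init"})),
))

sasl_first = unresolved_symbols([libsasl2, main_o, librdkafka])
sasl_last = unresolved_symbols([main_o, librdkafka, libsasl2])
grouped = unresolved_symbols([main_o, [libsasl2, librdkafka]])
```

In this model, listing the SASL library first leaves the symbol unresolved because nothing needed it yet when it was visited, while listing it last or wrapping the libraries in a group resolves everything.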
And as a final workaround tip: you can use a more modern linker which doesn't have the input order requirements: lld or mold.
Here's a Dockerfile to use mold
FROM --platform=linux/$TARGETARCH golang:1.21.4-alpine3.18
ARG TARGETARCH
RUN echo $TARGETARCH
RUN apk update
RUN apk add \
gcc \
# use mold for convenient extra linker inputs
mold \
musl-dev \
# explicitly install SASL package
cyrus-sasl-dev
WORKDIR "/code"
ADD . "/code"
RUN CGO_ENABLED=1 \
GO111MODULE=on \
GOOS=linux \
GOARCH=$TARGETARCH \
# explicitly link to libsasl2 installed as part of cyrus-sasl-dev
CGO_LDFLAGS="-fuse-ld=mold -lsasl2" \
go build -mod=vendor -o consumer_example -tags musl -ldflags "-w -s" .
This gets rid of the warning from gcc/ld about the unclosed group.
@kimgr I appreciate the detailed workarounds.
Unfortunately, the last one does not work for me.
It fails with the following error:
10.59 /usr/local/go/pkg/tool/linux_arm64/link: running aarch64-alpine-linux-musl-clang failed: exit status 1
10.59 mold: fatal: library not found: sasl2
I have cyrus-sasl-dev installed.
(An extra piece of information: I use xx to cross-compile which may be an issue here)
Based on your earlier comment, however, this might be an issue with the bundled libs, so I'm thinking about building them myself, making sure cyrus-sasl-dev is not present.
If that is the problem, then I believe there should be a patch release fixing the libraries.
@sagikazarmark
I have cyrus-sasl-dev installed.
You mentioned xx. I'm not familiar with it, but I'm assuming you've installed cyrus-sasl-dev using xx-apk in the build context?
I wonder if a cross linker needs to be used too, or if you can somehow tell mold where to look for libraries for the target architecture.
Sorry, I don't have any clue, really.
Thank you all for raising awareness on this issue.
So I think what happened is that @emasab who built librdkafka for 2.3.0 happens to have Cyrus SASL/libsasl2 installed in their environment, and thereby confluent-kafka-go got an indirect dependency on the Cyrus SASL distribution.
That didn't happen because we configure and build these static binaries in a Semaphore pipeline, not on our laptops. Then we import those binaries locally to push them to confluent-kafka-go.
I believe the issue is here in the release pipeline:
The condition should be
if attr in a.info and \
a.info[attr] == m.attributes[origattr]:
because it's meant to exclude the files that have the attribute extra=gssapi.
Given it's not excluding them, depending on the order, the version with libsasl2 or the one without it could be taken.
That explains why the issue is present in 2.1.0 and 2.3.0 but not in 2.2.0 and 2.0.2. Going to create a PR to fix it before our upcoming 2.4.0 release.
Then we import those binaries locally to push them to confluent-kafka-go.
There's room for security improvements here. We have to make this step run on CI too.
v2.1.1-linux-arm64-musl isn't affected either. But for the moment it's better to use the workaround and take the latest fixes in 2.3.0.
Confirmed that the only affected ones are these ones, by looking for rdkafka_sasl_cyrus.o in archive files.
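The thread doesn't say how this check was done; listing the archive with `ar t` from binutils would be the usual way. As a self-contained alternative, here is a minimal Python sketch that reads the member names of a static archive (common `ar` format, handling GNU and BSD long names) and looks for a given object. The function names and any path you pass in are assumptions for illustration.

```python
def ar_member_names(data: bytes) -> list:
    """Return member names of an ar archive given its raw bytes."""
    if not data.startswith(b"!<arch>\n"):
        raise ValueError("not an ar archive")
    names = []
    longnames = b""  # GNU long-name table, filled from the "//" member
    pos = 8
    while pos + 60 <= len(data):
        header = data[pos:pos + 60]
        if header[58:60] != b"`\n":
            raise ValueError("corrupt member header at offset %d" % pos)
        raw_name = header[:16].decode("ascii").rstrip(" ")
        size = int(header[48:58].decode("ascii").strip())
        body_start = pos + 60
        body = data[body_start:body_start + size]
        if raw_name == "//":
            longnames = body
        elif raw_name in ("/", "/SYM64/"):
            pass  # symbol table, not a real member
        elif raw_name.startswith("#1/"):
            # BSD long name: the name is stored at the start of the body.
            length = int(raw_name[3:])
            names.append(body[:length].rstrip(b"\0").decode())
        elif raw_name.startswith("/"):
            # GNU long name: "/<offset>" into the long-name table.
            offset = int(raw_name[1:])
            end = longnames.index(b"\n", offset)
            names.append(longnames[offset:end].rstrip(b"/").decode())
        else:
            names.append(raw_name.rstrip("/"))
        # Member data is padded to an even offset.
        pos = body_start + size + (size & 1)
    return names


def archive_has_member(path: str, member: str) -> bool:
    """True if the archive at `path` contains an object named `member`."""
    with open(path, "rb") as f:
        return member in ar_member_names(f.read())
```

For example, `archive_has_member(path_to_archive, "rdkafka_sasl_cyrus.o")` would report whether a given bundled archive was built with the Cyrus SASL object, where `path_to_archive` is whichever static library you want to inspect.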
Raised this PR. And confirmed that the produced binaries don't include rdkafka_sasl_cyrus.o, except for darwin where it's expected to have it.
Closing this as it's fixed in 2.4.0
Description
The ARM64 build using golang:1.20-alpine3.17 fails. The AMD64 build using confluent-kafka-go v2.1.0 succeeds. ARM64 and AMD64 builds with v2.0.2 are also successful.
How to reproduce
ARG TARGETARCH
RUN echo $TARGETARCH
RUN apk add alpine-sdk ca-certificates
WORKDIR "/code"
ADD . "/code"
RUN CGO_ENABLED=1 GO111MODULE=on GOOS=linux GOARCH=$TARGETARCH go build -mod=vendor -o consumer_example -tags musl -ldflags "-w -s" .
go mod tidy && go mod vendor
docker buildx build --build-arg TARGETARCH=arm64 .
go mod tidy && go mod vendor
docker buildx build --build-arg TARGETARCH=amd64 .
require github.com/confluentinc/confluent-kafka-go/v2 v2.0.2