JuliaLang / MbedTLS.jl

Wrapper around mbedtls

Undefined symbol: mbedtls_x509_crt_verify_restartable #193

Closed maleadt closed 5 years ago

maleadt commented 5 years ago

I've been encountering the following on my GitLab CI while doing coverage submission on Julia 1.1:

julia: symbol lookup error: /builds/JuliaGPU/CUDAnative.jl/.julia/packages/MbedTLS/XkQiX/deps/usr/lib/libmbedtls.so: undefined symbol: mbedtls_x509_crt_verify_restartable

Full log: https://gitlab.com/JuliaGPU/CUDAnative.jl/-/jobs/151716689. It has worked before on 1.1 though, see e.g. https://gitlab.com/JuliaGPU/CUDAnative.jl/-/jobs/149248223, but quite a few packages seem to have been upgraded since (HTTP.jl, Coverage.jl, etc.).

This is on a very basic Ubuntu 18.04 image with Julia binaries installed: https://github.com/JuliaGPU/gitlab-ci/blob/master/images/base/v1.1/Dockerfile

rschwarz commented 5 years ago

I have the same issue (see travis log) based on Ubuntu Xenial (16.04) offered by Travis.

DilumAluthge commented 5 years ago

I have the same error (full log) on Travis.

This is the line with the error:

julia: symbol lookup error: /home/travis/.julia/packages/MbedTLS/XkQiX/deps/usr/lib/libmbedtls.so: undefined symbol: mbedtls_x509_crt_verify_restartable

These are the lines immediately preceding:

Coverage.process_file: Detecting coverage for src/version.jl
Coverage.process_folder: Skipping version.jl.3853.cov, not a .jl file
Coverage.process_file: Detecting coverage for src/welcome.jl
Coverage.process_folder: Skipping welcome.jl.3853.cov, not a .jl file
Codecov.io API URL:
https://codecov.io/upload/v2?&service=travis-org&branch=master&commit=f3ec60c9a438ab1c3a9e51b750f44f288452b24f&pull_request=false&job=173244732&slug=UnofficialJuliaMirror/MirrorUpdater.jl&build=259.22

Version information:

curtd commented 5 years ago

Version information:

  • Julia Version 1.0.3
  • MbedTLS.jl v0.6.7
  • Coverage.jl v0.6.0

I noticed in my own testing that HTTP.jl uses MbedTLS version 0.6.0 in its Project.toml. Manually updating this to 0.6.7 and rebuilding the package seemed to alleviate the problem.

tkoolen commented 5 years ago

I noticed in my own testing that HTTP.jl uses MbedTLS version 0.6.0 in its Project.toml. Manually updating this to 0.6.7 and rebuilding the package seemed to alleviate the problem.

What do you mean by alleviate? I just tried this: https://github.com/JuliaWeb/HTTP.jl/commit/342d9c26bdf5f39caacc803b40020d352f3ab53d, and after a pkg> build, pkg> test HTTP still fails with the error message in the issue description.

rofinn commented 5 years ago

Not sure if this is related, but on 0.6.7 I appear to be getting:

libmbedtls.so: undefined symbol: mbedtls_pk_verify_restartable

I seem to be getting the

undefined symbol: mbedtls_x509_crt_verify_restartable

error from the HTTP (0.8.0) tests with on both MbedTLS (0.6.7)

NOTE: I only seem to be getting this inside an amazon linux 2 docker container (macOS seems fine), but for now downgrading to 0.6.6 seems to work.

curtd commented 5 years ago

I noticed in my own testing that HTTP.jl uses MbedTLS version 0.6.0 in its Project.toml. Manually updating this to 0.6.7 and rebuilding the package seemed to alleviate the problem.

What do you mean by alleviate? I just tried this: JuliaWeb/HTTP.jl@342d9c2, and after a pkg> build, pkg> test HTTP still fails with the error message in the issue description.

My mistake: updating MbedTLS to v"0.6.7" fixed the issue my package was having with HTTP.request(), but the tests themselves are still broken.

tkoolen commented 5 years ago

I believe I know what's wrong. The deps.jl file generated by BinaryProvider looks like this:

## This file autogenerated by BinaryProvider.write_deps_file().
## Do not edit.
##
## Include this file within your main top-level source, and call
## `check_deps()` from within your module's `__init__()` method

if isdefined((@static VERSION < v"0.7.0-DEV.484" ? current_module() : @__MODULE__), :Compat)
    import Compat.Libdl
elseif VERSION >= v"0.7.0-DEV.3382"
    import Libdl
end
const libmbedcrypto = joinpath(dirname(@__FILE__), "usr/lib/libmbedcrypto.2.16.0.dylib")
const libmbedtls = joinpath(dirname(@__FILE__), "usr/lib/libmbedtls.12.dylib")
const libmbedx509 = joinpath(dirname(@__FILE__), "usr/lib/libmbedx509.0.dylib")
function check_deps()
    global libmbedcrypto
    if !isfile(libmbedcrypto)
        error("$(libmbedcrypto) does not exist, Please re-run Pkg.build(\"MbedTLS\"), and restart Julia.")
    end

    if Libdl.dlopen_e(libmbedcrypto) in (C_NULL, nothing)
        error("$(libmbedcrypto) cannot be opened, Please re-run Pkg.build(\"MbedTLS\"), and restart Julia.")
    end

    global libmbedtls
    if !isfile(libmbedtls)
        error("$(libmbedtls) does not exist, Please re-run Pkg.build(\"MbedTLS\"), and restart Julia.")
    end

    if Libdl.dlopen_e(libmbedtls) in (C_NULL, nothing)
        error("$(libmbedtls) cannot be opened, Please re-run Pkg.build(\"MbedTLS\"), and restart Julia.")
    end

    global libmbedx509
    if !isfile(libmbedx509)
        error("$(libmbedx509) does not exist, Please re-run Pkg.build(\"MbedTLS\"), and restart Julia.")
    end

    if Libdl.dlopen_e(libmbedx509) in (C_NULL, nothing)
        error("$(libmbedx509) cannot be opened, Please re-run Pkg.build(\"MbedTLS\"), and restart Julia.")
    end

end

So it first opens libmbedcrypto, then libmbedtls, then libmbedx509. However, the mbedtls readme states:

[...] when loading shared libraries using dlopen(), you'll need to load libmbedcrypto first, then libmbedx509, before you can load libmbedtls.

This seems plausible given that mbedtls_x509_crt_verify_restartable is a symbol in libmbedx509.so.
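
The load order the readme asks for can be sketched in Julia as below. This is only an illustration, not the code BinaryProvider generates: `open_in_order`, `LOAD_ORDER`, and the `libdir` argument are hypothetical names, and the unversioned file names (`libmbedcrypto.so` and so on) are an assumption about what is on disk. Note that, as reported further down in this thread, changing the order alone did not make the HTTP.jl tests pass.

```julia
using Libdl

# Dependency order taken from the mbedtls readme quoted above:
# crypto first, then x509, then tls.
const LOAD_ORDER = ("libmbedcrypto", "libmbedx509", "libmbedtls")

# Hypothetical helper: open the three libraries found in `libdir`
# (e.g. the package's deps/usr/lib directory) in LOAD_ORDER.
# Returns the handles in the same order; throws if a library cannot be opened.
function open_in_order(libdir::AbstractString)
    return [Libdl.dlopen(joinpath(libdir, string(name, ".", Libdl.dlext)))
            for name in LOAD_ORDER]
end
```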

tkoolen commented 5 years ago

Unfortunately, HTTP.jl tests still fail for me after changing the order locally. Still I think it is a real issue.

rschwarz commented 5 years ago

I thought that it was likely a version issue, since the problem did not occur on Travis/Debian, but did on Travis/Ubuntu. (I did not actually retry Debian recently; I just remember it working.)

tkoolen commented 5 years ago

The latest release of MbedTLS upgraded to a new version. Julia ships with a libmbedtls.so that's older, and on my system those are the only libmbedtls.so copies. Also the libmbedx509.so / libmbedtls.so that ship with Julia don't have mbedtls_x509_crt_verify_restartable. So I think it's likely it's somehow using the Julia-shipped libs, but I don't quite see why yet, nor why it would be platform-specific (HTTP.jl tests also pass on OSX) or distribution-specific.
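
One way to check which copies are actually in the process is to list the loaded libraries and probe each for the missing symbol. The following is a rough sketch using only `Libdl`; the helper names `loaded_mbed_libs` and `resolves_symbol` are made up for illustration. Keep in mind that a symbol lookup through a handle may also search that library's dependencies, so a `true` for `libmbedtls` does not prove the symbol is defined in that file itself.

```julia
using Libdl

# Every shared library currently loaded into this Julia process whose file
# name mentions "mbed" (Julia-shipped and package-shipped copies alike).
loaded_mbed_libs() = filter(p -> occursin("mbed", basename(p)), Libdl.dllist())

# Hypothetical helper: can `sym` be resolved through the library at `path`?
# `dlopen_e` and `dlsym_e` return C_NULL instead of throwing on failure.
function resolves_symbol(path::AbstractString, sym::Symbol)
    handle = Libdl.dlopen_e(path)
    handle == C_NULL && return false
    return Libdl.dlsym_e(handle, sym) != C_NULL
end

for path in loaded_mbed_libs()
    found = resolves_symbol(path, :mbedtls_x509_crt_verify_restartable)
    println(path, " => ", found)
end
```

Running this after `using MbedTLS` (or `using HTTP`) would show whether both the Julia-shipped and the package-shipped libraries are loaded side by side.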

giordano commented 5 years ago

This bug doesn't allow me to automatically set up Travis in BinaryBuilder.jl. System used: Arch Linux. As a temporary workaround I downgraded this package to v0.6.6.

DilumAluthge commented 5 years ago

Any update from a package maintainer?

@quinnj


quinnj commented 5 years ago

Sorry for the slow response here, but I don't really have much to contribute; I haven't personally had any issues on OSX Julia 1.1, or Linux Julia 1.0/1.1; do we have a reliable reproduction somewhere where we could debug what's going on?

DilumAluthge commented 5 years ago

Yep, it happens on my Travis builds. Here is an example. https://travis-ci.com/UnofficialJuliaMirror/MirrorUpdater.jl/jobs/175298500


staticfloat commented 5 years ago

@quinnj I think this is due to the fact that libmbedtls 2.16 doesn't work with Julia 1.1 (which is built with libmbedtls 2.6 by default), but 2.13 does; so if you replace deps/build.jl with this one, then re-run pkg> build MbedTLS, that fixes everything for me. So perhaps we need to roll back to libmbedtls 2.13 in this package, or add compatibility bounds or something.

quinnj commented 5 years ago

Why wouldn't 2.16 work w/ Julia 1.1? Like I mentioned, I'm running 1.1 on both OSX and Linux and haven't had any issues w/ 2.16.

tkoolen commented 5 years ago

@quinnj, which Linux distro are you using? How are you verifying that there are no issues? For me, running HTTP.jl tests on Ubuntu 18.04.1 LTS fails with the error in the issue description.

quinnj commented 5 years ago

Debian stretch-slim, in a docker container. I'm running a production web server and haven't had any MbedTLS.jl related issues. (been running for ~2 weeks now)

staticfloat commented 5 years ago

@quinnj Are you using the official binaries, or building locally? When you build locally do you use USE_SYSTEM_MBEDTLS=1?

quinnj commented 5 years ago

Official binaries. I haven't built julia locally in several months.

tkoolen commented 5 years ago

This earlier comment also reports things to be working on Debian but not Ubuntu.

giordano commented 5 years ago

I should try again, but I'm pretty sure that it doesn't work on my Debian testing machine

HarrisonGrodin commented 5 years ago

This is consistently reproduced on Travis for ModelingToolkit on Julia v1.0/v1.1 (but curiously, not on nightly):

https://travis-ci.org/JuliaDiffEq/ModelingToolkit.jl/builds/489842845

epatters commented 5 years ago

For the record, the issue is not specific to Julia v1. I'm getting the same error on Travis with Julia v0.7 after upgrading to MbedTLS v0.6.7.

https://travis-ci.org/epatters/semanticflowgraph/builds/490281690

tkoolen commented 5 years ago

but curiously, not on nightly

Note that Julia nightly ships with mbedtls 2.16.0 (same version as the one that MbedTLS.jl 0.6.7 upgraded to) as of https://github.com/JuliaLang/julia/pull/30618.

quinnj commented 5 years ago

@staticfloat, what would you suggest here? It's not really ideal to roll back to 2.13; do you have any idea what the issue would be here? If we have to roll back, we can, I'd just rather figure out what's going on instead.

briochemc commented 5 years ago

👍 Same issue here on two different Linux machines:

julia: symbol lookup error: /home/MY_USERNAME/.julia/packages/MbedTLS/XkQiX/deps/usr/lib/libmbedtls.so: undefined symbol: mbedtls_x509_crt_verify_restartable

davidanthoff commented 5 years ago

Bump, I also just ran into this on Ubuntu.

I would suggest to roll back things for now, until someone finds the time to dig into this and figure out what is happening. This is clearly affecting a fair number of systems, and I think a package that is so low in the dependency tree really just needs to work and can't have such extended broken phases.

quinnj commented 5 years ago

I'm just not sure that would actually solve anything; how do we know this isn't because of what Base Julia has done/upgraded that's now causing issues? As I've stated before, when I set up a clean Linux machine or Docker image and use the official Julia 1.1 binary, I don't see any issues. I'll ping @staticfloat again to see what he suggests.

quinnj commented 5 years ago

Update for those here: I was finally able to reproduce the issue. What I was seeing was that with the MbedTLS package devved locally, even when the build script was updated to latest master and "built", it wasn't replacing the old 2.13 library versions I had (I guess BinaryBuilder was satisfied?). Once I completely removed all old versions of the package from my system and re-installed, I was able to reproduce. It does seem to be Linux-only from what I can tell, and doing nm libmbedtls confirms the undefined symbol. I'm attempting to rebuild the binaries from MbedTLSBuilder through a new release to see if somehow the libraries just got messed up last time; if the issue persists, I'll make a new release going back to the previous version.

quinnj commented 5 years ago

Looks like the builder job itself is now showing the error: https://travis-ci.org/JuliaWeb/MbedTLSBuilder/jobs/495825287

thofma commented 5 years ago

So we are stuck in an endless circle of symbol lookup errors?

staticfloat commented 5 years ago

Looks like the builder job itself is now showing the error:

For the time being, you can pin an older MbedTLS.jl version on Travis so that you don't run into that error perhaps?

SabineAuer commented 5 years ago

Could you assist me in what to add in the travis.yml file to pin an older version on travis? That would be really helpful! Thanks!
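
A pin of this kind could be scripted with Pkg roughly as follows. This is a sketch, not the change in any linked PR: `pin_mbedtls` is a hypothetical helper, and the default of 0.6.6 is simply the version several commenters above report downgrading to. How the function gets called from a CI configuration is left open here.

```julia
using Pkg

# Hypothetical helper for a CI script: install a specific MbedTLS.jl release
# and pin it so that later resolves do not upgrade it again.
function pin_mbedtls(version::AbstractString = "0.6.6")
    spec = Pkg.PackageSpec(name = "MbedTLS", version = version)
    Pkg.add(spec)   # install exactly this version
    Pkg.pin(spec)   # keep the resolver from moving it
end
```

The pin can be undone later with `Pkg.free("MbedTLS")` once a fixed release is available.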

visr commented 5 years ago

@SabineAuer see https://github.com/JuliaWeb/MbedTLSBuilder/pull/17

SabineAuer commented 5 years ago

@visr Thanks a lot, that fixed the problem with the travis CI for me temporarily as well!

tanmaykm commented 5 years ago

bump! got hit by this, had to downgrade

visr commented 5 years ago

@tanmaykm all that needs to happen is merge and tag a few things. Since I see you are a JuliaWeb member, you may be able to help.

  1. Merge https://github.com/JuliaWeb/MbedTLSBuilder/pull/17
  2. Tag a new MbedTLSBuilder release, wait for Travis to upload all binaries and build.jl script to the release.
  3. Download new build.jl script from MbedTLSBuilder and use in this repo.
  4. Merge https://github.com/JuliaWeb/MbedTLS.jl/pull/194/commits/f51e810636ef6b1925be9b04c793ba96aff12034
  5. Tag a new MbedTLS.jl

Or if you want to take a shortcut I guess you can also just merge https://github.com/JuliaWeb/MbedTLS.jl/pull/194 (both commits) and tag.

tanmaykm commented 5 years ago

I still get this error even if I apply these changes to MbedTLS.jl. The PRs referred to here seem to be valid. But the error that I am facing seems to be due to incompatibility between shared libraries shipped with Julia and MbedTLS.jl.

Symbol present in lib shipped with MbedTLS.jl:

~/.julia/dev/MbedTLS/deps/usr/lib$ nm libmbedx509.so | grep mbedtls_x509_crt_verify_restartable
000000000000c5b9 T mbedtls_x509_crt_verify_restartable

But not there in lib shipped with Julia:

/data/Work/julia/binaries/julia-1.1.0/lib/julia$ nm libmbedx509.so | grep mbedtls_x509_crt_verify_restartable

And I get this error from within Pkg in Julia stdlib:

julia> using Pkg;
julia> Pkg.activate(".");
julia> Pkg.test("HTTP");
...
Running WebSockets.jl tests...
/data/Work/julia/binaries/julia-1.1.0/bin/julia: symbol lookup error: /home/tan/.julia/dev/MbedTLS/deps/usr/lib/libmbedtls.so: undefined symbol: mbedtls_x509_crt_verify_restartable
ERROR: Package HTTP errored during testing
Stacktrace:
 [1] pkgerror(::String, ::Vararg{String,N} where N) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Pkg/src/Types.jl:120
 [2] #test#66(::Bool, ::Function, ::Pkg.Types.Context, ::Array{Pkg.Types.PackageSpec,1}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Pkg/src/Operations.jl:1328
 [3] #test at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:0 [inlined]
 [4] #test#44(::Bool, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Pkg.Types.Context, ::Array{Pkg.Types.PackageSpec,1}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:193
 [5] test at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:178 [inlined]
 [6] #test#43 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:175 [inlined]
 [7] test at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:175 [inlined]
 [8] #test#42 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:174 [inlined]
 [9] test at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:174 [inlined]
 [10] #test#41(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::String) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:173
 [11] test(::String) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:173
 [12] top-level scope at none:0

martinholters commented 5 years ago

Just to verify: There are systems (Windows and macOS?) which are not affected by this issue? Otherwise we could bump the required Julia version of the latest MbedTLS to 1.2- in METADATA to prevent it from being installed on incompatible Julia versions.

vtjnash commented 5 years ago

bump. This is preventing us from getting information out of Coverage.jl: https://travis-ci.org/JuliaCI/CoverageBase.jl/jobs/501818500

fingolfin commented 5 years ago

This issue has been open for over a month, and apparently there is no fix in sight (at least looking in from the outside; I'd be happy to be wrong here!)

Since this is breaking tests for dozens (hundreds?) of projects, wouldn't it be better to revert the breaking change for now, make a new release, and then work on fixing the issue properly, but with all the time you need?

ronisbr commented 5 years ago

This issue has been open for over a month, and apparently there is no fix in sight (at least looking in from the outside; I'd be happy to be wrong here!)

Since this is breaking tests for dozens (hundreds?) of projects, wouldn't it be better to revert the breaking change for now, make a new release, and then work on fixing the issue properly, but with all the time you need?

I second that. I had to pin MbedTLS.jl to v0.6.6 to fix the issue for now.

tkoolen commented 5 years ago

Alright, how about this: explicitly dlopen the libs (in the right order) and change all the ccalls to explicitly use the handles. At least that seems to me to be a straightforward fix.
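
As a rough sketch of that idea (not MbedTLS.jl's actual code): keep the handles returned by `dlopen` and resolve each function through them with `dlsym`, then `ccall` the resulting pointer. `MbedHandles`, `version_number`, and `libdir` are hypothetical names, the unversioned library file names are an assumption, and `mbedtls_version_get_number` is used only as a simple zero-argument function that is assumed to live in libmbedcrypto. Whether this approach avoids the clash with the Julia-shipped libraries is not established in this thread.

```julia
using Libdl

# Hypothetical container for explicitly opened library handles.
struct MbedHandles
    crypto::Ptr{Cvoid}
    x509::Ptr{Cvoid}
    tls::Ptr{Cvoid}
end

# Open the libraries in `libdir` in dependency order: crypto, x509, tls.
function MbedHandles(libdir::AbstractString)
    lib(name) = joinpath(libdir, string(name, ".", Libdl.dlext))
    crypto = Libdl.dlopen(lib("libmbedcrypto"))
    x509   = Libdl.dlopen(lib("libmbedx509"))
    tls    = Libdl.dlopen(lib("libmbedtls"))
    return MbedHandles(crypto, x509, tls)
end

# Call a function through the handle rather than through a library name.
# `mbedtls_version_get_number` takes no arguments and returns an unsigned int.
function version_number(h::MbedHandles)
    fptr = Libdl.dlsym(h.crypto, :mbedtls_version_get_number)
    return ccall(fptr, Cuint, ())
end
```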

quinnj commented 5 years ago

@fingolfin, @ronisbr, please read the other comments in this thread. There's no consensus that just "reverting" the latest release actually resolves the issue (see @tanmaykm's comment and my own). At this point, I suspect that we're getting a weird interaction with the Julia Base-shipped mbedtls shared libraries conflicting with the MbedTLS.jl-shipped ones and that's causing the issues. I've spent at least 3-4 days trying to track down various root causes, but shared-library loading/symbol resolving is far from my skillset. If someone can show concrete evidence of an mbedtls library version that works on Julia 1.0 and 1.1 without issues, I'm happy to make changes and merge PRs, but so far, AFAIU, we haven't seen that kind of 100% solution.

@tkoolen, similarly, if we can show that your suggestions solve the issue here in all cases, then I'm happy to make changes. There just seems to be a lot of "hey, we should do this!" or "just revert!" comments here without anyone actually going thru the work of seeing if that solves everything.

vtjnash commented 5 years ago

Here's a simple quick reproducer (from inside a Julia v1.0 build):

julia> using Libdl

julia> using HTTP

julia> Libdl.dlopen("libmbedx509.so")
Ptr{Nothing} @0x000055f064544560

julia> HTTP.get("https://google.com")
./julia: symbol lookup error: /home/vtjnash/.julia/packages/MbedTLS/XkQiX/deps/usr/lib/libmbedtls.so: undefined symbol: mbedtls_x509_crt_verify_restartable

...and at this point the process is now killed also.

vtjnash commented 5 years ago

Also, for a debugging trick, set export LD_BIND_NOW=1. This'll cause the errors to surface as assertion failures much sooner. Then we can, for example, just do this minimal demo:

julia> using Libdl

julia> Libdl.dlopen("libmbedtls.so")
Ptr{Nothing} @0x000055e22e1c1830

julia> Libdl.dlopen("/home/vtjnash/.julia/packages/MbedTLS/XkQiX/deps/usr/lib/libmbedtls.so")
ERROR: could not load library "/home/vtjnash/.julia/packages/MbedTLS/XkQiX/deps/usr/lib/libmbedtls.so"
/home/vtjnash/.julia/packages/MbedTLS/XkQiX/deps/usr/lib/libmbedtls.so: undefined symbol: mbedtls_x509_crt_verify_restartable
Stacktrace:
 [1] dlopen(::String, ::UInt32) at /data/vtjnash/julia10/usr/share/julia/stdlib/v1.0/Libdl/src/Libdl.jl:97 (repeats 2 times)
 [2] top-level scope at none:0
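
A related check can be made per library from inside Julia by passing `RTLD_NOW` to `dlopen`, which asks the loader to resolve all of that library's undefined symbols before returning. This is a sketch under that assumption; `eager_load_error` is a hypothetical helper name, and it is not claimed to be equivalent to setting `LD_BIND_NOW` for the whole process.

```julia
using Libdl

# Hypothetical helper: try to open `path` with all symbols bound immediately.
# Returns `nothing` on success, or the loader's error message on failure.
function eager_load_error(path::AbstractString)
    try
        Libdl.dlopen(path, Libdl.RTLD_NOW)
        return nothing
    catch err
        return sprint(showerror, err)
    end
end
```
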

ronisbr commented 5 years ago

This bug was filed on the same day v0.6.7 was released and no one has reported that reverting back to v0.6.6 did not fix the issue. Hence, there is a strong correlation. Of course I cannot know if this fixes the issues for everyone, but it did fix them in all repositories I have tested so far. The previous version can still have problems, but those seem at least more sparse than the current one.

davidanthoff commented 5 years ago

In particular, I read @tanmaykm's comment as saying that reverting did fix things for him.

quinnj commented 5 years ago

@staticfloat and I had a good debugging session today and have a pretty good handle on the root cause (julia-shipped mbedtls binaries interfering w/ MbedTLS.jl binaries). I'll try to get things cleaned up and get a new release out soon.