nim-lang / Nim

Nim is a statically typed compiled systems programming language. It combines successful concepts from mature languages like Python, Ada and Modula. Its design focuses on efficiency, expressiveness, and elegance (in that order of priority).
https://nim-lang.org
Other
16.54k stars 1.47k forks source link

httpclient -d:ssl and db_postgres incompatible and cause SIGSEGV! (MacOSX) #9419

Open treeform opened 6 years ago

treeform commented 6 years ago
import db_postgres
import httpclient

var client = newHttpClient()

discard client.getContent("https://google.com") # crashes here

var db = open("", "", "", "") # never gets to DB open (which should fail)

nim c -r -d:ssl server/crazyssl2

Traceback (most recent call last)
httpclient.nim(819)      crazyssl2
httpclient.nim(369)      getDefaultSSL
net.nim(538)             newContext
SIGSEGV: Illegal storage access. (Attempt to read from nil?)

nim -v

Nim Compiler Version 0.19.0 [MacOSX: amd64]
Compiled at 2018-10-02
Copyright (c) 2006-2018 by Andreas Rumpf

git hash: f673fbd91fa8b75e71feff0df232f6c598653722
active boot switches: -d:release
dom96 commented 6 years ago

I added a:

    if newCTX.isNil:
      raiseSSLError()

Just above net.nim if newCTX.SSLCTXSetCipherList(cipherList) != 1: and got a "error:140A90F1:lib(20):func(169):reason(241)" error. Preliminary search implies that this occurs when openssl is initialised multiple times, perhaps db_postgres is causing that to happen somehow?

treeform commented 6 years ago

Is there a way to check if openSSL was already initialized? I think it happens inside libpq.dylib code not mine.

dom96 commented 6 years ago

Try commenting out the init lines in the openssl (or wherever they are) module.

On Wed, 17 Oct 2018, 23:42 treeform, notifications@github.com wrote:

Is there a way to check if openSSL was already initialized? I think it happens inside libpq.dylib code not mine.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/nim-lang/Nim/issues/9419#issuecomment-430814749, or mute the thread https://github.com/notifications/unsubscribe-auth/AAPDe0cgciecgIgPzCwG-1MCyYY66ugpks5ul7JigaJpZM4Xph8c .

treeform commented 6 years ago

I already tried that commenting "SSL_library_init" does nothing. Same error.

LemonBoy commented 6 years ago

You may be interested in this.

treeform commented 6 years ago
proc pqinitOpenSSL*(do_ssl: int32, do_crypto: int32) {.cdecl, dynlib: dllName, importc: "PQinitOpenSSL".}
echo "calling pqinitOpenSSL"
pqinitOpenSSL(0, 0)

Added this in, compiles and runs but still fails in the same way...

I could try compiling libpg without SSL.

treeform commented 6 years ago

I figured it out! It looks like -d:ssl and libpq grab different libssl.dylibs.

I used otool to figure out what dylib libpq wants:

otool -L /usr/local/Cellar/libpq/10.5/lib/libpq.5.10.dylib
/usr/local/Cellar/libpq/10.5/lib/libpq.5.10.dylib:
        /usr/local/opt/libpq/lib/libpq.5.dylib (compatibility version 5.0.0, current version 5.10.0)
        /usr/local/opt/openssl/lib/libssl.1.0.0.dylib (compatibility version 1.0.0, current version 1.0.0)
        /usr/local/opt/openssl/lib/libcrypto.1.0.0.dylib (compatibility version 1.0.0, current version 1.0.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.50.4)

Its strange but otool does not work on nim programs (does nim use some non-standard system for dylib includes?)

I did this to openssl.nim and it works:

# when useWinVersion:
#   when not defined(nimOldDlls) and defined(cpu64):
#     const
#       DLLSSLName* = "(libssl-1_1-x64|ssleay64|libssl64).dll"
#       DLLUtilName* = "(libcrypto-1_1-x64|libeay64).dll"
#   else:
#     const
#       DLLSSLName* = "(libssl-1_1|ssleay32|libssl32).dll"
#       DLLUtilName* =  "(libcrypto-1_1|libeay32).dll"

#   from winlean import SocketHandle
# else:
#   const versions = "(.1.1|.38|.39|.41|.43|.44|.45|.10|.1.0.2|.1.0.1|.1.0.0|.0.9.9|.0.9.8|)"

#   when defined(macosx):
#     const
#       DLLSSLName* = "libssl" & versions & ".dylib"
#       DLLUtilName* = "libcrypto" & versions & ".dylib"
#   elif defined(genode):
#     const
#       DLLSSLName* = "libssl.lib.so"
#       DLLUtilName* = "libcrypto.lib.so"
#   else:
#     const
#       DLLSSLName* = "libssl.so" & versions
#       DLLUtilName* = "libcrypto.so" & versions
#   from posix import SocketHandle

const
  DLLSSLName* = "/usr/local/opt/openssl/lib/libssl.1.0.0.dylib"
  DLLUtilName* = "/usr/local/opt/openssl/lib/libcrypto.1.0.0.dylib"
from posix import SocketHandle

I think the problem here is that const versions = "(.1.1|.38|.39|.41|.43|.44|.45|.10|.1.0.2|.1.0.1|.1.0.0|.0.9.9|.0.9.8|)" matches some other version of openSSL which is not what libpq selects.

I am not sure how to fix it in the source code though? Should libssl and libcrypot be a -d:ssl_version to be passed?

dom96 commented 6 years ago

huh damn. That's really tricky.

One thing I can think of is:

Problem: is it possible to query the DLLs that are loaded by a process? Can a handle be grabbed to such a process?

treeform commented 5 years ago

This 1 line patch fixes it: https://github.com/nim-lang/Nim/pull/10230 Turns out just searching them in order of newest to oldest fixes this issue.

timotheecour commented 5 years ago

reopening see https://github.com/nim-lang/Nim/issues/10281#issuecomment-453730543 for context; we need to find a proper fix that doens't introduce a regression

timotheecour commented 5 years ago

somewhat relevant: https://github.com/nim-lang/nimble#troubleshooting regarding SSL issues on OSX; but we need proper fix

hiteshjasani commented 5 years ago

Using choosenim with devel#head I'm getting a similar crash.

$ choosenim show
   Channel: devel
   Version: #head
[ ... ]
httpclient.nim(578)      getClient
httpclient.nim(365)      getDefaultSSL
net.nim(536)             newContext
SIGSEGV: Illegal storage access. (Attempt to read from nil?)

Something I learned, I have two versions of OpenSSL installed on MacOSX: 1.x and 1.1.x.

hiteshjasani commented 5 years ago

Okay, I have postgres and httpclient working with nimble build -d:ssl. I'll share what I did to help others.

when defined(windows):
  const
    dllName = "libpq.dll"
elif defined(macosx):
  const
    dllName = "libpq.dylib"
else:
  const
    dllName = "libpq.so(.5|)"

proc pqinitOpenSSL*(do_ssl: int32, do_crypto: int32) {.cdecl, dynlib: dllName, importc: "PQinitOpenSSL".}

proc main() =
  echo "Initting OpenSSL: " & $SSL_library_init()
  echo "Setting up db connection ..."
  pqinitOpenSSL(0, 0)           # Tell postgres not to init OpenSSL
  # ... opening database code
  # ... using httpclient to make http requests

I had both openssl and openssl@1.1 installed via Homebrew on the system. I didn't have either DYLD_LIBRARY_PATH or PKG_CONFIG_PATH defined. Using this method, everything is using openssl as it worked whether openssl@1.1 was installed or not - I tried it both ways.

My env:

$ sw_vers
ProductName:    Mac OS X
ProductVersion: 10.14.4
BuildVersion:   18E226
$ choosenim show
   Channel: devel
   Version: #head
      Path: /Users/hitesh/.choosenim/toolchains/nim-#head
treeform commented 5 years ago

I found that pqinitOpenSSL(0, 0) did not work for me. The only thing that worked is making sure both httpclient (nim stdlib) and libpq (installed via brew) used the same openssl dylib. For me I think it was the openssl 1.0.0 for you it might be some thing else. It all depends on what libpq was hard coded with when it was compiled. If they don't match you get this error.

hiteshjasani commented 5 years ago

@treeform that's strange. Because I was getting the same crash as you in my code before I managed this workaround. And my libpq has similar deps as your version.

otool -L /usr/local/Cellar/libpq/11.2/lib/libpq.5.11.dylib 
/usr/local/Cellar/libpq/11.2/lib/libpq.5.11.dylib:
    /usr/local/opt/libpq/lib/libpq.5.dylib (compatibility version 5.0.0, current version 5.11.0)
    /usr/local/opt/openssl/lib/libssl.1.0.0.dylib (compatibility version 1.0.0, current version 1.0.0)
    /usr/local/opt/openssl/lib/libcrypto.1.0.0.dylib (compatibility version 1.0.0, current version 1.0.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.200.5)

Okay, I tried your code and got the same crash as you. I modified to be this and it worked fine for me.

from openssl import SSL_library_init
import db_postgres, httpclient

when defined(windows):
  const
    dllName = "libpq.dll"
elif defined(macosx):
  const
    dllName = "libpq.dylib"
else:
  const
    dllName = "libpq.so(.5|)"

proc pqinitOpenSSL*(do_ssl: int32, do_crypto: int32) {.cdecl, dynlib: dllName, importc: "PQinitOpenSSL".}

proc main() =
  echo "Initting OpenSSL: " & $SSL_library_init()
  echo "Setting up db connection ..."
  pqinitOpenSSL(0, 0)           # Tell postgres not to init OpenSSL

  var client = newHttpClient()

  discard client.getContent("https://google.com")

  var db = open("", "", "", "") # fails because it can't connect, but no crash

when isMainModule:
  main()
Araq commented 5 years ago

How about we enforce --dynlibOverride for our SSL wrapper and always link "statically" to the lib*.so or whatever the correct terminology is for this clusterfuck.

hiteshjasani commented 5 years ago

Mixed feelings. The security community seems to oscillate between static and dynamically linking OpenSSL. But with how often vulnerabilities are being found and the frequency that OpenSSL is being updated, I'd lean towards keeping dynamic linking and making it more foolproof.

treeform commented 5 years ago

I think the real problem is that Nim just does not support all of the openSSL it claims it supports on all operating systems. So you have to find ssl that both pg and you will support.

This mostly fixes: https://github.com/nim-lang/Nim/pull/10230 - it tells nim to use the newest openSSL you can find.

But that fix runs into this issue: https://github.com/nim-lang/Nim/issues/10281#issuecomment-453730543 - here nim finds some newest openSSL that breaks it. Reverting my fix forces it to start looking for an old openSSL that works.

I think the reason for the revert Could not download: Please upgrade your OpenSSL library, it does not support the necessary protocols. OpenSSL error is: error:14077410:SSL routines:SSL23_GET_SERVER_HELLO:sslv3 alert handshake failure was because the testing machine had an openSSL version nim does not support (but says it does).

treeform commented 5 years ago

I created a spread sheet to track support and versions of different libssl. Turns it its too confusing!

https://docs.google.com/spreadsheets/d/1JgbQS5e2I5bEzFaVYbYrt714CoWGHj-gRIovd7MUfMw/edit?usp=sharing

I think we should drop support for openSSL 0.9.9 as it never shipped.

LibreSSL which is also compatible with openSSL goes by the same name but their version are really confusing. Its really hard to find mapping between libssl.##.so to LibreSSL version.

Its really unclear when any of these shipped, but they shipped before 4/11/2017:

libssl.38.so
libssl.39.so
libssl.10.so
libssl.41.so
libssl.43.so
libssl.44.so

We almost need a test case that can test all versions of openSSL/LibreSSL. Or drop older versions and prevent nim from compiling with them.

genotrance commented 4 years ago

This was already reported earlier in this tread and is the same root cause for issue 122 on the choosenim issue tracker.

choosenim loads libcurl and libssl/libcrypto on OSX using Nim's dynlib loading. libcurl itself is dynamically linked to libssl/libcrypto. If versions don't match, it crashes.

> otool -L /usr/lib/libcurl.dylib
        ...
        /usr/lib/libcrypto.42.dylib (compatibility version 43.0.0, current version 43.0.0)
        /usr/lib/libssl.44.dylib (compatibility version 45.0.0, current version 45.1.0)
        ...
genotrance commented 4 years ago

IRC discussion: https://irclogs.nim-lang.org/12-02-2020.html#17:04:09

genotrance commented 4 years ago

Consensus of the conversation was to link ssl/crypto at compile time instead of loading it at runtime. I tried this for choosenim though and it didn't work:

when defined(macosx):
  --define:curl
  --dynlibOverride:ssl
  --dynlibOverride:crypto
  switch("passL", "-lssl -lcrypto")

--define:ssl

Compiling with this results in a link time error - ld cannot find -lssl nor -lcrypto even though they are in /usr/lib.

We got lucky in that choosenim doesn't really need --define:ssl when libcurl is used to download on osx so we can simply change it to:

when defined(macosx):
  --define:curl
else:
  --define:ssl

This fixes the crash for choosenim but does not really provide a path forward for the general issue. Need help from others to figure out how to link against openssl in a traditional fashion on osx.

cc @alaviss @federico3 @Araq

treeform commented 4 years ago

My thoughts:

I fixed the order but it was reverted because it caused issues for @timotheecour at https://github.com/nim-lang/Nim/pull/10282. Because it was reverted that made me feel unwanted so I did not continue down this path. But I still believe the real solution lives there.

If you apply my patch again does it fix the issues you are having?

I made a sheet detailing every version: https://docs.google.com/spreadsheets/d/1JgbQS5e2I5bEzFaVYbYrt714CoWGHj-gRIovd7MUfMw/edit#gid=0

You can see on the sheet Nim claims to load version that never existed. Nim missed some. Nim loads openSSL out of order. It's not surprising we are having these OpenSSL issues because of this. Some other library is going i'll load the most recent openSSL, while nim goes I'll load a random openSSL user might have laying around.

timotheecour commented 4 years ago

Because it was reverted that made me feel unwanted so I did not continue down this path. But I still believe the real solution lives there.

very sorry to hear the revert made you feel unwanted; I greatly appreciate all your contributions; as I wrote here https://github.com/nim-lang/Nim/issues/10281#issuecomment-453730543 :

@treeform regression fixes always take precedence over bug fixes; #10230 might've fixed a bug but it introduced a regression [breaking nimble and every package], so it made sense to revert until we find a proper fix, and I only reverted the part that was problematic, OSX; I agree the existing code needs to be fixed properly, but priority was to fix the regression first.

derekdai commented 3 years ago

On Ubuntu 20.04, I have the same issue with nake

# nakefile.nim
import nake
import httpclient

task "default", "":
  newHttpClient().downloadFile("https://www.google.com", "index.html")
...
Nim/lib/pure/httpclient.nim(556) :anonymous
Nim/lib/pure/httpclient.nim(325) getDefaultSSL
Nim/lib/pure/net.nim(609) newContext
SIGSEGV: Illegal storage access. (Attempt to read from nil?)

Without nake, httpclient works fine

import httpclient

newHttpClient().downloadFile("https://www.google.com", "index.html")
treeform commented 3 years ago

I have written a HTTP library to solve this problem: https://github.com/treeform/puppy