ndmitchell / hoogle

Haskell API search engine
http://hoogle.haskell.org/

wget (used by hoogle data) fails to accept haskell.org certificates #92

Closed Blaisorblade closed 9 years ago

Blaisorblade commented 9 years ago

hoogle data currently fails for me. This boils down to wget refusing haskell.org's certificate, claiming that it has expired (which it hasn't), while both Chrome and curl are perfectly happy with the certificate.

I am not convinced this is the correct forum to report this, but I don't know the right one and I spent too much time trying to investigate this myself. You might want to consider integrating the download rather than using wget, also because wget is not even available by default on systems like OS X and Windows.

More details below.

Wireshark tells me the downloaded certificate is valid between 14-11-02 13:57:58 (UTC) and 15-10-15 19:48:36 (UTC), so I don't believe wget is making any sense. Googling reveals many people complaining about problems with subject names, but that seems different enough.

$ hoogle data
Downloading downloads/platform.cabal
# platform.cabal (for downloads/platform.cabal)
2014-11-09 18:43:46 URL:http://code.galois.com/darcs/haskell-platform/haskell-platform.cabal [3334/3334] -> "downloads/platform.cabal" [1]
Downloaded downloads/platform.cabal
Downloading downloads/cabal.tar.gz
# cabal.tar.gz (for downloads/cabal.tar.gz)
2014-11-09 18:43:55 URL:http://hackage.haskell.org/packages/index.tar.gz [8064272/8064272] -> "downloads/cabal.tar.gz" [1]
Downloaded downloads/cabal.tar.gz
# cabal.tar (for downloads/cabal.tar)
Extracting tar file downloads/cabal.index
# cabal.tar (for downloads/cabal.index)
Finished extracting tar file downloads/cabal.index
# cabal.tar (for downloads/cabal.index)
Downloading downloads/hoogle.tar.gz
# hoogle.tar.gz (for downloads/hoogle.tar.gz)
2014-11-09 18:44:44 URL:http://hackage.haskell.org/packages/hoogle.tar.gz [18532709/18532709] -> "downloads/hoogle.tar.gz" [1]
Downloaded downloads/hoogle.tar.gz
# hoogle.tar (for downloads/hoogle.tar)
Extracting tar file downloads/hoogle.index
# hoogle.tar (for downloads/hoogle.index)
Finished extracting tar file downloads/hoogle.index
# hoogle.tar (for downloads/hoogle.index)
Downloading downloads/base.txt
# base.txt (for downloads/base.txt)
ERROR: cannot verify www.haskell.org's certificate, issued by '/C=BE/O=GlobalSign nv-sa/CN=GlobalSign Organization Validation CA - G2':
  Issued certificate has expired.
To connect to www.haskell.org insecurely, use `--no-check-certificate'.
hoogle: Error when running Shake build system:
* default.hoo
* platform.hoo
* mtl.hoo
* base.txt
* downloads/base.txt.cache
* downloads/base.txt
Development.Shake.command, system command failed
Command: wget -nv http://www.haskell.org/hoogle/base.txt --output-document=downloads/base.txt
Exit code: 5
Stderr:
ERROR: cannot verify www.haskell.org's certificate, issued by '/C=BE/O=GlobalSign nv-sa/CN=GlobalSign Organization Validation CA - G2':
  Issued certificate has expired.
To connect to www.haskell.org insecurely, use `--no-check-certificate'.

Detailed wget message:

$ wget --debug -S --verbose https://www.haskell.org/hoogle/base.txt
Setting --server-response (serverresponse) to 1
Setting --verbose (verbose) to 1
DEBUG output created by Wget 1.16 on darwin14.0.0.

--2014-11-09 18:42:25--  https://www.haskell.org/hoogle/base.txt
Resolving www.haskell.org... 108.162.204.60, 108.162.203.60, 2400:cb00:2048:1::6ca2:cb3c, ...
Caching www.haskell.org => 108.162.204.60 108.162.203.60 2400:cb00:2048:1::6ca2:cb3c 2400:cb00:2048:1::6ca2:cc3c
Connecting to www.haskell.org|108.162.204.60|:443... connected.
Created socket 5.
Releasing 0x00007fd699d28780 (new refcount 1).
Initiating SSL handshake.
Handshake successful; connected socket 5 to SSL handle 0x00007fd699e00590
certificate:
  subject: /C=US/ST=CA/L=San Francisco/O=CloudFlare, Inc./CN=ssl6957.cloudflare.com
  issuer:  /C=BE/O=GlobalSign nv-sa/CN=GlobalSign Organization Validation CA - G2
ERROR: cannot verify www.haskell.org's certificate, issued by '/C=BE/O=GlobalSign nv-sa/CN=GlobalSign Organization Validation CA - G2':
  Issued certificate has expired.
To connect to www.haskell.org insecurely, use `--no-check-certificate'.
Closed 5/SSL 0x00007fd699e00590

This seems to be dependent on the wget being used — I've reproduced the problem on OS X Yosemite with Homebrew's wget 1.16 (which seems vanilla) and openssl 1.0.1j, not on Linux with Ubuntu Trusty.

Blaisorblade commented 9 years ago

(As a workaround, I forced using --no-check-certificate with a bash wrapper for wget, so at least the immediate problem is solved for me).
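A minimal sketch of what such a wrapper might look like follows. It is not the exact script used above; `REAL_WGET` is a hypothetical variable introduced here for illustration, and the only wget flag involved is the `--no-check-certificate` option that wget itself suggests in the error message. Note that this turns certificate verification off entirely, so it trades security for convenience.

```shell
# Sketch of a wget wrapper that always disables certificate checking.
# REAL_WGET is a hypothetical variable naming the real wget binary;
# neither wget nor hoogle defines it.
wget_insecure() {
  # Prepend the flag and pass every other argument through unchanged.
  "${REAL_WGET:-wget}" --no-check-certificate "$@"
}
```

For hoogle data to pick this up, the same line would presumably have to live in an executable script named wget that comes earlier on PATH than the real one, with `REAL_WGET` set to the real binary's absolute path so the script does not call itself.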

ndmitchell commented 9 years ago

Thanks for the report, this is indeed the correct place to report - I'll make that clearer somewhere.

I think moving to use the Haskell HTTP library is the right choice. Not entirely sure why I didn't do that initially (I think there was some issue), but it's almost certainly simpler with fewer dependencies.

Blaisorblade commented 9 years ago

> Thanks for the report, this is indeed the correct place to report - I'll make that clearer somewhere.

Thanks a lot for your answer! I've seen #91, I was just unsure whether I should report this as a hoogle issue, since hoogle's only fault is depending on wget, and the problem might lie with the SSL library or with something else. But I agree that a Haskell implementation would probably be a good idea.

> I think moving to use the Haskell HTTP library is the right choice. Not entirely sure why I didn't do that initially (I think there was some issue), but it's almost certainly simpler with fewer dependencies.

I'm no expert on Haskell HTTP libraries (except for the last 5 minutes), and that makes sense to me, but haskell.org forces HTTPS on you, which sounds potentially useful in light of #78. However, the HTTP package does not support HTTPS.

So you're left with

I'd hope hoogle needn't deal with this complexity. Refusing to run as root would be one mitigation, but I'm not sure what else can be done. Also, using tls would probably still be useful as a mitigation over plain HTTP; it might just be worse than wget or curl.

crisoagf commented 9 years ago

As far as I understand, the problem is that wiki.haskell.org doesn't provide the whole ssl certification chain, as openssl (which wget uses) expects (see this stackoverflow entry).

And, also as far as I understand, the current interpretation is that this isn't a bug in openssl, but rather a missing (non-essential) feature.
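One rough way to check whether a server sends its intermediate certificates is to count the certificates in its handshake. The sketch below assumes the `-showcerts` option of `openssl s_client`, which prints every certificate the server sent; the helper name `count_certs` is made up for illustration.

```shell
# Count the PEM certificates found on standard input.
count_certs() {
  grep -c -e '-----BEGIN CERTIFICATE-----'
}

# Possible usage against a live server (needs network access):
#   openssl s_client -connect wiki.haskell.org:443 -showcerts </dev/null 2>/dev/null | count_certs
```

A count of 1 would suggest that only the leaf certificate was sent, so the client has to find the intermediates on its own. A higher count does not by itself prove the chain is complete or in the right order.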

Blaisorblade commented 9 years ago

@crisoagf : thanks for your answer! I'm no expert, but it sounds like, by Postel's law "Be conservative in what you send, be liberal in what you accept", www.haskell.org might want to send the whole certificate chain? From that StackOverflow answer it seems that that's the standard thing to do, and it would be more compatible.

> wiki.haskell.org

I guess that's a typo, but can you confirm that www.haskell.org (or whatever server hoogle data fails to access) has this behavior?

crisoagf commented 9 years ago

OK, I don't know if I should file a different bug report, but on my workstation a bug that looks quite similar to this one (which is why I mistook one for the other) only occurs when hoogle data tries to fetch data from the Haskell wiki, namely from www.haskell.org/haskellwiki/Keywords, which redirects to wiki.haskell.org, where the chain error occurs. Maybe you could check whether the bug still occurs for you in the same way? (wget --debug -S --verbose https://www.haskell.org/hoogle/base.txt works fine for me.)

Perhaps the www.haskell.org certificate originally did have some sort of fault, and this issue is solved now?

Should I file a new bug?

EDIT: little explanation to why I mentioned wiki.haskell.org. EDIT2: haskell.org's certificate has never expired

Blaisorblade commented 9 years ago

> Should I file a new bug?

No clue whether the solutions are separate (I'm just a hoogle user). But it might help to have a complete bug report anyway. And maybe your issue is fixable by "simply" changing the wiki setup — which is easier than other changes we've been discussing here.

For me, wget --debug -S --verbose https://www.haskell.org/hoogle/base.txt doesn't work yet, with essentially the same error as before. My wget doesn't like https://wiki.haskell.org/Keywords either, but with a different error.

Setting --server-response (serverresponse) to 1
Setting --verbose (verbose) to 1
DEBUG output created by Wget 1.16.1 on darwin14.0.0.

--2015-01-26 16:37:51--  https://www.haskell.org/hoogle/base.txt
Resolving www.haskell.org... 23.253.242.70, 2400:cb00:2048:1::6ca2:cc3c, 2400:cb00:2048:1::6ca2:cb3c
Caching www.haskell.org => 23.253.242.70 2400:cb00:2048:1::6ca2:cc3c 2400:cb00:2048:1::6ca2:cb3c
Connecting to www.haskell.org|23.253.242.70|:443... connected.
Created socket 5.
Releasing 0x00007fd3b0525bc0 (new refcount 1).
Initiating SSL handshake.
Handshake successful; connected socket 5 to SSL handle 0x00007fd3b04dc7c0
certificate:
  subject: CN=*.haskell.org,OU=Domain Control Validated
  issuer:  CN=GlobalSign Domain Validation CA - SHA256 - G2,O=GlobalSign nv-sa,C=BE
ERROR: cannot verify www.haskell.org's certificate, issued by 'CN=GlobalSign Domain Validation CA - SHA256 - G2,O=GlobalSign nv-sa,C=BE':
  Issued certificate has expired.
To connect to www.haskell.org insecurely, use `--no-check-certificate'.
Closed 5/SSL 0x00007fd3b04dc7c0

The query suggested by stackoverflow on www.haskell.org gives a different problem, but I'm not sure where my certificate store is, so maybe it'd be happy with a properly configured store.

$ openssl s_client -connect www.haskell.org:443
CONNECTED(00000003)
depth=2 /C=BE/O=GlobalSign nv-sa/OU=Root CA/CN=GlobalSign Root CA
verify error:num=19:self signed certificate in certificate chain
verify return:0
---
Certificate chain
 0 s:/OU=Domain Control Validated/CN=*.haskell.org
   i:/C=BE/O=GlobalSign nv-sa/CN=GlobalSign Domain Validation CA - SHA256 - G2
 1 s:/C=BE/O=GlobalSign nv-sa/CN=GlobalSign Domain Validation CA - SHA256 - G2
   i:/C=BE/O=GlobalSign nv-sa/OU=Root CA/CN=GlobalSign Root CA
 2 s:/C=BE/O=GlobalSign nv-sa/OU=Root CA/CN=GlobalSign Root CA
   i:/C=BE/O=GlobalSign nv-sa/OU=Root CA/CN=GlobalSign Root CA
---
Server certificate
-----BEGIN CERTIFICATE-----
MIIFMjCCBBqgAwIBAgISESHf9sIzriSJsUica80pzcLLMA0GCSqGSIb3DQEBCwUA
MGAxCzAJBgNVBAYTAkJFMRkwFwYDVQQKExBHbG9iYWxTaWduIG52LXNhMTYwNAYD
VQQDEy1HbG9iYWxTaWduIERvbWFpbiBWYWxpZGF0aW9uIENBIC0gU0hBMjU2IC0g
RzIwHhcNMTQxMDE1MDQxOTQ4WhcNMTUxMTE1MDYyODEwWjA7MSEwHwYDVQQLExhE
b21haW4gQ29udHJvbCBWYWxpZGF0ZWQxFjAUBgNVBAMMDSouaGFza2VsbC5vcmcw
ggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQDENZbesvZzJd9ND4pLQWjj
o7+OBDcDMkyVjSPaPiMyAkwar/1KIw7DSDnlFF06fpc9ZWTJA0+jQZTdAMPTtCfH
W+0gcSzhUlqUfMXKYARMP32wNNhzg0BU8PuF8rvBnI4zFl/UgJUN3wtj0HoZzjPP
AOgJyDdRifuVmNbdML11xBk98u9xXO7eFe0OBdelf7WNVZ4gDgIbCoqHUsSn6R5q
EzStdYUxk7dfWvhEicRTKk72zzSgIsKLA7PBk/LnJM5mk3qR5rKAtHJFA8FAvJDt
0JN6KakWJWkrlMO1KbFyVDKRzxRK3VHmEzzKmljdIHV3E7C6QM5NRFO6Qg0VVZAR
AgMBAAGjggIJMIICBTAOBgNVHQ8BAf8EBAMCBaAwSQYDVR0gBEIwQDA+BgZngQwB
AgEwNDAyBggrBgEFBQcCARYmaHR0cHM6Ly93d3cuZ2xvYmFsc2lnbi5jb20vcmVw
b3NpdG9yeS8wYgYDVR0RBFswWYINKi5oYXNrZWxsLm9yZ4IYYXV0b2Rpc2NvdmVy
Lmhhc2tlbGwub3JnghBtYWlsLmhhc2tlbGwub3Jngg9vd2EuaGFza2VsbC5vcmeC
C2hhc2tlbGwub3JnMAkGA1UdEwQCMAAwHQYDVR0lBBYwFAYIKwYBBQUHAwEGCCsG
AQUFBwMCMEMGA1UdHwQ8MDowOKA2oDSGMmh0dHA6Ly9jcmwuZ2xvYmFsc2lnbi5j
b20vZ3MvZ3Nkb21haW52YWxzaGEyZzIuY3JsMIGUBggrBgEFBQcBAQSBhzCBhDBH
BggrBgEFBQcwAoY7aHR0cDovL3NlY3VyZS5nbG9iYWxzaWduLmNvbS9jYWNlcnQv
Z3Nkb21haW52YWxzaGEyZzJyMS5jcnQwOQYIKwYBBQUHMAGGLWh0dHA6Ly9vY3Nw
Mi5nbG9iYWxzaWduLmNvbS9nc2RvbWFpbnZhbHNoYTJnMjAdBgNVHQ4EFgQUrQzy
kNxJj7oakCyrdIMDlinRuWAwHwYDVR0jBBgwFoAU6k581IAt5RWBhiaMgm3AmKTP
lw8wDQYJKoZIhvcNAQELBQADggEBAC/UdRPMPmkKH6196Zrp2ycNaLJKdMNZzYno
RMNYltfI4+7YVu9u6XgR77cDZBZHGNodzrKcVlmz88Km0SASgshk6PR9AZT+gzld
Qe9bg9e4O8tk5tpswXcHAdvmjD0iJSrWI57yiLRavi8Z/MwprTJ3FpIX4wX+5kVi
pRjPqCBknWkkw4bC3V1Hgp0PNcHHnj8XKXZC+SpOz5hGw99S/0373nPJ0SeRNNjb
AL/Pl4lc3LUxaUmBmqri9F2Fdh6AriBDcjw4C4pm4mkHXF1GzC+MOt5teCJh+ko/
ANLs9bO+LX4EeAfN4pM4fW2zeHw5wo21n7MYh9FvAF3O73UUHlA=
-----END CERTIFICATE-----
subject=/OU=Domain Control Validated/CN=*.haskell.org
issuer=/C=BE/O=GlobalSign nv-sa/CN=GlobalSign Domain Validation CA - SHA256 - G2
---
No client certificate CA names sent
---
SSL handshake has read 4055 bytes and written 328 bytes
---
New, TLSv1/SSLv3, Cipher is DHE-RSA-AES256-SHA
Server public key is 2048 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
SSL-Session:
    Protocol  : TLSv1
    Cipher    : DHE-RSA-AES256-SHA
    Session-ID: 468B186A5F98E558CB0E08744F78EDC668FA6AD224A41F462556E88EABEE2975
    Session-ID-ctx:
    Master-Key: 3FA3372CB95F77FD67B7C05DBCAFCA6C76D200FEC5C95CC09E757A407F0D1AE77D67473380A1638F294E445E9BD8FF02
    Key-Arg   : None
    Start Time: 1422291465
    Timeout   : 300 (sec)
    Verify return code: 0 (ok)
---
closed

crisoagf commented 9 years ago

I think that neither bug is really a hoogle bug, in the sense that neither is caused by faulty code: in one case the problem is a server misconfiguration and, in the other, probably a misconfigured certificate store. Could you check whether that is your problem? Since you are on Mac OS and using Homebrew, perhaps you could try the solution in this superuser.com entry (the comment on the solution is relevant). The solution in your case should be analogous but not identical, since you are using Homebrew and not MacPorts.

ndmitchell commented 9 years ago

I think this is just another instance of #96 (albeit one that seemingly occurred way before that one got broken...), so the fix there (of disabling certificate checking) should work similarly. Please can you reopen if the issue persists with 4.2.38.

Blaisorblade commented 9 years ago

@crisoagf thanks for the link, the outputs I posted were bogus; when I used the correct CA store path, I got the "certificate expired" error, and ended up with https://github.com/Homebrew/homebrew/issues/32251 which turns out to be an OpenSSL bug.
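For anyone repeating the check with an explicit CA store, a sketch along these lines may help. The `-CAfile` option of `openssl s_client` is real, but the bundle path shown is only an assumption about where Homebrew's OpenSSL keeps its store, and `verify_summary` is a helper name made up here to filter the long output down to the verification lines.

```shell
# Keep only the verification-related lines of `openssl s_client` output
# read from standard input.
verify_summary() {
  grep -E '^(verify error:|[[:space:]]*Verify return code:)'
}

# Possible usage (needs network access; the path is an assumption):
#   CA_BUNDLE=/usr/local/etc/openssl/cert.pem
#   openssl s_client -connect www.haskell.org:443 -CAfile "$CA_BUNDLE" </dev/null 2>&1 | verify_summary
```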

For gorier details (for future reference): the root certification authority's certificate expired in 2014 and was then extended, so Mac OS X has both the expired and the new certificate for the same authority. Homebrew dumps all OS X certificates to initialize OpenSSL's certificate store, but OpenSSL stops at the expired certificate and ignores the newer one. My final solution (which I don't necessarily recommend) was just to erase the expired certificate and regenerate the store.

sudo security delete-certificate -Z 2F173F7DE99667AFA57AF80AA2D1B12FAC830338 /System/Library/Keychains/SystemRootCertificates.keychain
brew postinstall openssl
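Before deleting anything, it may be worth confirming that a given certificate really has expired. The sketch below relies on `openssl x509 -checkend`, which exits 0 when the certificate is still valid the given number of seconds from now; the helper name `cert_is_expired` is made up for illustration, and `openssl x509` only inspects the first certificate in the file, so a multi-certificate bundle would have to be split first.

```shell
# Exit 0 if the first PEM certificate in file $1 has already expired,
# 1 if it is still valid, 2 if the file cannot be read.
cert_is_expired() {
  [ -r "$1" ] || return 2
  # -checkend 0 asks: "is the certificate still valid right now?"
  ! openssl x509 -checkend 0 -noout -in "$1" >/dev/null 2>&1
}

# Possible usage (the file name is hypothetical):
#   cert_is_expired root-ca.pem && echo "expired"
```

A file that is readable but not a valid certificate also makes openssl fail and would be reported as expired, so the result is only meaningful for well-formed PEM input.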