apache / couchdb

Seamless multi-master syncing database with an intuitive HTTP/JSON API, designed for reliability
https://couchdb.apache.org/
Apache License 2.0

CouchDB 3.0 Windows 10 crashing as soon as it starts #2680

Closed nkev closed 4 years ago

nkev commented 4 years ago

Description

I can't run CouchDB v3.0 on Windows 10 Version 10.0.18362 Build 18362. I am logged on to Windows as Administrator, and I have .NET Framework 3.5 installed.

After I install apache-couchdb-3.0.0.msi, the URL https://localhost:5984/_utils/ returns ERR_CONNECTION_REFUSED.

The Windows Event Log shows that nssm.exe is constantly trying to restart couchdb.cmd, but it fails every time. Here are the four nssm entries in Windows Event Viewer (each line is a separate entry). The four lines keep repeating in the log, so I killed the nssm.exe process for now:

Started C:\CouchDB\bin\couchdb.cmd  for service Apache CouchDB in C:\CouchDB\bin.
Program C:\CouchDB\bin\couchdb.cmd for service Apache CouchDB exited with return code 1.
Killing process tree of process 19792 for service Apache CouchDB with exit code 1
Killing PID 19792 in process tree of PID 19792 because service Apache CouchDB is stopping.

If I run C:\CouchDB\bin\couchdb.cmd manually I get:

C:\CouchDB\bin>couchdb.cmd
kernel-poll not supported; "K" parameter ignored
{"Kernel pid terminated",application_controller,"{application_start_failure,couch_epi,{{shutdown,{failed_to_start_child,\"couch_epi|chttpd_auth|keeper\",{undef,[{crypto,hash,[md5,<<131,106>>],[]},{couch_epi_util,hash,1,[{file,\"src/couch_epi_util.erl\"},{line,25}]},{couch_epi_functions,data,1,[{file,\"src/couch_epi_functions.erl\"},{line,33}]},{couch_epi_module_keeper,do_reload_if_updated,1,[{file,\"src/couch_epi_module_keeper.erl\"},{line,116}]},{gen_server,init_it,2,[{file,\"c:/relax/otp/lib/stdlib/src/gen_server.erl\"},{line,365}]},{gen_server,init_it,6,[{file,\"c:/relax/otp/lib/stdlib/src/gen_server.erl\"},{line,333}]},{proc_lib,init_p_do_apply,3,[{file,\"c:/relax/otp/lib/stdlib/src/proc_lib.erl\"},{line,247}]}]}}},{couch_epi_app,start,[normal,[]]}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,couch_epi,{{shutdown,{failed_to_start_child,"couch_epi|chttpd_auth|keeper",{undef,[{crypto,hash,[md5,<<131,106>>],[]},{couch_

Crash dump is being written to: erl_crash.dump...C:\CouchDB>

I can send you the crash dump on request.

Steps to Reproduce

Follow official installation steps for Windows 10

wohali commented 4 years ago

Looks like the crypto libraries aren't running correctly on this install for you.

Can you reproduce this on another Windows 10 install? Others are not finding this issue.

nkev commented 4 years ago

I don't have another Windows machine right now, unfortunately. Is there a way to fix the crypto libraries? I did have 2.3.1 running on this same Windows 10 PC. I uninstalled it before installing v3.0.

wohali commented 4 years ago

Can you double-check the listing of all the DLLs in your CouchDB\bin directory for me? Please paste a directory listing into this issue.

Did you happen to already have OpenSSL installed on this machine anywhere?

browntownington commented 4 years ago

Hi, I can confirm that I have a similar, if not the same, issue as the OP: CouchDB 3.0 on Windows 10.

In Windows Event Viewer:

Started C:\CouchDB\bin\couchdb.cmd  for service Apache CouchDB in C:\CouchDB\bin.

Program C:\CouchDB\bin\couchdb.cmd for service Apache CouchDB exited with return code 1.

Killing process tree of process 21608 for service Apache CouchDB with exit code 1

Killing PID 21608 in process tree of PID 21608 because service Apache CouchDB is stopping.

Service Apache CouchDB action for exit code 1 is Restart. Attempting to restart C:\CouchDB\bin\couchdb.cmd.

Started C:\CouchDB\bin\couchdb.cmd  for service Apache CouchDB in C:\CouchDB\bin.

REPEAT..

Identical output when I run C:\CouchDB\bin\couch

C:\CouchDB>C:\CouchDB\bin\couchdb.cmd
kernel-poll not supported; "K" parameter ignored
2020-03-21 09:06:18 inet_parse:~p:~p: erroneous line, SKIPPED~n
        "c:/WINDOWS/System32/drivers/etc/hosts"
        1
{"Kernel pid terminated",application_controller,"{application_start_failure,couch_epi,{{shutdown,{failed_to_start_child,\"couch_epi|chttpd_auth|keeper\",{undef,[{crypto,hash,[md5,<<131,106>>],[]},{couch_epi_util,hash,1,[{file,\"src/couch_epi_util.erl\"},{line,25}]},{couch_epi_functions,data,1,[{file,\"src/couch_epi_functions.erl\"},{line,33}]},{couch_epi_module_keeper,do_reload_if_updated,1,[{file,\"src/couch_epi_module_keeper.erl\"},{line,116}]},{gen_server,init_it,2,[{file,\"c:/relax/otp/lib/stdlib/src/gen_server.erl\"},{line,365}]},{gen_server,init_it,6,[{file,\"c:/relax/otp/lib/stdlib/src/gen_server.erl\"},{line,333}]},{proc_lib,init_p_do_apply,3,[{file,\"c:/relax/otp/lib/stdlib/src/proc_lib.erl\"},{line,247}]}]}}},{couch_epi_app,start,[normal,[]]}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,couch_epi,{{shutdown,{failed_to_start_child,"couch_epi|chttpd_auth|keeper",{undef,[{crypto,hash,[md5,<<131,106>>],[]},{couch_

Crash dump is being written to: erl_crash.dump...done

Lines 3-5 above are the only differences from the OP's output.

I do not have OpenSSL installed (that I am aware of), but I do have OpenVPN (client) installed.

The DLL listing in CouchDB\bin is as follows: icudt61 icuin61 icuio61 icutu61 icuuc61 libcrypto-1_1-x64 libcurl libssl-1_1-x64 mozjs-60 nspr4 plc4 plds4 zlib1

wohali commented 4 years ago

Lines 3-5 should not matter.

Please confirm the specific version of Windows 10 you're on - the build number - and please confirm you're on 64-bit?

wohali commented 4 years ago

My problem here is that I can't reproduce this problem on a brand new Windows 10 64-bit (19xx build number) on an AWS instance. Without a reproducible bug here, I'm not sure what step to take next.

The core part of the error is: {undef,[{crypto,hash,[md5,<<131,106>>] which means the call to crypto:hash/2 failed because that function was undefined at run time, which usually indicates that the crypto module itself failed to load. The Erlang crypto module relies on OpenSSL for its algorithms, which is why the OpenSSL libraries are included in the binary download.

The only theory I have is that something in your PATH is intercepting the DLL linkage to the libcrypto/libssl libraries.

Things you could try:

Let me know if any of this helps.
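
One way to check the PATH theory above is to list every PATH directory that holds a copy of the two OpenSSL DLLs. The following is a minimal Python sketch, not part of CouchDB; the helper name `find_on_path` is made up for illustration, and the DLL names are taken from the CouchDB\bin listing posted earlier in this thread.

```python
import os
from pathlib import Path

# DLL names as reported in the CouchDB\bin listing earlier in this thread.
DLL_NAMES = ("libcrypto-1_1-x64.dll", "libssl-1_1-x64.dll")


def find_on_path(dll_name, path_value=None, sep=os.pathsep):
    """Return every PATH directory that contains dll_name, in PATH order.

    Hypothetical helper for diagnosis only; it inspects PATH and nothing else.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for entry in path_value.split(sep):
        entry = entry.strip().strip('"')  # PATH entries are sometimes quoted
        if not entry:
            continue
        candidate = Path(os.path.expandvars(entry)) / dll_name
        if candidate.is_file():
            hits.append(str(candidate.parent))
    return hits


if __name__ == "__main__":
    for name in DLL_NAMES:
        print(name)
        for directory in find_on_path(name):
            print("   ", directory)
```

If a directory other than CouchDB\bin is listed ahead of it for either DLL, that directory is a candidate for the conflict.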

browntownington commented 4 years ago

@wohali thanks for putting some solid thought into this issue...

My Windows 10 laptop (with this issue): Microsoft Windows [Version 10.0.17763.1098] - 64-bit

I have a fresh install of Windows 10 on a laptop and can confirm it's working without issues: Microsoft Windows [Version 10.0.18363.657] - 64-bit

I have attached screenshots of the SYSTEM and USER PATH. I have a lot of variables, so it might take me some time to remove everything.

I will advise on the search results soon.

[Screenshots: system PATH, user PATH]

nkev commented 4 years ago

@browntownington Did you previously have CouchDB v2.3.1 installed like me? If so, maybe @wohali could try installing that first and upgrading to v3.0 to try and reproduce the issue.

@wohali Here is the info you asked for:

  1. CouchDB/bin contents: [screenshot]

  2. I do have OpenSSL installed: [screenshot]

browntownington commented 4 years ago

@nkev I'm brand new to CouchDB... v3.0 is my first install / experience.

I suspect @wohali is on the right track with libcrypto and libssl / conflicting PATH variables.

@wohali can you explain a little more about how CouchDB uses libssl and libcrypto?

When I search for libssl and libcrypto in C:\CouchDB\bin, I get the following two files: libssl-1_1-x64.dll and libcrypto-1_1-x64.dll.

If I search for those exact file names, I find 12 results (as displayed in the screenshots):

[Screenshots: libssl search results, libcrypto search results]

If I search for 'libssl' and 'libcrypto', I get hundreds of results, so I assume CouchDB is specifically looking for libssl-1_1-x64.dll and libcrypto-1_1-x64.dll? libcrypto.dll and libssl.dll live in system32.

nkev commented 4 years ago

@browntownington It looks like you also have Couchbase installed. For the record, I don't have it installed, and I'm not sure whether it would cause any clashes, but it's worth noting in case your issue is different.

I just uninstalled OpenSSL and CouchDB 3.0, deleted the C:\CouchDB residue folder, rebooted Windows, installed CouchDB 3.0 again and unfortunately, I still have the same issue.

Windows Event Viewer again shows CouchDB keeps starting several times a second:

[Screenshot: Windows Event Viewer]

Here are my Local Disk searches on libssl and libcrypto:

[Screenshots: Local Disk search results for libssl and libcrypto]

browntownington commented 4 years ago

@nkev I think I am going to take @wohali's original advice and remove ALL PATH variables, test, then add them back in one by one and test. Will report back soon.
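
The "clear everything, then add entries back one at a time" procedure can be sketched in Python as below. This is only an illustration of the search strategy: `find_culprit` and the `starts_ok` callback are hypothetical names, and in practice `starts_ok` would be a manual step (set the PATH, reboot if needed, try to start CouchDB) rather than a function call. The sketch assumes CouchDB starts correctly with none of the entries present.

```python
def find_culprit(entries, starts_ok):
    """Re-add PATH entries one by one; return the first that breaks startup.

    entries:   PATH entries in the order they will be restored.
    starts_ok: hypothetical callback taking the list of currently active
               entries and returning True if CouchDB starts with them.
    Returns None if no entry makes startup fail.
    """
    active = []
    for entry in entries:
        trial = active + [entry]
        if starts_ok(trial):
            active = trial  # entry is harmless, keep it and move on
        else:
            return entry    # startup broke right after adding this entry
    return None
```

With many PATH entries, restoring them in halves instead of singly would need fewer reboots, at the cost of a slightly more involved procedure.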

nkev commented 4 years ago

Ok. The odd thing is CouchDB v2.3.1 installs and works fine, even with all those other DLLs. CouchDB v3.0 obviously does something differently.

browntownington commented 4 years ago

@nkev @wohali I've resolved the issue. It took a bit of time to remove ALL USER and SYSTEM PATH variables and add them back in one at a time (rebooting the system each time).

And the culprit was....

OPENVPN

Which is funny, because I suspected this originally. I'm not sure what implications removing this from the PATH variable will have for OpenVPN (but since I am not using the client at the moment, it's not much of an issue).

I will be curious to see what can be done about this compatibility issue in the future.

browntownington commented 4 years ago

It makes sense that it was OpenVPN, because OpenVPN and CouchDB both ship files with the exact same names, libssl-1_1-x64.dll and libcrypto-1_1-x64.dll, and OpenVPN had an entry in the SYSTEM PATH variable. So CouchDB was obviously using OpenVPN's libssl-1_1-x64.dll and libcrypto-1_1-x64.dll instead of its own?!
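
The effect of PATH ordering described above can be modelled with a few lines of Python. This is a simplified sketch that only models an ordered scan of PATH directories where the first match wins; the real Windows loader also consults other locations. The function name `first_provider` and the OpenVPN install path are assumptions for illustration, and the directory listings are made up apart from the DLL names mentioned in this thread.

```python
def first_provider(dll_name, search_dirs, contents):
    """Return the first directory in search_dirs whose listing has dll_name."""
    wanted = dll_name.lower()  # Windows file names are case-insensitive
    for directory in search_dirs:
        if wanted in (name.lower() for name in contents.get(directory, ())):
            return directory
    return None


# Hypothetical install locations and listings, for illustration only.
COUCHDB_BIN = r"C:\CouchDB\bin"
OPENVPN_BIN = r"C:\Program Files\OpenVPN\bin"
SYSTEM32 = r"C:\Windows\system32"
CONTENTS = {
    COUCHDB_BIN: ["libcrypto-1_1-x64.dll", "libssl-1_1-x64.dll"],
    OPENVPN_BIN: ["libcrypto-1_1-x64.dll", "libssl-1_1-x64.dll"],
    SYSTEM32: ["libcrypto.dll", "libssl.dll"],
}

inherited = [SYSTEM32, OPENVPN_BIN]
# CouchDB's directory appended after everything already on PATH.
appended = inherited + [COUCHDB_BIN]
# CouchDB's directory first, unrelated software dropped.
restricted = [COUCHDB_BIN, SYSTEM32]

print(first_provider("libcrypto-1_1-x64.dll", appended, CONTENTS))
print(first_provider("libcrypto-1_1-x64.dll", restricted, CONTENTS))
```

In this model the appended search order resolves to the OpenVPN directory, while the restricted order resolves to CouchDB's own copy.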

wohali commented 4 years ago

Hey @browntownington , thanks for the detective work for this issue. The way Windows library binding works, it'll hunt for the DLL in the PATH first.

A fix for us will be to 100% override the path in the couchdb.cmd file. We will forcibly restrict the PATH variable to just the Windows directory and CouchDB itself. That should resolve the problem, since CouchDB is self-contained (except for the C++ libraries we need from Microsoft.)

I'll look at this next week; the fix will be in 3.0.1.

nkev commented 4 years ago

That's great work @browntownington. Thank you for your effort. I'll try that soon and report back to hopefully close this issue.

nkev commented 4 years ago

I removed the conflicting PATH variables and my system also works now, thanks to both of you. However, Fauxton only works through HTTP now, not HTTPS, but I can live with that, and I will wait for v3.0.1+ instead of tackling that!

Thank you both again.

nkev commented 4 years ago

@wohali, FYI, I just noticed v3.0 also only works on HTTP, not HTTPS, on my MacBook.

wohali commented 4 years ago

@nkev Correct, CouchDB never shipped out of the box with HTTPS support - not without you putting in your own certificates and setting up the config yourself.

wohali commented 4 years ago

If you need a quick fix for this, in your couchdb.cmd file, change this line:

set PATH=%PATH%;%COUCHDB_BIN_DIR%

to this:

set PATH=%COUCHDB_BIN_DIR%;%SystemRoot%\system32;%SystemRoot%;%SystemRoot%\System32\Wbem;%SYSTEMROOT%\System32\WindowsPowerShell\v1.0\

We'll have this patched in CouchDB 3.0.1+.
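
For anyone who would rather not edit couchdb.cmd, the same restricted PATH can be built outside the script and passed to the process that launches it. The following is a minimal Python sketch under that assumption; `restricted_path` and `launch` are hypothetical helpers, not part of CouchDB, and the entries simply mirror the `set PATH=...` line above.

```python
import ntpath
import os
import subprocess


def restricted_path(couchdb_bin, system_root):
    """Build a PATH holding only CouchDB's bin directory and core Windows dirs.

    Mirrors the entries of the replacement `set PATH=...` line above.
    ntpath is used so the result is Windows-style on any host.
    """
    entries = [
        couchdb_bin,
        ntpath.join(system_root, "system32"),
        system_root,
        ntpath.join(system_root, "System32", "Wbem"),
        ntpath.join(system_root, "System32", "WindowsPowerShell", "v1.0"),
    ]
    return ";".join(entries)


def launch(couchdb_bin):
    """Start couchdb.cmd with the restricted PATH (Windows only)."""
    system_root = os.environ.get("SystemRoot", r"C:\Windows")
    env = dict(os.environ)
    env["PATH"] = restricted_path(couchdb_bin, system_root)
    script = ntpath.join(couchdb_bin, "couchdb.cmd")
    return subprocess.Popen(["cmd.exe", "/c", script], env=env, cwd=couchdb_bin)
```

Calling `launch(r"C:\CouchDB\bin")` would start CouchDB in the foreground with that environment. Note that an unpatched couchdb.cmd still appends its own bin directory to whatever PATH it inherits, which is harmless here because that directory is already first.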