3118325ms th_a application.cpp:1051 startup_plugins ] Starting plugin elasticsearch
Thread 1 "witness_node" received signal SIGSEGV, Segmentation fault.
__strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:65
65 ../sysdeps/x86_64/multiarch/strlen-avx2.S: No such file or directory.
(gdb) bt
#0 __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:65
#1 0x00007ffff7bade10 in ?? () from /lib/x86_64-linux-gnu/libcurl.so.4
#2 0x00007ffff7bb73f8 in ?? () from /lib/x86_64-linux-gnu/libcurl.so.4
#3 0x00007ffff7bb89d1 in curl_multi_perform () from /lib/x86_64-linux-gnu/libcurl.so.4
#4 0x00007ffff7baee4b in curl_easy_perform () from /lib/x86_64-linux-gnu/libcurl.so.4
#5 0x0000555558afd1f8 in graphene::utilities::doCurl[abi:cxx11](graphene::utilities::CurlRequest&) ()
#6 0x0000555558afda00 in graphene::utilities::checkES(graphene::utilities::ES&) ()
#7 0x00005555583d8e31 in graphene::elasticsearch::elasticsearch_plugin::plugin_startup() ()
#8 0x00005555581038cb in graphene::app::detail::application_impl::startup_plugins() const ()
#9 0x000055555810b0ad in graphene::app::detail::application_impl::startup() ()
#10 0x000055555810b3a4 in graphene::app::application::startup() ()
#11 0x00005555580e47f4 in main ()
The issue cannot be reproduced reliably; it occurs at random.
The curl object is used like a global variable in the program, so it has a long lifetime. Since we do not clean up after each query, we need to overwrite or reset the options before every query.
We didn't specify CURLOPT_HTTPGET in doCurl() for GET, so it may actually be sending a POST, in which case libcurl may try to access CURLOPT_POSTFIELDS.
We didn't reset CURLOPT_POSTFIELDS in doCurl() for GET. It is a pointer, and it was pointing to a temporary variable that had already been destroyed, so the memory address may or may not be accessible. Either way, accessing it is wrong.
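The dangling-pointer hazard described above can be sketched without libcurl. This is a minimal illustration, not the plugin's code: FakeHandle, ScopedPostFields, and run_post are hypothetical names, and FakeHandle only imitates the relevant behaviour of a long-lived handle that stores a raw pointer to the request body without copying it. The guard clears the pointer when the request scope ends, so a later request on the same handle cannot read freed memory.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a long-lived handle. Like CURLOPT_POSTFIELDS,
// it keeps only a raw pointer to the body and never copies the data.
struct FakeHandle {
    const char* post_fields = nullptr;
    std::size_t post_size = 0;
};

// RAII guard: attaches a body for the duration of one request and detaches
// it on scope exit, so the handle never keeps a pointer to a dead string.
class ScopedPostFields {
public:
    ScopedPostFields(FakeHandle& handle, const std::string& body) : handle_(handle) {
        handle_.post_fields = body.data();
        handle_.post_size = body.size();
    }
    ~ScopedPostFields() {
        handle_.post_fields = nullptr;
        handle_.post_size = 0;
    }
    ScopedPostFields(const ScopedPostFields&) = delete;
    ScopedPostFields& operator=(const ScopedPostFields&) = delete;

private:
    FakeHandle& handle_;
};

// Simulates one POST: the body is a local that dies when the function returns.
// Returns the number of body bytes the handle could see during the request.
std::size_t run_post(FakeHandle& handle, const std::string& payload) {
    std::string body = payload;            // local copy with a short lifetime
    ScopedPostFields guard(handle, body);  // pointer is valid only in this scope
    assert(handle.post_fields != nullptr);
    return handle.post_size;
}
```

After run_post() returns, the handle no longer points at the destroyed body, which is the state a following GET on the same handle needs.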
... (CURLOPT_CUSTOMREQUEST) is particularly useful, for example, for performing an HTTP DELETE request.
To switch to a proper HEAD use CURLOPT_NOBODY, to switch to a proper POST use CURLOPT_POST or CURLOPT_POSTFIELDS and to switch to a proper GET use CURLOPT_HTTPGET.
By the way, there are quite a few other design flaws in the plugin; ideally we should refactor it when we have time.
Host Environment
Please provide details about the host environment. Much of this information can be found running: witness_node --version.
Host OS: Ubuntu 20.04.2 LTS
Host Physical RAM -
BitShares Version: 5.2.1
OpenSSL Version: 1.1.1f
Boost Version: 1.71
libcurl4-openssl-dev 7.68.0-1ubuntu2.6
Additional Context
The crash started after I ran sudo apt upgrade on my server to upgrade ElasticSearch to 7.13.4; the kernel and some other packages were upgraded at the same time. After the upgrade but before a reboot, witness_node worked fine. After the reboot, witness_node started to crash.
The pre-built witness_node binary (with different versions of libraries statically linked) crashes too, so perhaps the issue is triggered by some change in the kernel.
Bug Description
It's caused by a bug in curl: https://github.com/curl/curl/issues/3548
Update: that bug was fixed in curl 7.68.0-1ubuntu2.6, so our issue is actually different.
Update: ... GET. ... POST sent before we send the GET (see https://github.com/bitshares/bitshares-core/issues/2494).