herumi / bls

288 stars 132 forks source link

Segmentation fault occurs when mclBnG1 instance is created during static initialization phase #98

Closed gogoex closed 1 year ago

gogoex commented 1 year ago

When mclBnG1 instance is created as (or under) a static variable, mcl::FpT<mcl::bn::local::FpTag, 384ul>::clear is called during the static initialization phase the main function is called. Then that results in a segmentation fault.

From experiments, I found in non-static context, calling blsInit before instantiating mclBnG1 instance solves the problem. But I haven't been able to find a way to do it in static context.

To build the library, I'm using:

make -C mcl lib/libmcl.a
make BLS_ETH=1 lib/libbls384_256.a

and the library is used this way:

#define BLS_ETH 1
#include <bls/bls384_256.h>

GDB back trace:

(gdb) run
Starting program: /home/../repos/mcl-bls-sample/test
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
b
Program received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x000055555555b42a in mcl::FpT<mcl::bn::local::FpTag, 384ul>::clear (
    this=0x55555561a1c0 <A::p>)
    at /home/../repos/mcl-bls-sample/src/bls/mcl/include/mcl/fp.hpp:254
#2  mcl::EcT<mcl::FpT<mcl::bn::local::FpTag, 384ul> >::clear (this=0x55555561a1c0 <A::p>)
    at /home/../repos/mcl-bls-sample/src/bls/mcl/include/mcl/ec.hpp:1305
#3  mclBnG1_clear (x=0x55555561a1c0 <A::p>)
    at /home/../repos/mcl-bls-sample/src/bls/mcl/include/mcl/impl/bn_c_impl.hpp:370
#4  0x0000555555558e4e in MclWrapper::MclWrapper (this=0x55555561a1c0 <A::p>)
    at src/mcl_wrapper.cpp:7
#5  0x0000555555558e15 in __static_initialization_and_destruction_0 (__initialize_p=1,
    __priority=65535) at src/main.cpp:7
#6  0x0000555555558e2f in _GLOBAL__sub_I__ZN1A1pE () at src/main.cpp:12
#7  0x000055555560339d in __libc_csu_init ()
#8  0x00007ffff7bbf010 in __libc_start_main (main=0x555555558dd9 <main()>, argc=1,
    argv=0x7fffffffdc08, init=0x555555603350 <__libc_csu_init>, fini=<optimized out>,
    rtld_fini=<optimized out>, stack_end=0x7fffffffdbf8) at ../csu/libc-start.c:264
#9  0x0000555555558d1e in _start ()
    at /home/../repos/mcl-bls-sample/src/bls/mcl/include/mcl/fp.hpp:244

The environment:

I prepared a minimum example that can reproduce the problem: https://github.com/gogoex/mcl-bls-sample

herumi commented 1 year ago

If you want to call mclBnG1_clear for a static variable, then it is not necessary because mclBnG1 is an array of uint64_t and it will be filled with zero.

herumi commented 1 year ago

I prepared a minimum example that can reproduce the problem: https://github.com/gogoex/mcl-bls-sample

That sample does not call mclBnG1_clear, so the problem does not reproduce. If there is one file to have static variables, how about calling blsInit in the other static variable?

#include "./mcl_wrapper.h"

struct Init {
    Init()
    {
        int ret = blsInit(MCL_BLS12_381, MCLBN_COMPILED_TIME_VAR);
        printf("ret=%d\n", ret);
    }
} s_init; // the constructor will be called at first before the other static variables.

mclBnG1 MclWrapper::m_p;
gogoex commented 1 year ago

If you want to call mclBnG1_clear for a static variable, then it is not necessary because mclBnG1 is an array of uint64_t and it will be filled with zero.

The problem is that even if we don't call any Mcl function, if mclBnG1 variable is declared as a static variable, mcl::FpT<mcl::bn::local::FpTag, 384ul>::clear is called while static variables are instantiated.

gogoex commented 1 year ago

I prepared a minimum example that can reproduce the problem: https://github.com/gogoex/mcl-bls-sample

That sample does not call mclBnG1_clear, so the problem does not reproduce. If there is one file to have static variables, how about calling blsInit in the other static variable?

#include "./mcl_wrapper.h"

struct Init {
    Init()
    {
        int ret = blsInit(MCL_BLS12_381, MCLBN_COMPILED_TIME_VAR);
        printf("ret=%d\n", ret);
    }
} s_init; // the constructor will be called at first before the other static variables.

mclBnG1 MclWrapper::m_p;

Thanks. We have actually tried the idea in our actual codebase, and that solved the problem. But for some reason that caused memory leaks and made address sanitizer checks fail. I will try to reproduce the situation using the minimum example.

herumi commented 1 year ago

The problem is that even if we don't call any Mcl function, if mclBnG1 variable is declared as a static variable, mcl::FpT<mcl::bn::local::FpTag, 384ul>::clear is called while static variables are instantiated.

It seems strange. I tested your code, but it does not show any error.

diff --git a/src/main.cpp b/src/main.cpp
index 03e4e60..ed87c57 100644
--- a/src/main.cpp
+++ b/src/main.cpp
@@ -1,6 +1,8 @@
 #include "./mcl_wrapper.h"
+#include <stdio.h>

 int main() {
     MclWrapper x;
+ puts("ok");
     return 0;
g++ -g -O2 -Wall -I/usr/local/include -Isrc/bls/include -Isrc/bls/mcl/include -I/usr/local/lib -I/usr/local/include -I/usr/local/include -c main.cpp -o main.o
main.cpp: In function 'int main()':
main.cpp:5:16: warning: unused variable 'x' [-Wunused-variable]
    5 |     MclWrapper x;
      |                ^
mcl-bls-sample% ./test
ok

Ubuntu 22.04.2 LTS + g++-11.3.0 + Intel CPU with -O0 and -O2 option.

herumi commented 1 year ago

Could you try libbls384_256.a in https://github.com/herumi/bls-eth-go-binary/tree/master/bls/lib/linux/amd64 ? This is a compiled library with BLS_ETH=1, which contains libmcl.a, so you don't need link libmcl.a.

gogoex commented 1 year ago

Thank you. I tried but I'm still seeing the seg fault. My environment is Ubuntu 20.04 LTS + g++-9.4.0 + AMD Ryzen 9 with -O0 option. Do you have any idea?

g++ main.o mcl_wrapper.o herumi/libbls384_256.a src/bls/mcl/lib/libmcl.a -lpthread -o test
ryzen (gogoex: main): ~/repos/mcl-bls-sample
$ make run
./test
make: *** [Makefile:38: run] Segmentation fault (core dumped)
ryzen (gogoex: main): ~/repos/mcl-bls-sample
$ g++ --version
g++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
herumi commented 1 year ago

g++ main.o mcl_wrapper.o herumi/libbls384_256.a src/bls/mcl/lib/libmcl.a -lpthread -o test

src/bls/mcl/lib/libmcl.a is not necessary for https://github.com/herumi/bls-eth-go-binary/blob/master/bls/lib/linux/amd64/libbls384_256.a .

Can you make the binary by g++ main.o main.o mcl_wrapper.o -o test? main.cpp and mcl_wrapper.cpp do not call any mcl functions, so I can build test without linking mcl library.

gogoex commented 1 year ago

Sorry. I realized that my make clean was not deleting object files and because of that, an older version of mcl_wrapper.o that calls mclBnG1_clear in the constructor had been used. Now I don't see segmentation fault either.

Following your first comment, I will try to remove mclBnG1_clear call from a class constructor that is instantiated with a static variable.

gogoex commented 1 year ago

I could confirm that not calling mclBnG1_clear resolves the segfault problem. Thank you for your help. I will close this issue.

But for some reason, I observe that the initial values change when we try with our codebase. I will create a minimum code to confirm it and if needed, I will create a issue.