Closed LuoGuiwen closed 2 years ago
No. You're welcome.
But on a more constructive note. Come on! It surely doesn't just say "it doesn't work" on the screen. If you want better answer, provide details :-) Language? How exactly does it fail?
Thank you for your prompt reply and please excuse me for the fuzziness. Let me try to make it more clear.
I create a file called main_test.cpp
as follow:
// Compile: g++ -std=c++17 -o main_test -g -O2 main_test.cpp libblst.a
#include "bindings/blst.h"
#include <cstdint>
#include <iostream>
#include <random>
// using namespace std;
/* Define ostreams */
std::ostream& operator<<(std::ostream& os, const blst_fp& b)
{
os << std::hex << b.l[0] << " "<< b.l[1]<< " "<< b.l[2]<< " "<< b.l[3] <<" "<< b.l[4] <<" "<< b.l[5] <<" ";
return os;
}
/* Define global variables */
const size_t N_POINTS = (size_t) (1 << 16); // I cannot test 2**19 and 2**20, it will cause segmentation fault.
const blst_p1 G1_BLST_DEFAULT_GENERATOR = *blst_p1_generator(); // Default generator in blst_p1/
blst_fr FR_ONE;
blst_fp FP_ONE;
blst_p1 G1_GENERATOR;
blst_p1_affine G1_GENERATOR_AFFINE;
void init(){
// initialize the identity, i.e., one, in fr.
uint64_t blst_fr_one_vec[] = {uint64_t(1),uint64_t(0),uint64_t(0),uint64_t(0)};
blst_fr_from_uint64(&FR_ONE, blst_fr_one_vec);
// initialize the identity, i.e., one, in fp.
uint64_t fp_one_vec[] = {uint64_t(0x1), uint64_t(0x0), uint64_t(0x0), uint64_t(0x0), uint64_t(0x0), uint64_t(0x0)};
blst_fp_from_uint64(&FP_ONE, fp_one_vec);
/*
initialize the generator in blst_p1 by Guiwen. G1_GENERATOR = 11 * 10177 * 859267 * 52437899* (Point whose x-coordinate is 4).
G1_GENERATOR =
{0x7f127a0a6f06434698a6b6598fc6d8bd7e8482362c69b416d8640c18c1caec0ab474874acad9e91be475966f7413a26,
0x5be03c7afc54a0b30376055f27a4ff60e8ca9060651b98fa6caa6937bed9116b52ad54fbc4e22cd69b8519cb9bfd662}
*/
uint64_t G1x_vec[] = {uint64_t(0xbe475966f7413a26), uint64_t(0xab474874acad9e91), uint64_t(0x6d8640c18c1caec0), uint64_t(0xd7e8482362c69b41), uint64_t(0x698a6b6598fc6d8b), uint64_t(0x07f127a0a6f06434)} ;
uint64_t G1y_vec[] = {uint64_t(0x69b8519cb9bfd662), uint64_t(0xb52ad54fbc4e22cd), uint64_t(0xa6caa6937bed9116), uint64_t(0x0e8ca9060651b98f), uint64_t(0x30376055f27a4ff6), uint64_t(0x05be03c7afc54a0b)} ;
blst_fp G1x, G1y;
blst_fp_from_uint64(&G1x, G1x_vec);
blst_fp_from_uint64(&G1y, G1y_vec);
G1_GENERATOR = {G1x, G1y, FP_ONE}; // integer a -> a << 384 mod p in montgomery form
G1_GENERATOR_AFFINE = {G1x, G1y};
std::cout << "Check G1_generator is in G1: " << blst_p1_in_g1(&G1_GENERATOR) <<std::endl;
}
/* */
blst_scalar random_blst_scalar(){
// credit to url: https://stackoverflow.com/questions/19665818/generate-random-numbers-using-c11-random-library
std::random_device rd;
std::mt19937_64 mt(rd());
std::uniform_real_distribution<uint64_t> dist(1, (uint64_t) 0xffffffffffffffff);
uint64_t scalar_vec[4];
scalar_vec[0] = dist(mt);
scalar_vec[1] = dist(mt);
scalar_vec[2] = dist(mt);
scalar_vec[3] = dist(mt);
blst_scalar scalar;
blst_scalar_from_uint64( &scalar, scalar_vec);
return scalar;
}
void test(){
size_t nbits = 256;
// std::cout << "scratch size: "<< blst_p1s_mult_pippenger_scratch_sizeof((1<< 19))/sizeof(limb_t) << std::endl;
limb_t scratch[blst_p1s_mult_pippenger_scratch_sizeof(N_POINTS)/sizeof(limb_t)];
uint8_t* scalars[N_POINTS];
for(size_t i = 0; i < N_POINTS; ++i){
scalars[i] = random_blst_scalar().b;
}
blst_p1_affine* points[N_POINTS];
blst_p1 tmp_P, tmp2_P = G1_GENERATOR;
blst_p1_affine tmp_P_affine;
for(size_t i = 0; i < N_POINTS; ++i){
blst_p1_double(&tmp_P, &tmp2_P);
tmp2_P = tmp_P;
blst_p1_to_affine(&tmp_P_affine, &tmp_P);
points[i] = &tmp_P_affine;
}
blst_p1 ret_P; // Mont coordinates
blst_p1_affine ret_P_affine;
/*blst pippenger*/
std::cout<< "blst pippenger test: " << std::endl;
size_t TEST_NUM = 1;
auto st = std::chrono::steady_clock::now();
for(size_t i = 0; i< TEST_NUM; ++i)
blst_p1s_mult_pippenger(&ret_P, points, N_POINTS, scalars, nbits, scratch);
auto ed = std::chrono::steady_clock::now();
std::chrono::microseconds diff = std::chrono::duration_cast<std::chrono::microseconds>(ed -st);
std::cout << "blst pippenger Wall clock time elapse is: " << std::dec << diff.count()/TEST_NUM << " us "<< std::endl;
blst_p1_to_affine(&ret_P_affine, &ret_P);
std::cout << ret_P_affine.x <<std::endl;
std::cout << ret_P_affine.y <<std::endl;
std::cout << "TEST END" <<std::endl;
}
int main(){
init();
test();
return 0;
}
on the 2018 MacBook Pro with Mac OS system, 16 GB RAM,2.2 GHz 6-Core Intel Core i7, I compile it using
g++ -std=c++17 -o main_test -g -O2 main_test.cpp libblst.a
.
When const size_t N_POINTS = (size_t) (1 << 16)
, ./main_test
runs successfully.
When const size_t N_POINTS = (size_t) (1 << 19)
, ./main_test
says segmentation fault
.
My guess is that the fault is related to scratch
variable, which I didn't understand its functionality. Maybe my laptop RAM is too small?
Can you please explain me in fucntion blst_p1s_mult_pippenger(&ret_P, points, N_POINTS, scalars, nbits, scratch)
, what the nbits
and scratch
are? and why my code results segmentation fault
?
Thanks a lot, Guiwen
C++, no scratch, segmentation fault.
If you don't allocate the scratch area, it will use the stack as scratch. The stack is limited in size, customarily 8MB, and attempt to use more than that is bound to trigger segmentation violation. So no mystery that ~it~ your application fails for larger amount of points. Just allocate the scratch.
nbits
is the number of bits in scalars. Point is that there is application for shorter scalars, and providing a way to specify the scalars' width allows applications to minimize the memory footprint. If you work with full-width fully-reduced scalar values, pass the default 255.
I did allocate the scratch area, as you would see the 3rd line of test()
function in my code,
limb_t scratch[blst_p1s_mult_pippenger_scratch_sizeof(N_POINTS)/sizeof(limb_t)];
.
The only difference is I assigned nbits = 256
. I just tried nbits = 255
, It showed segmentation fault
. I also tried to allocate the scratch area using new
as you suggested, it still showed segmentation fault
.
By the way, when I compiled the code, I ignored the warning, which says
ld: warning: could not create compact unwind for _blst_sha256_block_data_order: does not use RBP or RSP based frame
. Is this a problem?
Thanks.
You allocate the scratch area on the stack and it's essentially the same as not allocating one. Sorry about the confusion. The scratch area should be allocated from the pool, where there is no implied limit. I mean at least not as low as 8MB.
No, the warning is not a problem.
Thank you. I kind of understand it.
Could you please let me know how I can allocate memory from the main pool rather than stack? I tried to define scratch
as a global variable and use operator new
to allocate memory for it, but this method doesn't work.
I'm sorry, but no. This is a generic textbook question, and this forum is not a classroom. Besides, a reference to a working code was provided.
Thank you so much dot-asm. I finally managed to run the code by unleashing the stack restriction with ulimit -s unlimited
in terminal.
Thank you so much dot-asm. I finally managed to run the code by unleashing the stack restriction with
ulimit -s unlimited
in terminal.
Respectfully, you need to review variable declaration stack and heap allocation in C and C++.
You are going to trigger significant issues with ulimit
, first of all, the fact that your application will not work on Windows or when you do not have the admin rights to change those critical safety features. And second, the fact that people can now DOS your machine with recursive function calls.
I did allocate the scratch area, as you would see the 3rd line of test() function in my code
No, you are declaring the array on the stack, review carefully the working code given.
I tried to define scratch as a global variable and use operator new to allocate memory for it, but this method doesn't work.
Either you use new
or malloc
/free
. Please ask your advisor for material on this basic C/C++ programming question.
I'm trying to compute multi-scalar multiplication using
blst_p1s_mult_pippenger
function. when I trynpoints = (size_t) 1 << 18
, it works smoothly. Butnpoints = (size_t) 1 << 19
and larger does not work. Is there any restriction to the maximum number of points supported?Thanks.