cyphersnake closed this 1 month ago
I can generalize and leave this code in `main` as an example so we can measure memory when needed. Let me know if such measurements are among our target metrics.
Yes, we are interested in the memory usage of different circuits in the long term. We care about memory usage (it determines the total RAM needed) just as we care about runtime.
In this table, `{PRIMARY,SECONDARY}_CIRCUIT_K_TABLE_SIZE` is equal to `k`; otherwise the algorithms are identical: fold one step.
I used `COMMITMENT_KEY_SIZE = 27` so that I wouldn't have to change it between cases (changing it would have affected the measured size).
For k = 20, 21 the runs are still in progress (with a custom allocator, they take a long time).
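As an aside, the mechanism a byte-counting custom allocator uses can be sketched with std only. The `Counting` type below is hypothetical and only tracks bytes; dhat additionally records block counts, the t-gmax/t-end snapshots shown in the table, and per-site backtraces:

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical minimal counting allocator (illustration only).
struct Counting;

static TOTAL: AtomicUsize = AtomicUsize::new(0); // all bytes ever allocated
static LIVE: AtomicUsize = AtomicUsize::new(0);  // bytes currently live
static PEAK: AtomicUsize = AtomicUsize::new(0);  // high-water mark (analogous to t-gmax)

unsafe impl GlobalAlloc for Counting {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        TOTAL.fetch_add(layout.size(), Ordering::Relaxed);
        let live = LIVE.fetch_add(layout.size(), Ordering::Relaxed) + layout.size();
        PEAK.fetch_max(live, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        LIVE.fetch_sub(layout.size(), Ordering::Relaxed);
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static ALLOC: Counting = Counting;

fn main() {
    let v: Vec<u64> = (0..1_000).collect(); // exactly 8,000 bytes in one block
    drop(v);
    println!(
        "total={} peak={} live={}",
        TOTAL.load(Ordering::Relaxed),
        PEAK.load(Ordering::Relaxed),
        LIVE.load(Ordering::Relaxed)
    );
}
```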
| k | total_bytes | total_blocks | t_gmax_bytes | t_gmax_blocks | t_end_bytes | t_end_blocks |
| --- | --- | --- | --- | --- | --- | --- |
| 17 | 46,470,334,145 | 8,384,081 | 26,415,304,923 | 15,860 | 958,633 | 1,566 |
| 18 | 66,329,954,655 | 13,391,175 | 27,059,130,587 | 15,860 | 958,633 | 1,566 |
| 19 | 103,941,909,225 | 22,422,684 | 28,346,781,915 | 15,860 | 958,633 | 1,566 |
| 20 | 177,592,778,430 | 39,557,793 | 30,922,084,571 | 15,860 | 958,633 | 1,566 |
| 21 | 323,176,881,971 | 71,974,399 | 36,072,689,883 | 15,860 | 958,633 | 1,566 |
Our memory footprint is quite large: 46 GB to 100 GB. That differs from what people say, that folding schemes have a small memory footprint. Is it because of an inefficient implementation?
I was more concerned with showing Chao the dependence on k; it's not really representative of how much we actually consume, because I made the commitment_key large (2^27) to smooth out the difference between the different k's, and this includes computing the `digest` with the same key serialization.
So let me do what you're asking: performance measurements specifically of the folding scheme.
> Our memory footprint is quite large: 46 GB to 100 GB. That differs from what people say, that folding schemes have a small memory footprint. Is it because of an inefficient implementation?
k-table-size = 17, commitment-key = 21

Total: 19,449,044,966 bytes in 6,417,946 blocks
At t-gmax: 1,293,397,125 bytes in 28,956 blocks
At t-end: 161,193 bytes in 222 blocks
I looked at the allocation breakdown: 68% is the intermediate computation cache for `GraphEvaluator`, which evaluates the actual value of the combined gates on each row in parallel using rayon. Two identical gates are combined here, so the difference from "raw halo2" is not significant; the expressions are almost the same.
It's just reuse of infrastructure from halo2, so it's not directly related to folding itself. Even after implementing ProtoGALAXY this part won't change much, so we can think about how to rework it.
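For context, the row-parallel evaluation pattern can be sketched with std threads alone (the halo2 code uses rayon instead). `Row` and `eval_gate` below are illustrative stand-ins, not the actual `GraphEvaluator` API, and the gate is a toy expression:

```rust
use std::thread;

// Toy stand-in for a combined gate: selector * (a * b) mod p.
// The real GraphEvaluator evaluates much larger combined expressions
// and caches their intermediate values (the 68% seen in the breakdown).
#[derive(Clone, Copy)]
struct Row { a: u64, b: u64, s: u64 }

const P: u128 = 0xffff_ffff_0000_0001; // toy Goldilocks-like modulus

fn eval_gate(r: Row) -> u64 {
    ((r.s as u128 * ((r.a as u128 * r.b as u128) % P)) % P) as u64
}

// Evaluate the gate on every row, splitting rows across scoped threads;
// rayon's par_iter achieves the same with work stealing.
fn eval_all(rows: &[Row]) -> Vec<u64> {
    let n_threads = 4;
    let chunk = ((rows.len() + n_threads - 1) / n_threads).max(1);
    let mut out = vec![0u64; rows.len()];
    thread::scope(|s| {
        for (rs, os) in rows.chunks(chunk).zip(out.chunks_mut(chunk)) {
            s.spawn(move || {
                for (r, o) in rs.iter().zip(os.iter_mut()) {
                    *o = eval_gate(*r);
                }
            });
        }
    });
    out
}

fn main() {
    let rows: Vec<Row> = (0..8u64).map(|i| Row { a: i, b: i + 1, s: 1 }).collect();
    println!("{:?}", eval_all(&rows));
}
```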
dhat-heap.json (viewable with https://nnethercote.github.io/dh_view/dh_view.html)
@drouyang
Generalize in #276
Issue Link / Motivation: #272

Changes Overview: Use dhat
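For reference, wiring dhat into a binary looks roughly like this, a fragment based on the dhat crate's documented `Alloc`/`Profiler` API. It is not self-contained (it needs `dhat` as a dependency) and writes dhat-heap.json when the profiler is dropped:

```rust
// Requires the dhat crate as a dependency; sketch only, not runnable as-is.
#[global_allocator]
static ALLOC: dhat::Alloc = dhat::Alloc;

fn main() {
    // Heap profiling runs while _profiler is alive; on drop it writes
    // dhat-heap.json, which dh_view.html can open.
    let _profiler = dhat::Profiler::new_heap();

    // ... run one folding step here ...
}
```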