faster-cpython / ideas


Investigation: Why do some benchmarks have significantly increased memory usage with the new incremental GC? #668

Open mdboom opened 3 months ago

mdboom commented 3 months ago

Looking at the new incremental GC and its subsequent bugfix together, it's interesting that some benchmarks use significantly more memory. In particular, mypy2, pycparser and docutils.

Data:

![image](https://github.com/faster-cpython/ideas/assets/38294/19f35e20-fea8-428c-8ece-3556d7d17b6d)

| benchmark | base (bytes) | head (bytes) | diff (bytes) |
| --- | ---: | ---: | ---: |
| 2to3 | 23,279,323 | 23,574,625 | 295,302 |
| aiohttp | 30,047,865 | 29,992,862 | -55,003 |
| async_generators | 36,232,825 | 36,214,101 | -18,724 |
| async_tree_none | 80,446,220 | 80,228,937 | -217,283 |
| async_tree_cpu_io_mixed | 83,602,090 | 83,824,640 | 222,550 |
| async_tree_cpu_io_mixed_tg | 92,690,529 | 93,186,925 | 496,396 |
| async_tree_io | 136,706,145 | 135,888,700 | -817,445 |
| async_tree_io_tg | 145,445,644 | 146,071,747 | 626,103 |
| async_tree_memoization | 86,078,610 | 85,938,371 | -140,239 |
| async_tree_memoization_tg | 95,327,768 | 95,374,189 | 46,421 |
| async_tree_none_tg | 89,782,369 | 89,443,961 | -338,408 |
| asyncio_tcp | 50,583,454 | 51,326,390 | 742,936 |
| asyncio_tcp_ssl | 76,057,648 | 76,106,020 | 48,372 |
| asyncio_websockets | 65,070,811 | 64,648,728 | -422,083 |
| chameleon | 29,368,905 | 29,676,300 | 307,395 |
| chaos | 17,821,110 | 17,829,107 | 7,997 |
| comprehensions | 19,322,782 | 19,364,717 | 41,935 |
| bench_mp_pool | 20,654,762 | 20,830,890 | 176,128 |
| coroutines | 17,117,379 | 17,019,270 | -98,109 |
| crypto_pyaes | 17,455,981 | 17,514,691 | 58,710 |
| dask | 88,305,273 | 88,550,448 | 245,175 |
| deepcopy | 18,866,956 | 18,866,176 | -780 |
| deepcopy_reduce | 18,834,578 | 18,855,643 | 21,065 |
| deepcopy_memo | 18,891,727 | 18,836,918 | -54,809 |
| deltablue | 19,162,648 | 21,385,216 | 2,222,568 |
| django_template | 43,785,069 | 43,736,697 | -48,372 |
| djangocms | 30,294,016 | 30,260,467 | -33,549 |
| docutils | 64,943,835 | 85,793,840 | 20,850,005 |
| dulwich_log | 26,350,348 | 25,903,299 | -447,049 |
| fannkuch | 17,007,957 | 16,951,393 | -56,564 |
| float | 34,400,158 | 34,406,009 | 5,851 |
| create_gc_cycles | 17,081,685 | 17,093,973 | 12,288 |
| gc_traversal | 20,627,651 | 20,528,566 | -99,085 |
| generators | 29,539,571 | 29,482,227 | -57,344 |
| genshi_text | 20,681,094 | 21,016,576 | 335,482 |
| genshi_xml | 21,943,832 | 22,484,114 | 540,282 |
| go | 20,013,641 | 18,889,191 | -1,124,450 |
| gunicorn | 30,099,553 | 29,992,667 | -106,886 |
| hexiom | 18,173,561 | 18,145,475 | -28,086 |
| html5lib | 37,427,687 | 41,330,590 | 3,902,903 |
| json | 18,928,006 | 18,950,241 | 22,235 |
| json_dumps | 17,250,986 | 17,288,630 | 37,644 |
| json_loads | 17,275,172 | 17,305,600 | 30,428 |
| logging_format | 24,717,994 | 24,727,942 | 9,948 |
| logging_silent | 17,715,395 | 17,760,256 | 44,861 |
| logging_simple | 23,280,688 | 23,272,691 | -7,997 |
| mako | 26,486,881 | 26,608,396 | 121,515 |
| mdp | 25,583,225 | 25,669,827 | 86,602 |
| meteor_contest | 19,434,739 | 19,383,442 | -51,297 |
| mypy2 | 167,246,116 | 242,955,605 | 75,709,489 |
| nbody | 17,175,308 | 17,166,336 | -8,972 |
| nqueens | 17,238,308 | 17,207,491 | -30,817 |
| pathlib | 18,659,035 | 18,834,383 | 175,348 |
| pickle | 17,929,362 | 18,064,140 | 134,778 |
| pickle_dict | 17,877,869 | 17,895,228 | 17,359 |
| pickle_list | 17,998,799 | 17,963,885 | -34,914 |
| pickle_pure_python | 17,842,371 | 17,920,975 | 78,604 |
| pidigits | 17,161,264 | 17,136,883 | -24,381 |
| pprint_safe_repr | 37,872,591 | 37,859,718 | -12,873 |
| pprint_pformat | 37,840,798 | 37,846,454 | 5,656 |
| pycparser | 39,668,784 | 46,512,420 | 6,843,636 |
| pyflate | 60,085,199 | 59,582,366 | -502,833 |
| pylint | 92,368,310 | 95,753,752 | 3,385,442 |
| python_startup | 13,590,332 | 13,637,729 | 47,397 |
| python_startup_no_site | 13,584,871 | 13,620,175 | 35,304 |
| raytrace | 17,892,108 | 17,909,662 | 17,554 |
| regex_dna | 25,022,854 | 25,080,393 | 57,539 |
| regex_effbot | 17,865,971 | 17,899,520 | 33,549 |
| regex_v8 | 23,517,086 | 23,460,327 | -56,759 |
| richards | 17,694,720 | 17,702,912 | 8,192 |
| richards_super | 17,723,392 | 17,667,413 | -55,979 |
| scimark_fft | 18,076,428 | 18,055,168 | -21,260 |
| scimark_lu | 17,692,769 | 17,755,769 | 63,000 |
| scimark_monte_carlo | 18,058,873 | 18,048,926 | -9,947 |
| scimark_sparse_mat_mult | 18,079,353 | 18,216,667 | 137,314 |
| spectral_norm | 16,931,693 | 17,008,932 | 77,239 |
| sqlglot_normalize | 22,803,407 | 22,787,218 | -16,189 |
| sqlglot_optimize | 23,573,845 | 23,881,045 | 307,200 |
| sqlglot_parse | 22,958,665 | 25,704,350 | 2,745,685 |
| sqlglot_transpile | 23,045,461 | 24,079,993 | 1,034,532 |
| sqlite_synth | 19,792,457 | 19,782,509 | -9,948 |
| sympy_expand | 59,473,139 | 59,392,975 | -80,164 |
| sympy_integrate | 60,309,504 | 60,277,126 | -32,378 |
| sympy_sum | 72,773,632 | 72,791,771 | 18,139 |
| sympy_str | 60,266,203 | 60,242,602 | -23,601 |
| telco | 17,062,180 | 17,114,453 | 52,273 |
| thrift | 18,817,609 | 18,836,528 | 18,919 |
| tomli_loads | 180,438,162 | 177,827,254 | -2,610,908 |
| tornado_http | 41,430,649 | 41,497,356 | 66,707 |
| typing_runtime_protocols | 19,174,156 | 19,157,772 | -16,384 |
| unpack_sequence | 18,567,753 | 18,615,929 | 48,176 |
| unpickle | 18,123,239 | 18,082,474 | -40,765 |
| unpickle_list | 18,007,186 | 18,036,248 | 29,062 |
| unpickle_pure_python | 17,914,343 | 17,938,529 | 24,186 |
| xml_etree_parse | 22,025,362 | 22,069,443 | 44,081 |
| xml_etree_iterparse | 22,826,227 | 22,949,107 | 122,880 |
| xml_etree_generate | 23,017,569 | 23,610,514 | 592,945 |
| xml_etree_process | 22,720,121 | 22,722,462 | 2,341 |
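As an aside, here is a minimal sketch of how regressions like these can be picked out of the base/head columns; the 5% threshold and the handful of rows inlined below are my own choices for illustration, not part of the benchmarking tooling:

```python
# Flag benchmarks whose peak memory grew by more than a chosen threshold.
# A few (base, head) pairs in bytes, copied from the table above.
results = {
    "docutils": (64_943_835, 85_793_840),
    "mypy2": (167_246_116, 242_955_605),
    "pycparser": (39_668_784, 46_512_420),
    "tomli_loads": (180_438_162, 177_827_254),
}

THRESHOLD = 0.05  # flag anything that grew by more than 5%

for name, (base, head) in sorted(results.items()):
    growth = (head - base) / base
    if growth > THRESHOLD:
        print(f"{name}: +{growth:.1%} ({head - base:+,} bytes)")
```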

Looking at the mypy2 benchmark alone, I measured how memory usage increases with the size of the input data (basically, I feed it N copies of the input file rather than a single file).

With the legacy GC, memory usage scales roughly linearly with the size of the input. With the incremental GC, memory usage grows at a much faster rate, and appears to grow faster than linearly. The two lines are normalized, i.e. each point is (memory with N files / memory with 1 file).

*(image: normalized mypy2 memory usage vs. number of input files, incremental vs. legacy GC)*

| # files | incremental GC | legacy GC |
| ---: | ---: | ---: |
| 1 | 144 | 136 |
| 2 | 149 | 137 |
| 5 | 155 | 138 |
| 10 | 192 | 142 |
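For context, here is a minimal sketch of the kind of measurement harness described above; the input path, the mypy invocation, and the use of GNU time's peak-RSS report are my assumptions about the setup, not the actual benchmark code:

```python
# Sketch: run a workload over N concatenated copies of an input file and
# report its peak RSS, normalized to the 1-copy run.  Assumes Linux with
# GNU /usr/bin/time installed; INPUT and WORKLOAD are placeholders.
import subprocess
import tempfile

INPUT = "target.py"                  # hypothetical input file
WORKLOAD = ["python", "-m", "mypy"]  # hypothetical workload command


def peak_rss_mb(n_copies: int) -> float:
    with open(INPUT) as f:
        source = f.read()
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as tmp:
        tmp.write(source * n_copies)
        path = tmp.name
    # GNU time -v prints "Maximum resident set size (kbytes): N" to stderr.
    proc = subprocess.run(
        ["/usr/bin/time", "-v", *WORKLOAD, path],
        capture_output=True,
        text=True,
    )
    for line in proc.stderr.splitlines():
        if "Maximum resident set size" in line:
            return int(line.rsplit(":", 1)[1]) / 1024
    raise RuntimeError("peak RSS not reported")


sizes = [1, 2, 5, 10]
peaks = {n: peak_rss_mb(n) for n in sizes}
for n in sizes:
    print(f"{n} file(s): {peaks[n] / peaks[1]:.2f}x")
```

Running the same script under the two CPython builds (with and without incremental GC) and comparing the normalized ratios is the general shape of the comparison above.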
mdboom commented 3 months ago

I ran this again now that some follow-on PRs have been merged, and with more data points. Here I am comparing a recent commit, python/cpython@c1712ef, to the commit prior to the incremental GC, python/cpython@d5ebf8b71fd18d7a1f2f6b670a2c18749dc2b55e.

Now, there is almost no difference at smaller data sizes. As the data size increases, however, the legacy GC's memory usage grows at a much smaller rate than the incremental GC's. In fact, I think the "stair-stepped" appearance of the legacy GC line probably supports @markshannon's suggestion that the legacy GC did "too much" work. More importantly, though, the incremental GC's memory usage does not appear to grow at a faster-than-linear rate as the input size increases, suggesting there isn't an issue with cycles that are never found ("leaked") or anything of that sort. So this does seem like a reasonable tradeoff to make, rather than a sign of a bug causing runaway memory usage.

*(image: peak memory usage vs. data size, incremental vs. legacy GC)*

| data size | incremental GC | legacy GC |
| ---: | ---: | ---: |
| 1 | 109 MB | 109 MB |
| 2 | 118 MB | 118 MB |
| 4 | 126 MB | 125 MB |
| 8 | 187 MB | 139 MB |
| 16 | 244 MB | 193 MB |
| 24 | 335 MB | 267 MB |
| 32 | 435 MB | 270 MB |
| 48 | 619 MB | 451 MB |
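As a rough cross-check of the linearity observation, this is my own arithmetic on the numbers above (not part of the benchmark harness):

```python
# Peak memory (MB) vs. data size, copied from the table above.
sizes = [1, 2, 4, 8, 16, 24, 32, 48]
incremental = [109, 118, 126, 187, 244, 335, 435, 619]
legacy = [109, 118, 125, 139, 193, 267, 270, 451]

# MB of additional memory per unit of data size between consecutive points.
# Roughly constant increments are consistent with linear (not super-linear)
# growth; big swings between increments show up as "stair steps" in the plot.
for name, mem in (("incremental", incremental), ("legacy", legacy)):
    slopes = [
        (m1 - m0) / (s1 - s0)
        for (s0, m0), (s1, m1) in zip(zip(sizes, mem), zip(sizes[1:], mem[1:]))
    ]
    print(name, [round(s, 1) for s in slopes])
```

The incremental-GC increments settle at roughly 11-12 MB per unit of data size at the larger sizes, while the legacy-GC increments swing between roughly 0 and 11, which matches the stair-stepped appearance noted above.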