Clay-foundation / model

The Clay Foundation Model (in development)
https://clay-foundation.github.io/model/
Apache License 2.0

Memory buildup clean #158

Closed brunosan closed 5 months ago

brunosan commented 5 months ago

fixes #157.

Using malloc tracing, I found that the largest memory buildup after n epochs comes from matplotlib, which we use for the wandb plots.

Just closing the plot frees most of the buildup.

1 epoch
matplotlib/cbook.py:733: size=32.0 MiB, count=66, average=497 KiB
<frozen importlib._bootstrap_external>:729: size=2997 KiB, count=27469, average=112 B
matplotlib/transforms.py:198: size=1515 KiB, count=16624, average=93 B
matplotlib/lines.py:359: size=1336 KiB, count=1728, average=792 B

100 epochs before fix
matplotlib/cbook.py:733: size=1616 MiB, count=3333, average=497 KiB
matplotlib/transforms.py:198: size=74.7 MiB, count=839610, average=93 B
matplotlib/lines.py:359: size=65.9 MiB, count=87264, average=792 B
matplotlib/text.py:994: size=61.0 MiB, count=80800, average=792 B

100 epochs clearing "all" plots
matplotlib/cbook.py:733: size=64.0 MiB, count=132, average=497 KiB
matplotlib/transforms.py:198: size=3031 KiB, count=33252, average=93 B
<frozen importlib._bootstrap_external>:729: size=2995 KiB, count=27459, average=112 B
matplotlib/lines.py:359: size=2673 KiB, count=3456, average=792 B

100 epochs clearing fig plot
matplotlib/cbook.py:733: size=32.0 MiB, count=66, average=497 KiB
<frozen importlib._bootstrap_external>:729: size=2988 KiB, count=27324, average=112 B
matplotlib/transforms.py:198: size=1515 KiB, count=16626, average=93 B
matplotlib/lines.py:359: size=1336 KiB, count=1728, average=792 B
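
For readers who want to see the difference between leaving figures open and closing them, here is a minimal sketch. It is not the code from this PR: `make_epoch_figure` and `log_epoch_plots` are hypothetical names, the wandb call is only indicated by a comment, and the mapping of "clearing fig plot" to `plt.close(fig)` is an assumption.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt


def make_epoch_figure(values):
    """Hypothetical stand-in for the per-epoch plotting code."""
    fig, ax = plt.subplots()
    ax.plot(values)
    return fig


def log_epoch_plots(n_epochs, close_figures):
    """Create one figure per epoch and optionally close it after use."""
    for epoch in range(n_epochs):
        fig = make_epoch_figure([epoch, epoch + 1, epoch + 2])
        # A real training loop would hand `fig` to the logger here (e.g. wandb).
        if close_figures:
            plt.close(fig)  # drop this figure from pyplot's registry
    # Number of figures pyplot is still holding open.
    return len(plt.get_fignums())
```

Without the `plt.close(fig)` call, pyplot keeps a reference to every figure it created, so the count (and the memory behind it) grows with the number of epochs.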

I left the malloc code commented out to make it easier to re-enable in the future if needed.
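
The listings above have the same shape as the output of Python's standard `tracemalloc` module. Assuming that is the tool used (the PR does not name it), a measurement like this could be sketched as follows. `top_allocations` and `leaky_epochs` are hypothetical names, and the leaky workload is only a stand-in for the training loop.

```python
import tracemalloc


def top_allocations(workload, limit=4):
    """Run `workload` under tracemalloc and return its largest allocation sites."""
    tracemalloc.start()
    try:
        workload()
        snapshot = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    # Group live allocations by source line, largest first.
    return snapshot.statistics("lineno")[:limit]


leaked = []  # module-level list that keeps every buffer alive


def leaky_epochs(n_epochs=100):
    """Stand-in for a loop that accumulates memory on every epoch."""
    for _ in range(n_epochs):
        leaked.append(bytearray(100_000))


for stat in top_allocations(leaky_epochs):
    print(stat)  # one line per site: file:line, size, count, average
```

Comparing the top entries after 1 epoch and after 100 epochs, as done in this PR, shows which source lines grow with the number of epochs.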

yellowcap commented 5 months ago

Referencing also https://github.com/Clay-foundation/model/issues/121 since wandb is involved here