Closed by ed1d1a8d 1 year ago
| Name | Link |
|---|---|
| Latest commit | 27d0fc4a23c4eebf7ee073906e0d0586ff21d064 |
| Latest deploy log | https://app.netlify.com/sites/goattack/deploys/63b758f2abd5af0009923293 |
Thanks for working on this Tony, great to have a more rigorous eval of how much compute we used! Left a number of minor comments / clarifying questions. Please request re-review once addressed.
LGTM apart from the question about ratio of rows:moves not being 2, and the # of visits used during training. Both of those could change the numbers to a non-trivial degree, but if it looks like resolving them will end up being more of a research problem we might want to merge this PR early and just open another ticket to address it.
So I talked with lightvector, and he suggested a number of improvements to our estimation procedure. In particular, he recommended we benchmark the actual selfplay/victimplay process to determine the ratio between data rows and the total number of moves played (this might explain why our estimate of the ratio varies from 1.48 to 2.22).
Doing the improved estimation factoring in lightvector's suggestions will be fairly involved, so I made a new issue for it here: https://github.com/HumanCompatibleAI/KataGoVisualizer/issues/41. I agree that we can merge this PR and address this issue later.
Fixed all the other small things and re-requesting review.
The number of FLOPs used by KataGo is estimated in `notebooks/iclr2022/estimate-flops-katago.ipynb`. We perform the estimate using the `ptflops` and `thop` libraries, applied to the newly written PyTorch version of KataGo (added as a submodule).

The number of FLOPs used to train our adversary is estimated in `notebooks/iclr2022/estimate-flops-adv.ipynb`. We perform this estimate by assuming that selfplay/victimplay dominates the total compute.
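The estimate above reduces to simple arithmetic once the per-forward-pass FLOP count is known (e.g. from `thop` or `ptflops`). Here is a minimal sketch of that arithmetic; every concrete number below is a hypothetical placeholder, not a value from the notebooks, and `estimate_selfplay_flops` is an illustrative helper, not code from this PR:

```python
def estimate_selfplay_flops(num_rows, rows_per_move, visits_per_move,
                            flops_per_eval):
    """Rough victimplay FLOP estimate, assuming network evaluations
    performed during search dominate total compute."""
    num_moves = num_rows / rows_per_move     # convert data rows to moves played
    num_evals = num_moves * visits_per_move  # one network evaluation per visit
    return num_evals * flops_per_eval


# Example with placeholder values:
flops = estimate_selfplay_flops(
    num_rows=1e8,         # training data rows written (hypothetical)
    rows_per_move=2.0,    # the rows:moves ratio under discussion (1.48-2.22)
    visits_per_move=600,  # MCTS visits per move (hypothetical)
    flops_per_eval=1e9,   # FLOPs per forward pass, e.g. from thop (hypothetical)
)
print(f"{flops:.3e}")  # -> 3.000e+19
```

Note that the estimate scales inversely with the rows:moves ratio, which is why pinning down whether it is 1.48, 2, or 2.22 changes the final numbers to a non-trivial degree.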
Updated compute estimates: