pytorch / benchmark

TorchBench is a collection of open source benchmarks used to evaluate PyTorch performance.
BSD 3-Clause "New" or "Revised" License

SGD foreach with momentum x PT2 regression #1665

Open github-actions[bot] opened 1 year ago

github-actions[bot] commented 1 year ago

TorchBench CI has detected a performance signal or runtime regression.

Base PyTorch commit: 174d01bc939c7bdf390113c75d5ec2ce84cfa1d2

Affected PyTorch commit: 329bb2a33e40f4bc76b2e061b180d3234984c91b

Affected Tests:

Tests that were no longer run on affected commit:

Tests that were newly added on affected commit:

Runtime regressions found? No runtime errors were found in the new benchmarks run--you are all good there!

GitHub workflow that triggered this issue: https://github.com/pytorch/benchmark/actions/runs/5020694796

cc @janeyx99

janeyx99 commented 1 year ago

Total: 361
speedup: 96 | slowdown: 265
(pt2): 311 | eager: 50
<20%: 291 | >=20%: 70
SGD: 159 ... SGD with momentum: 152
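As a quick sanity check on the summary counts above, each two-way breakdown (speedup vs. slowdown, pt2 vs. eager, under vs. at least 20%) should add up to the total of 361. The sketch below only re-adds the numbers quoted in the comment; the dictionary layout and the `check_partitions` helper are hypothetical and not part of TorchBench. The per-optimizer line is left out because it is elided in the original.

```python
# Counts copied from the summary comment; the grouping into
# "direction" / "mode" / "magnitude" is an assumption made for illustration.
summary = {
    "total": 361,
    "direction": {"speedup": 96, "slowdown": 265},
    "mode": {"pt2": 311, "eager": 50},
    "magnitude": {"<20%": 291, ">=20%": 70},
}


def check_partitions(summary):
    """Report, per breakdown, whether its buckets add up to the total."""
    total = summary["total"]
    return {
        name: sum(buckets.values()) == total
        for name, buckets in summary.items()
        if isinstance(buckets, dict)
    }


result = check_partitions(summary)
print(result)
```

All three breakdowns are consistent with the total, so the counts read as three independent ways of slicing the same 361 measurements.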

doctr_reco_predictor, SGD, cuda, (pt2) foreach, momentum=0.9: +204.08757%
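To put that percentage in perspective, here is a minimal sketch that converts a relative change into a new-to-base ratio. It assumes the reported figure is the relative change of the measured metric against the base commit; the comment does not say which metric it is, and `slowdown_ratio` is a hypothetical helper, not a TorchBench function.

```python
def slowdown_ratio(percent_change: float) -> float:
    """Convert a relative change in percent into a new/base ratio.

    Assumes percent_change = (new - base) / base * 100.
    """
    return 1.0 + percent_change / 100.0


# Value quoted for doctr_reco_predictor, SGD, cuda, (pt2) foreach, momentum=0.9.
ratio = slowdown_ratio(204.08757)
print(f"{ratio:.2f}x")
```

Under that assumption, +204.08757% means the affected commit measures roughly three times the base value for this configuration.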

There seem to be significant slowdowns for SGD in general, especially with momentum.

janeyx99 commented 1 year ago

@mlazos was looking into this

janeyx99 commented 1 year ago

Didn't mean to close

janeyx99 commented 1 year ago

The possible commits that went into torch between the two commits are: [image] and [image]

Not sure which one could have affected the momentum buffers in SGD.
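For readers unfamiliar with the momentum buffers mentioned here, the sketch below shows the plain SGD-with-momentum update in pure Python, assuming no dampening, no Nesterov and no weight decay. It is only an illustration of the per-parameter state involved, not the foreach or PT2-compiled implementation under discussion; `sgd_momentum_step` and the `lr` value are hypothetical choices for the example.

```python
def sgd_momentum_step(params, grads, bufs, lr=0.1, momentum=0.9):
    """One SGD step with momentum over lists of scalar parameters.

    Each parameter keeps a momentum buffer: it starts as the first
    gradient and afterwards accumulates momentum * buf + grad.
    """
    new_params, new_bufs = [], []
    for p, g, b in zip(params, grads, bufs):
        b = g if b is None else momentum * b + g
        new_bufs.append(b)
        new_params.append(p - lr * b)
    return new_params, new_bufs


# Illustrative values only: two parameters, constant gradients, two steps.
params, bufs = [1.0, -2.0], [None, None]
grads = [0.5, 0.5]
params, bufs = sgd_momentum_step(params, grads, bufs)
params, bufs = sgd_momentum_step(params, grads, bufs)
print(params, bufs)
```

With momentum enabled, every step reads and writes one extra buffer per parameter, which is the state this comment is asking about.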

janeyx99 commented 1 year ago

From playing with Scuba, I've isolated that the regression is only for