mindspore-lab / mindone

one for all, Optimal generator with No Exception
https://mindspore-lab.github.io/mindone/
Apache License 2.0

Improve normalization performance by vectorization #514

Closed: ProgrammerPeter closed this 4 months ago

ProgrammerPeter commented 4 months ago

What does this PR do?

Improve the performance of data preprocessing.

Fixes # (issue) Improves normalization performance by vectorization.

Before:

normalize cost:  0.5943191051483154
normalize cost:  0.6660106182098389
normalize cost:  0.6163914203643799
normalize cost:  0.6558637619018555
normalize cost:  0.8292555809020996
normalize cost:  0.8465554714202881
normalize cost:  0.8598706722259521
normalize cost:  0.8413074016571045
normalize cost:  0.8623526096343994
normalize cost:  0.8590891361236572
normalize cost:  0.8666942119598389
normalize cost:  0.8659505844116211
normalize cost:  0.843127965927124
normalize cost:  0.898564338684082
normalize cost:  0.8245420455932617
normalize cost:  0.8166720867156982

After:

normalize cost:  0.2826392650604248
normalize cost:  0.2461528778076172
normalize cost:  0.27094507217407227
normalize cost:  0.3079872131347656
normalize cost:  0.23907780647277832
normalize cost:  0.20216584205627441
normalize cost:  0.27806663513183594
normalize cost:  0.3272981643676758
normalize cost:  0.22484040260314941
normalize cost:  0.30868077278137207
normalize cost:  0.28723835945129395
normalize cost:  0.21770191192626953
normalize cost:  0.31587648391723633
normalize cost:  0.28414392471313477
normalize cost:  0.327955961227417
normalize cost:  0.277529239654541

Adds # (feature) Nothing.
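
The diff itself isn't reproduced in this thread, so here is a purely hypothetical sketch of the kind of rewrite the title describes (replacing per-pixel Python loops with a single NumPy expression), assuming pixels are mapped from [0, 255] to [-1, 1] as in the reviewers' snippets below. normalize_loop and normalize_vectorized are illustrative names, not code from the PR:

import numpy as np

# Hypothetical example only; the actual diff is not shown in this thread.
image = np.random.randint(0, 256, size=(256, 256, 3), dtype=np.uint8)

def normalize_loop(img):
    # Loop-based normalization: slow due to Python-level iteration per pixel
    out = np.empty(img.shape, dtype=np.float32)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = img[i, j] / 127.5 - 1.0
    return out

def normalize_vectorized(img):
    # Vectorized normalization: one NumPy expression over the whole array
    return img.astype(np.float32) / 127.5 - 1.0

# Both produce the same result up to float32 rounding
assert np.allclose(normalize_loop(image), normalize_vectorized(image))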

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

@xxx

hadipash commented 4 months ago

The speed improvement comes not from vectorization but from the precision of the calculations: the original expression does all the arithmetic in float64 and only casts the result to float32 at the end, whereas casting to float32 first makes the arithmetic itself cheaper.

import timeit
import numpy as np

# Random float64 array standing in for decoded image data
pixel_values = np.random.rand(256, 256, 3)

def pix_val_1():
    # Arithmetic in float64, cast to float32 only at the end
    return (pixel_values / 127.5 - 1.0).astype(np.float32)

def pix_val_2():
    # Cast to float32 first, then do all arithmetic in float32
    return pixel_values.astype(np.float32) / 127.5 - 1.0

def pix_val_3():
    # Force float32 arithmetic explicitly via the ufunc dtype argument
    return np.subtract(np.divide(pixel_values, 127.5, dtype=np.float32), 1.0, dtype=np.float32)

n = 50000
print(timeit.timeit(pix_val_1, number=n) / n)  # average seconds per call
print(timeit.timeit(pix_val_2, number=n) / n)
print(timeit.timeit(pix_val_3, number=n) / n)

Output:

0.0008699686280000605
0.0002086409739998635
0.00022953715799987548
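
For intuition: a 256x256x3 array has 196,608 elements, so the float64 intermediates in pix_val_1 move 8 bytes per element (about 1.5 MiB per temporary) versus 4 bytes (about 0.75 MiB) for the float32 variants, and float64 also halves SIMD throughput per instruction; together this is consistent with the roughly 4x gap measured above.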
zhtmike commented 4 months ago

Nice finding!