intel-analytics / analytics-zoo

Distributed Tensorflow, Keras and PyTorch on Apache Spark/Flink & Ray
https://analytics-zoo.readthedocs.io/
Apache License 2.0

Softmax poor performance #21

Closed emartinezs44 closed 2 years ago

emartinezs44 commented 3 years ago

The com.intel.analytics.zoo.pipeline.api.keras.layers.SoftMax layer performs poorly compared with other frameworks on the same hardware. The cost seems to come essentially from the TensorNumeric vexp operation. In my tests, with an input Tensor[Float] with dims Array(1, 12, 512, 512) and a Storage of length 3145728, it takes a mean of 0.65 s per inference. As a result, attention models that use this function in every layer to compute the attention vector are very slow. The Softmax layer in Torch is much faster on the same hardware. Any idea what is going on? How much faster is MKL-DNN?
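For readers who want a reference point for the measurement described above, here is a minimal pure-Python sketch of a numerically stable softmax over a flat, row-major buffer, together with a simple mean-time helper. It is not the analytics-zoo implementation. It assumes the softmax is applied over the last axis (as is usual for attention scores), and the names softmax_last_axis and mean_seconds are hypothetical helpers introduced only for illustration. The sketch also checks that the dims from the report multiply out to the reported storage length.

```python
import math
import time
from functools import reduce
from operator import mul


def softmax_last_axis(storage, dims):
    """Numerically stable softmax over the last axis of a flat, row-major buffer.

    Assumption: the last axis is the softmax axis. This is a hypothetical
    helper, not part of analytics-zoo or BigDL.
    """
    row_len = dims[-1]
    n_elements = reduce(mul, dims, 1)
    if len(storage) != n_elements:
        raise ValueError("storage length does not match dims")
    out = [0.0] * n_elements
    for start in range(0, n_elements, row_len):
        row = storage[start:start + row_len]
        row_max = max(row)  # subtract the row max so exp() cannot overflow
        exps = [math.exp(x - row_max) for x in row]  # one exp per element
        total = sum(exps)
        out[start:start + row_len] = [e / total for e in exps]
    return out


def mean_seconds(fn, repeats=5):
    """Mean wall-clock time of fn() over several runs (hypothetical helper)."""
    begin = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - begin) / repeats


# Shape reported in the issue: 1 * 12 * 512 * 512 elements.
issue_dims = (1, 12, 512, 512)
issue_elements = reduce(mul, issue_dims, 1)

# A much smaller shape keeps this pure-Python sketch quick to run.
small_dims = (1, 2, 4, 8)
small_input = [float(i % 7) for i in range(reduce(mul, small_dims, 1))]
small_output = softmax_last_axis(small_input, small_dims)
elapsed = mean_seconds(lambda: softmax_last_axis(small_input, small_dims))
print(f"{issue_elements} elements in the issue; small run took {elapsed:.6f} s")
```

The loop makes the cost structure visible: one exp call per element dominates, which is consistent with the report pointing at vexp. Timings from this pure-Python sketch are not comparable to either framework's native kernels.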

yangw1234 commented 3 years ago

@emartinezs44 thank you for reporting this issue. We are looking into it.

yangw1234 commented 3 years ago

@dding3 could you help take a look at this?

jason-dai commented 2 years ago

@emartinezs44 This should have been fixed by https://github.com/intel-analytics/analytics-zoo/pull/5076; you may verify the result in the latest nightly build.

emartinezs44 commented 2 years ago

Perfect. I'll check it out this evening. Thanks!

emartinezs44 commented 2 years ago

My tests show that it is ~3.4 times faster. Good work! I'm closing the issue. Thanks.