apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0

Feature request: Enable or Disable MKL-DNN in MXNet via environment variable at module load time #11977

Open bhavinthaker opened 6 years ago

bhavinthaker commented 6 years ago

Feature request: Enable or Disable MKL-DNN in MXNet via environment variable at module load time

It would be helpful to have the capability to enable or disable MKL-DNN in MXNet via an environment variable at module load time so that customers can test if an operator's functional behavior or performance is due to the implementation in MKL-DNN or not. When the environment variable is off, the operator implemented in MXNet would be used and when the environment variable is on, the operator implemented in MKL-DNN would be used.

This request was reviewed by Da Zheng, one of the MXNet contributors who worked on MKL-DNN support in MXNet, and he agreed that it would be a useful feature.

As suggested by srochel below, MKL-DNN should be enabled by default.
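A minimal sketch of the requested behavior, in Python for illustration only. It assumes a single on/off variable that is read once when the module is imported and that defaults to enabled. The variable name `MXNET_MKLDNN_ENABLED` and the helpers `read_switch` and `conv_backend` are hypothetical and are not taken from this thread.

```python
import os

# Hypothetical variable name, used only for illustration.
ENV_VAR = "MXNET_MKLDNN_ENABLED"

_FALSE_VALUES = {"0", "false", "off", "no"}


def read_switch(name, default=True, environ=None):
    """Return the boolean value of an on/off environment variable."""
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None or not raw.strip():
        return default  # unset or empty: keep the default (enabled)
    return raw.strip().lower() not in _FALSE_VALUES


# Evaluated once, when this module is imported ("module load time").
MKLDNN_ENABLED = read_switch(ENV_VAR)


def conv_backend(enabled=MKLDNN_ENABLED):
    """Name the implementation a convolution call would be routed to."""
    return "mkldnn" if enabled else "native"
```

With a switch like this, the same script could be run twice, once with the variable unset and once with it set to `0`, and the two results or timings compared.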

srochel commented 6 years ago

Please clarify default behavior - I would expect that MKL-DNN should be enabled by default.

marcoabreu commented 6 years ago

I think the idea is basically to have multiple versions of an operator available and to be able to select the desired version at runtime.

The default behavior could be an auto-tuning stage that determines the performance of each version. It would then choose the best-performing graph without requiring everything to be recompiled.
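The auto-tuning idea can be sketched roughly as follows: run every available version of an operator on the same inputs, time each one, and keep the fastest. This is an illustrative Python sketch, not MXNet code. `pick_fastest` and the two stand-in operator versions are hypothetical, and a real tuner would also have to account for input shapes, warm-up cost, and measurement noise.

```python
import time


def pick_fastest(candidates, args, repeats=3, timer=time.perf_counter):
    """Time each candidate on the same inputs and return (winner, timings).

    candidates maps an implementation name to a callable.
    """
    timings = {}
    for name, fn in candidates.items():
        fn(*args)  # warm-up run, not timed
        best = float("inf")
        for _ in range(repeats):
            start = timer()
            fn(*args)
            best = min(best, timer() - start)
        timings[name] = best  # best-of-N reduces timing noise
    winner = min(timings, key=timings.get)
    return winner, timings


# Two stand-in versions of the same operator (element-wise square).
def square_loop(xs):
    out = []
    for x in xs:
        out.append(x * x)
    return out


def square_slow(xs):
    time.sleep(0.02)  # artificial delay so the outcome is predictable
    return [x * x for x in xs]


winner, timings = pick_fastest(
    {"loop": square_loop, "slow": square_slow}, args=(list(range(100)),)
)
```

Selecting per operator in this way avoids rebuilding the library, which is the point made above about not having to recompile.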

pengzhao-intel commented 6 years ago

@bhavinthaker This is a good idea for performance optimization.

As @srochel suggests, MKL-DNN should be enabled by default, and you can switch it off to see the performance change. However, switching all of MKL-DNN on or off may not differ much from comparing binaries built with and without MKL-DNN.

One possible solution is to provide finer-grained control (or auto-tuning), such as an environment variable MXNET_MKLDNN_CONV_OFF=1, to turn each MKL-DNN OP on or off at runtime. I think this aligns with @marcoabreu's idea.
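A rough Python sketch of such per-operator switches. Only `MXNET_MKLDNN_CONV_OFF` is named above; extending it to a `MXNET_MKLDNN_<OP>_OFF` pattern for other operators is an assumption, and the helpers `mkldnn_op_enabled` and `enabled_ops` are hypothetical.

```python
import os


def mkldnn_op_enabled(op, environ=None):
    """Return False when MXNET_MKLDNN_<OP>_OFF is set to 1 for this operator.

    The <OP> naming pattern is assumed; only the CONV variant is named above.
    """
    env = os.environ if environ is None else environ
    return env.get(f"MXNET_MKLDNN_{op.upper()}_OFF", "0").strip() != "1"


def enabled_ops(ops, environ=None):
    """Filter a list of operator names down to those still using MKL-DNN."""
    return [op for op in ops if mkldnn_op_enabled(op, environ)]
```

For example, with `MXNET_MKLDNN_CONV_OFF=1` set, only convolution would fall back to the original implementation while the other operators keep using MKL-DNN.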

In practice, the new feature will introduce a lot of changes, and we will need more unit and coverage tests. We need to discuss a detailed plan.

@zheng-da Is there any impact on the subgraph if we change the OP at runtime?

@TaoLv @zhennanqin

TaoLv commented 6 years ago

Good idea! It seems we can use environment variables to make one or all MKL-DNN operators fall back to the original MXNet implementation. Could this be done simply by changing the SupportXXX function of each MKL-DNN operator? I think this would help with functionality or performance debugging. Performance auto-tuning is a separate topic, though, and needs a higher-level solution.

ZhennanQin commented 6 years ago

Yeah, this can be done by changing the SupportMKLDNN function, and it can also be done in FInferStorageType.
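The fallback mechanism discussed here can be modeled in a few lines. This is a Python stand-in, not MXNet's C++ code: `support_mkldnn` only imitates the role of a SupportXXX-style check, the `float32` capability rule is an assumption made for illustration, and `dispatch` is a hypothetical name.

```python
def support_mkldnn(op, dtype, disabled_ops):
    """Stand-in for a SupportXXX-style check: user switch plus capability test."""
    if op in disabled_ops:
        return False  # user asked for the original implementation
    return dtype == "float32"  # assumed capability rule, illustration only


def dispatch(op, dtype, disabled_ops=frozenset()):
    """Route to the MKL-DNN version when supported, otherwise fall back."""
    if support_mkldnn(op, dtype, disabled_ops):
        return f"{op}:mkldnn"
    return f"{op}:native"
```

Because the check already decides whether the MKL-DNN path is taken, adding the user switch there reuses the existing fallback path rather than introducing a second one.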

Roshrini commented 6 years ago

@sandeep-krishnamurthy Can you please add the label FeatureRequest?

lupesko commented 5 years ago

This feature has been implemented, see here. @bhavinthaker, kindly close the issue, or comment if it does not address your suggestion.