Open · shunr opened 7 months ago
Hi,
I'm currently working on torch code that should run on both CPU and CUDA backends. Specifically, I'm using bfloat16, which requires the MKLDNN backend to be performant on CPU. Is it possible to ship the CUDA build with MKLDNN turned on?

https://github.com/conda-forge/pytorch-cpu-feedstock/blob/9e99e0322b67ac080cb315cd1bb2b9b9e6bb9f15/recipe/build.sh#L121-L123

The comment above seems to suggest that USE_MKLDNN=1 should in theory be usable with CUDA, but a compilation error led to a temporary workaround.
Thanks.
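For context, a minimal sketch of how to check whether a given PyTorch build was compiled with MKLDNN (oneDNN) support, and of the bfloat16 CPU workload affected; the tensor shapes are arbitrary and purely illustrative:

```python
import torch

# Report whether this build of PyTorch has MKLDNN (oneDNN) enabled.
# Builds without it still run bfloat16 on CPU, just much more slowly.
print("MKLDNN available:", torch.backends.mkldnn.is_available())

# A bfloat16 matmul on CPU, the kind of op that benefits from MKLDNN.
a = torch.randn(64, 64, dtype=torch.bfloat16)
b = torch.randn(64, 64, dtype=torch.bfloat16)
c = a @ b
print("result dtype:", c.dtype)
```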