The cudnn and mkl backends can now be used at the same time with -backend cudnn,mkl or -backend mkl,cudnn (order doesn't matter). This can be useful for speeding up multi-device usage that includes both a CPU and a GPU.
If the CPU is not selected with the -gpu parameter, then CPU backends will be ignored.
Added support for the OpenMP backend, via -backend openmp.
Added initial support for MKL-DNN, so that it can be enabled easily once it becomes available in the next PyTorch release.
Better ModelParallel code.
Improved multi-device error messages.
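As an illustrative sketch of the backend-selection rules described above (the `parse_backends` helper and the `"c"` token for selecting the CPU are assumptions for illustration, not part of the release):

```python
def parse_backends(backend_spec, gpu_spec):
    """Hypothetical sketch: resolve the -backend value against the -gpu value.

    backend_spec: comma-separated backend list, e.g. "cudnn,mkl"
    gpu_spec: comma-separated device list, e.g. "0,c" (assumption: 'c' = CPU)
    """
    backends = set(backend_spec.split(","))  # order does not matter
    devices = gpu_spec.split(",")
    if "c" not in devices:
        # CPU backends are ignored when the CPU is not selected with -gpu
        backends -= {"mkl", "openmp", "mkldnn"}
    return backends

# Order-insensitive: cudnn,mkl and mkl,cudnn resolve identically.
parse_backends("cudnn,mkl", "0,c")  # both backends active
parse_backends("cudnn,mkl", "0")    # mkl dropped: CPU not selected
```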