torch / torch7

http://torch.ch

Extremely slow cudnn import with cuda9 and cudnn7 on Volta #1193

Open ajhool opened 5 years ago

ajhool commented 5 years ago

This is a strange bug, but I believe I've isolated it correctly. The rest of the script executes quickly and as expected; however, importing cudnn takes 11 minutes when running the following script:

    -- mycode.lua
    print('params.backend == cudnn, require cudnn')
    require 'cudnn'
    print('cudnn required')

Other packages (like nn) only take a split second to import. I am using the cudnn torch bindings found here: https://github.com/soumith/cudnn.torch

And am building it using docker:

    FROM nvidia/cuda:9.0-cudnn7-devel-ubuntu16.04 as base
    ...
    RUN git clone https://github.com/soumith/cudnn.torch.git -b R7 && cd cudnn.torch && \
        luarocks make cudnn-scm-1.rockspec

I'm not sure how to get any more insight into what's causing the package to load slowly.
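
One way to get a first bit of insight is to time each require separately with a wall-clock timer. The following is only a minimal sketch: timed_require is a hypothetical helper name, not part of torch or cudnn.torch, and cutorch is timed on its own on the assumption that cudnn loads it as a dependency.

```lua
-- Hypothetical helper (not part of torch or cudnn.torch): times one require call.
-- os.time() measures wall-clock seconds, which is coarse but sufficient for a
-- load that takes minutes; os.clock() would report CPU time instead.
local function timed_require(name)
  local t0 = os.time()
  -- pcall keeps a missing or failing package from aborting the script
  local ok, result = pcall(require, name)
  local elapsed = os.difftime(os.time(), t0)
  print(string.format('require %q: %s after %.0f s',
                      name, ok and 'loaded' or 'failed', elapsed))
  return ok, result, elapsed
end

-- Time the packages one by one to see which import accounts for the delay.
timed_require('nn')
timed_require('cutorch')  -- assumption: cudnn depends on cutorch
timed_require('cudnn')
```

If most of the time is already spent in the cutorch line, the delay would be in GPU initialization rather than in the cudnn bindings themselves.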

ajhool commented 5 years ago

Update: cudnn attempts to configure the GPU on import, and it is not properly configuring GPUs with a Volta architecture. I'm not familiar with the torch GPU drivers. Has anybody else run into this issue?

The init code is here:

https://github.com/soumith/cudnn.torch/blob/R7/init.lua
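
To confirm which architecture each GPU reports, the compute capability can be printed from Lua. This is a rough sketch that assumes cutorch is installed; gpu_capabilities is a hypothetical helper name. Volta cards such as the V100 report compute capability 7.0.

```lua
-- Hypothetical helper: collect name and compute capability for each visible GPU.
-- Returns nil plus an error message if cutorch cannot be loaded.
local function gpu_capabilities()
  local ok, cutorch = pcall(require, 'cutorch')
  if not ok then
    return nil, cutorch  -- on failure, the second pcall result is the error
  end
  local devices = {}
  for id = 1, cutorch.getDeviceCount() do  -- cutorch device ids are 1-based
    local props = cutorch.getDeviceProperties(id)
    devices[id] = { name = props.name, major = props.major, minor = props.minor }
  end
  return devices
end

local devices, err = gpu_capabilities()
if devices then
  for id, d in ipairs(devices) do
    print(string.format('GPU %d: %s, compute capability %d.%d',
                        id, d.name, d.major, d.minor))
  end
else
  print('cutorch unavailable: ' .. tostring(err))
end
```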