allenai / document-qa


Converting models in TensorFlow 1.6 #33


babych-d commented 6 years ago

I'm using TensorFlow 1.6 because of a newer CUDA version. After changing a few imports the code works fine, except for transferring trained models to the CPU. There is no RNNParamsSaveable class in 1.6, so I replaced it with CudnnGRUSaveable, which takes different parameters:

```python
params_saveable = cudnn_rnn_ops.CudnnGRUSaveable(fw_params, 1, dim, 400, direction="bidirectional")
```

But I'm getting an error:

```
2018-05-22 13:45:03.151296: F tensorflow/contrib/cudnn_rnn/kernels/cudnn_rnn_ops.cc:203] Check failed: offset + size <= device_memory.size() The slice is not within the region of DeviceMemory.
```

I have a GeForce GTX 1080 Ti (12 GB), so I probably have enough memory, but I'm still doing something wrong. Does anyone know how to resolve this problem?
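For reference, here is the full wiring I'm attempting, as a sketch (`fw_params`, `dim`, and the 400-dim input size come from my model code). From what I can tell, the declared layout has to match exactly how the opaque buffer was allocated during training, so if `fw_params` is actually a forward-only (unidirectional) buffer, declaring `direction="bidirectional"` would ask the saveable to slice twice as much memory as the buffer holds, which is what the check failure reports:

```python
import tensorflow as tf
from tensorflow.contrib.cudnn_rnn.python.ops import cudnn_rnn_ops

# The declared layout must match the buffer exactly: a unidirectional
# buffer declared as "bidirectional" (or a wrong num_units/input_size)
# makes the saveable read past the end of the opaque parameter blob,
# which is what the DeviceMemory check failure reports.
params_saveable = cudnn_rnn_ops.CudnnGRUSaveable(
    fw_params,  # opaque parameter buffer of the trained CudnnGRU
    1,          # num_layers
    dim,        # num_units (hidden size)
    400,        # input_size
    direction="bidirectional")

# Register it so a plain tf.train.Saver also writes the canonical
# (CPU-loadable) weights alongside the opaque GPU buffer.
tf.add_to_collection(tf.GraphKeys.SAVEABLE_OBJECTS, params_saveable)
```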

chrisc36 commented 6 years ago

My guess is that this is due to an incompatibility with our cuDNN usage (partly my fault, the way we use cuDNN is a bit hacky).

One way you could fix it would be to take the CPU-converted model (which should be loadable in tf 1.6), and then figure out how to convert that model back to the TensorFlow 1.6 cuDNN format.
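In case it helps, the CPU side would look roughly like this. A minimal sketch, assuming a single bidirectional GRU layer with the sizes from your snippet; the variable scopes have to line up with whatever names the converted checkpoint actually contains, and the checkpoint path is hypothetical:

```python
import tensorflow as tf
from tensorflow.contrib.cudnn_rnn import CudnnCompatibleGRUCell

dim = 100  # hidden size; must match the trained model
inputs = tf.placeholder(tf.float32, [None, None, 400])  # [batch, time, input_size]

# CudnnCompatibleGRUCell uses the same math (and the same canonical
# weight layout) as the cuDNN GRU, so the converted checkpoint can be
# restored into it and run without a GPU.
cell_fw = CudnnCompatibleGRUCell(dim)
cell_bw = CudnnCompatibleGRUCell(dim)
outputs, _ = tf.nn.bidirectional_dynamic_rnn(
    cell_fw, cell_bw, inputs, dtype=tf.float32)

saver = tf.train.Saver()
with tf.Session() as sess:
    saver.restore(sess, "/path/to/converted-checkpoint")  # hypothetical path
```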

babych-d commented 6 years ago

What do you mean by converting the CPU model back to the cuDNN format? I already have models in the cuDNN format; my goal is to make them work on the CPU.

chrisc36 commented 6 years ago

I see. You could convert the model using TensorFlow 1.2, and then you can probably use it in TensorFlow 1.6.

chrisc36 commented 6 years ago

If you have a model trained in 1.6, then I am not sure what else to add. That bug looks like there was an error building the graph for the cuDNN version, but I am not sure why that would happen in the CPU convert script and not in other cases.
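One way to narrow it down would be to compare the number of elements the declared layout implies with the number actually in the trained buffer; a mismatch is exactly what trips that check. A sketch, assuming contrib's cudnn_rnn_opaque_params_size is available in 1.6 (`fw_params` and `dim` as in your snippet):

```python
import tensorflow as tf
from tensorflow.contrib.cudnn_rnn.python.ops import cudnn_rnn_ops

# Size (in elements) that the declared layout implies...
expected = cudnn_rnn_ops.cudnn_rnn_opaque_params_size(
    rnn_mode="gru", num_layers=1, num_units=dim, input_size=400,
    direction="bidirectional", dtype=tf.float32)
# ...versus the size of the buffer that was actually trained.
actual = tf.size(fw_params)

with tf.Session() as sess:
    print(sess.run([expected, actual]))  # a mismatch explains the check failure
```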

babych-d commented 6 years ago

Thank you for the answer.