Closed wenlihaoyu closed 6 years ago
@wenlihaoyu I checked out your code and it works well on Keras 2.2 + TF 1.9. Can you verify it on the latest TF (1.9 or 2.0)?
Thank you! I tested it on Keras 2.0.8-2.2 + TF 1.3-1.6 + CUDA 8.0. Because this project is tied to other applications, it cannot be upgraded to the latest version.
I defined an OCR model. A simplified version, "model.py", looks like this:
from keras.layers import Input, Conv2D, MaxPooling2D
from keras.layers import Flatten, Permute, TimeDistributed, Dense
from keras.models import Model

def get_model(height, nclass):
    rnnunit = 256
    inputs = Input(shape=(height, None, 1), name='the_input')
    m = Conv2D(64, kernel_size=(3, 3), activation='relu', padding='same', name='conv1')(inputs)
    m = MaxPooling2D(pool_size=(2, 2), strides=(2, 2), name='pool1')(m)
    m = Conv2D(128, kernel_size=(3, 3), activation='relu', padding='same', name='conv2')(m)
    m = MaxPooling2D(pool_size=(2, 2), strides=(2, 2), name='pool2')(m)
    m = Permute((2, 1, 3), name='permute')(m)
    m = TimeDistributed(Flatten(), name='timedistrib')(m)
    y_pred = Dense(nclass, name='out', activation='softmax')(m)
    basemodel = Model(inputs=inputs, outputs=y_pred)
    return basemodel
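To see how the input width maps to the number of time steps that reach the `out` layer, the shapes can be traced by hand. The sketch below does that arithmetic in plain Python, without Keras. The helper name `trace_shapes` and its `conv2_filters` parameter are hypothetical, introduced only for illustration. It assumes the default `'valid'` padding of `MaxPooling2D`, under which each 2x2 pool with stride 2 floors the halving.

```python
def trace_shapes(height, width, nclass, conv2_filters=128):
    """Follow the (height, width, channels) shape through get_model.

    The batch axis is omitted. conv2_filters mirrors the 128 filters of 'conv2'.
    """
    # Conv2D with padding='same' keeps height and width; only channels change.
    # Each 2x2 max pool with stride 2 floors the halving of height and width.
    h, w = height // 2, width // 2   # after pool1
    h, w = h // 2, w // 2            # after pool2
    # Permute((2, 1, 3)) swaps height and width, so width becomes the time axis.
    timesteps = w
    # TimeDistributed(Flatten()) merges (h, channels) into one vector per step.
    features = h * conv2_filters
    return {
        "pool2": (h, w, conv2_filters),
        "permute": (w, h, conv2_filters),
        "timedistrib": (timesteps, features),
        "out": (timesteps, nclass),
    }


if __name__ == "__main__":
    for width in (4, 7, 8, 99):
        print(width, trace_shapes(32, width, 5000)["out"])
```

With `height=32`, widths 4 through 7 give a single time step and width 8 is the first to give two, so the softmax in `out` sees more than one row per sample only from `w = 8` upward.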
When I run it on the CPU (test.py):
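import os
# Hide all GPUs so TensorFlow falls back to the CPU; set this before importing Keras/TF.
os.environ["CUDA_VISIBLE_DEVICES"] = ''

import numpy as np
from model import get_model

basemodel = get_model(32, 5000)
for w in range(4, 100):
    try:
        pred = basemodel.predict(np.zeros((1, 32, w, 1)))
    except Exception as E:
        print(w, E)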
The model runs fine for every width.
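When comparing CPU and GPU runs, it can help to collect the failing widths instead of only printing them. The following is a minimal sketch of that idea. `find_failing_widths` and `fake_predict` are hypothetical names, and `fake_predict` is a stand-in that simulates a failure so the helper can be run without Keras or a GPU.

```python
def find_failing_widths(predict_fn, widths):
    """Call predict_fn(w) for each width; return (w, message) for each failure."""
    failures = []
    for w in widths:
        try:
            predict_fn(w)
        except Exception as exc:
            failures.append((w, str(exc)))
    return failures


def fake_predict(w):
    """Stand-in for basemodel.predict; the failure here is simulated, not real."""
    if w >= 8:
        raise RuntimeError("simulated failure at width %d" % w)
    return None


failures = find_failing_widths(fake_predict, range(4, 12))
print([w for w, _ in failures])
```

With the real model, the call would look like `find_failing_widths(lambda w: basemodel.predict(np.zeros((1, 32, w, 1))), range(4, 100))`.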
But when I run it on the GPU (test.py):
import numpy as np
from model import get_model

basemodel = get_model(32, 5000)
for w in range(4, 100):
    try:
        pred = basemodel.predict(np.zeros((1, 32, w, 1)))
    except Exception as E:
        print(w, E)
Every width w >= 8 raises an error. The log:
InternalError: CUB segmented reduce errorinvalid configuration argument
[[Node: out_1/Max = Max[T=DT_FLOAT, Tidx=DT_INT32, keep_dims=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](out_1/add, timedistrib_10/stack/0)]]
[[Node: out_1/truediv/_175 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_79_out_1/truediv", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
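The node names in the log (`out_1/Max`, `out_1/truediv`) suggest the failure occurs inside the softmax of the `out` layer. For inputs with more than two dimensions, that softmax appears to be built from separate max, exp, sum and divide steps. This reading is an assumption based on the log, not a confirmed diagnosis. The dependency-free sketch below spells out those steps and shows that applying the softmax to a flattened 2-D view gives the same values. The names `softmax_last_axis` and `softmax_3d_via_2d` are hypothetical, and whether a 2-D formulation avoids the CUB error on the affected TF versions is not verified here.

```python
import math


def softmax_last_axis(rows):
    """Numerically stable softmax applied to each row (a list of floats)."""
    result = []
    for row in rows:
        row_max = max(row)                           # the max reduction
        exps = [math.exp(v - row_max) for v in row]  # exp(x - max)
        total = sum(exps)                            # the sum reduction
        result.append([e / total for e in exps])     # the division
    return result


def softmax_3d_via_2d(batch):
    """Softmax over the last axis of a (batch, time, classes) nested list.

    Flattens to (batch * time, classes), applies the row-wise softmax, then
    restores the original grouping. Assumes a non-empty batch in which every
    sample has the same number of time steps.
    """
    steps = len(batch[0])
    flat = [row for sample in batch for row in sample]
    probs = softmax_last_axis(flat)
    return [probs[i * steps:(i + 1) * steps] for i in range(len(batch))]


if __name__ == "__main__":
    sample = [[[0.0, 0.0], [1.0, 3.0]]]  # shape (1, 2, 2): two time steps
    print(softmax_3d_via_2d(sample))
```

Each row of the result sums to 1, whether the softmax is applied per time step or on the flattened view, so the two formulations should be interchangeable as far as the output values are concerned.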