saselovejulie closed this issue 7 years ago
@perhapszzy Could you help answer this if you have time? Much appreciated!
What error do you get? Do you have a detailed log?
@saselovejulie
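If you are using the GPU build of TensorFlow, this will raise an error. At the moment there is no particularly good way to avoid the problem; one workable option is to use Docker.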
@ScorpioCPH Here is the error message:
Traceback (most recent call last):
  File "E:\tools\Python35\lib\site-packages\tensorflow\python\client\session.py", line 1139, in _do_call
    return fn(*args)
  File "E:\tools\Python35\lib\site-packages\tensorflow\python\client\session.py", line 1121, in _run_fn
    status, run_metadata)
  File "E:\tools\Python35\lib\contextlib.py", line 66, in __exit__
    next(self.gen)
  File "E:\tools\Python35\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InternalError: Blas GEMM launch failed : a.shape=(5000, 784), b.shape=(784, 500), m=5000, n=500, k=784
  [[Node: layer1/MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/gpu:0"](_arg_x-input_0_0/_11, layer1/weights/read)]]
  [[Node: Mean/_13 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_35_Mean", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "E:/wrokspace/self-project/TensorFlowDemo/mnist/mnist_optimize/mnist_eval.py", line 56, in
Caused by op 'layer1/MatMul', defined at:
File "E:/wrokspace/self-project/TensorFlowDemo/mnist/mnist_optimize/mnist_eval.py", line 56, in
InternalError (see above for traceback): Blas GEMM launch failed : a.shape=(5000, 784), b.shape=(784, 500), m=5000, n=500, k=784
  [[Node: layer1/MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/gpu:0"](_arg_x-input_0_0/_11, layer1/weights/read)]]
  [[Node: Mean/_13 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_35_Mean", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
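@perhapszzy I looked into it and the problem was on my side: starting two processes at the same time on a single GPU causes trouble. Thanks for the suggestion; I'll try Docker when I get a chance to switch to Linux.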
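For anyone who hits the same "Blas GEMM launch failed" error when running two scripts on one GPU: one possible workaround, not confirmed anywhere in this thread, is to keep the evaluation script off the GPU entirely so that only the training script uses it. The sketch below assumes the variable is set before TensorFlow is imported; the helper name force_cpu_only is made up for illustration and is not part of TensorFlow.

```python
import os


def force_cpu_only(environ=None):
    """Hide all CUDA devices from libraries loaded later in this process.

    Assumption: this runs before `import tensorflow`. CUDA reads
    CUDA_VISIBLE_DEVICES when it initializes, so setting it later
    has no effect.
    """
    env = os.environ if environ is None else environ
    # "-1" matches no device index, so no GPU is visible to CUDA.
    env["CUDA_VISIBLE_DEVICES"] = "-1"
    return env


# Hypothetical placement: the very top of the evaluation script.
force_cpu_only()
# import tensorflow as tf  # import only after the variable is set
```

With this in place the evaluation script should fall back to the CPU, which is slower but usually adequate for periodic evaluation. If both scripts really need the GPU, TensorFlow 1.x also exposes gpu_options.allow_growth and gpu_options.per_process_gpu_memory_fraction on tf.ConfigProto to limit how much GPU memory each process claims; whether that resolves this particular setup is untested here.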
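In Chapter 5, the author covers saving the training results, and mnist_eval uses the saved results to validate on the test set. If I start both scripts at the same time, though, I get an error; starting either one on its own works fine. Is that because my machine has only one GPU, so only one script can run at a time? Thanks.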