lengstrom / fast-style-transfer

TensorFlow CNN for fast style transfer ⚡🖥🎨🖼

failed to run cuBLAS routine: CUBLAS_STATUS_NOT_SUPPORTED #260

Open fratamico opened 3 years ago

fratamico commented 3 years ago

When running style.py, I get the above error message. Full stack trace below.

I'm running the following: python style.py --style mypic.jpg --checkpoint-dir my_checkpoint_dir --content-weight 1.5e1 --checkpoint-iterations 1000 --batch-size 10

Versions:

I've tried allowing GPU memory growth and setting the GPU memory fraction, as specified here, but I still get the same error. Has anyone else solved this?
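For reference, the two settings mentioned above are usually applied through the session config. This is a minimal sketch, assuming a TF 1.x-style session (the traceback goes through session.py). The helper name `make_session_config` and the 0.8 fraction are illustrative, not part of style.py. The `TF_FORCE_GPU_ALLOW_GROWTH` environment variable is an alternative way to request memory growth and has to be set before TensorFlow is imported.

```python
import os

# Alternative to the config option below: ask TensorFlow to grow GPU memory
# on demand. Must be set before TensorFlow is imported.
os.environ.setdefault("TF_FORCE_GPU_ALLOW_GROWTH", "true")

try:
    import tensorflow as tf
except ImportError:
    tf = None  # lets the sketch load on a machine without TensorFlow


def make_session_config(memory_fraction=0.8):
    """Build a session config with memory growth and a memory cap.

    Hypothetical helper; the 0.8 default is only an example value.
    Returns None when TensorFlow is not importable.
    """
    if tf is None:
        return None
    config = tf.compat.v1.ConfigProto()
    # Allocate GPU memory incrementally instead of all at once.
    config.gpu_options.allow_growth = True
    # Cap the share of each visible GPU's memory this process may use.
    config.gpu_options.per_process_gpu_memory_fraction = memory_fraction
    return config


config = make_session_config()
```

The resulting config would be passed as `tf.compat.v1.Session(config=config)` wherever the session is created. As the post says, these settings did not make the error go away in this case.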


If using Keras pass *_constraint arguments to layers.
UID: 50
2021-01-06 18:29:08.600763: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-01-06 18:43:17.614048: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-01-06 18:44:26.855463: E tensorflow/stream_executor/cuda/cuda_blas.cc:428] failed to run cuBLAS routine: CUBLAS_STATUS_NOT_SUPPORTED
2021-01-06 18:44:26.855496: E tensorflow/stream_executor/cuda/cuda_blas.cc:2301] Internal: failed BLAS call, see log for details
Traceback (most recent call last):
  File "/home/lauren/anaconda3/envs/tf-gpu/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1367, in _do_call
    return fn(*args)
  File "/home/lauren/anaconda3/envs/tf-gpu/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1352, in _run_fn
    target_list, run_metadata)
  File "/home/lauren/anaconda3/envs/tf-gpu/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1445, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InternalError: Blas xGEMMBatched launch failed : a.shape=[10,65536,64], b.shape=[10,65536,64], m=64, n=64, k=65536, batch_size=10
     [[{{node ArithmeticOptimizer/FoldTransposeIntoMatMul_MatMul}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "style.py", line 167, in <module>
    main()
  File "style.py", line 147, in main
    for preds, losses, i, epoch in optimize(*args, **kwargs):
  File "src/optimize.py", line 118, in optimize
    train_step.run(feed_dict=feed_dict)
  File "/home/lauren/anaconda3/envs/tf-gpu/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 2391, in run
    _run_using_default_session(self, feed_dict, self.graph, session)
  File "/home/lauren/anaconda3/envs/tf-gpu/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 5347, in _run_using_default_session
    session.run(operation, feed_dict)
  File "/home/lauren/anaconda3/envs/tf-gpu/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 960, in run
    run_metadata_ptr)
  File "/home/lauren/anaconda3/envs/tf-gpu/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1183, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/lauren/anaconda3/envs/tf-gpu/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1361, in _do_run
    run_metadata)
  File "/home/lauren/anaconda3/envs/tf-gpu/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1386, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Blas xGEMMBatched launch failed : a.shape=[10,65536,64], b.shape=[10,65536,64], m=64, n=64, k=65536, batch_size=10
     [[{{node ArithmeticOptimizer/FoldTransposeIntoMatMul_MatMul}}]]
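The numbers in the `Blas xGEMMBatched launch failed` line can be unpacked with a few lines of Python. This is only a sketch for reading the log: `parse_gemm_error` is a hypothetical helper, and the 4-bytes-per-element figure assumes float32 operands, which the log does not state.

```python
import math
import re

# Error text copied from the log above.
ERROR_LINE = (
    "Blas xGEMMBatched launch failed : a.shape=[10,65536,64], "
    "b.shape=[10,65536,64], m=64, n=64, k=65536, batch_size=10"
)


def parse_gemm_error(line):
    """Extract operand shapes and GEMM dimensions from the error text."""
    shapes = {
        name: tuple(int(n) for n in dims.split(","))
        for name, dims in re.findall(r"(\w)\.shape=\[([\d,]+)\]", line)
    }
    scalars = {
        name: int(value)
        for name, value in re.findall(r"\b(m|n|k|batch_size)=(\d+)", line)
    }
    return shapes, scalars


shapes, dims = parse_gemm_error(ERROR_LINE)
batch, k, channels = shapes["a"]

# Assumption: float32 operands (4 bytes per element).
bytes_per_operand = batch * k * channels * 4
side = math.isqrt(k)

print("operand size (MiB):", bytes_per_operand / 2**20)
print("result per batch item:", (dims["m"], dims["n"]))
print("k is a perfect square:", side * side == k, "side =", side)
```

With the values from this log, each operand comes to 160 MiB under the float32 assumption, the result is a 64 x 64 matrix per batch item, and k = 256 x 256. That would be consistent with a 256 x 256 feature map with 64 channels, though this is a reading of the numbers rather than something the log confirms.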
syu-tan commented 3 years ago

I solved this issue in the same environment. The problem may be the 3090. Please try pip uninstall tensorflow followed by pip install tensorflow==2.4.1. It doesn't recognize the name of the GPU, but it works fine.
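To confirm which TensorFlow version the environment actually resolves before and after the reinstall, a small check like the one below may help. It is a sketch using only the standard library (`importlib.metadata`, Python 3.8+; the Python 3.7 environment in the traceback would need the `importlib_metadata` backport). The helper names are made up for illustration, and checking the `tensorflow-gpu` distribution name as well is an assumption about how the package was installed.

```python
import re
from importlib import metadata

# Version suggested in the comment above.
MIN_VERSION = (2, 4, 1)


def parse_version(text):
    """Turn '2.4.1' (or '2.4.0rc1') into a comparable tuple such as (2, 4, 1)."""
    numbers = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 3:
        numbers.append(0)
    return tuple(numbers)


def installed_tensorflow_version():
    """Return the installed TensorFlow version string, or None if absent."""
    for dist in ("tensorflow", "tensorflow-gpu"):
        try:
            return metadata.version(dist)
        except metadata.PackageNotFoundError:
            continue
    return None


def is_at_least(version_text, minimum=MIN_VERSION):
    """True when version_text is the same as or newer than minimum."""
    return parse_version(version_text) >= minimum


found = installed_tensorflow_version()
if found is None:
    print("TensorFlow is not installed in this environment")
else:
    print("TensorFlow", found, "meets 2.4.1:", is_at_least(found))
```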