Open baltam opened 1 year ago
The demo runs on CPU, but I can't get it to run on GPU. Help!!!
1. Machine configuration

OS: Windows 10

Hardware:
GPU: 3070 Ti, 8 GB VRAM
CPU: i9-7980XE
CUDA 10.0, cuDNN 7.6.5
RAM: 64 GB
Software dependencies:

pandas==0.24.2
regex==2019.4.14
h5py==2.9.0
numpy==1.16.2
tensorboard==1.13.1
tensorflow-gpu==1.13.1
tqdm==4.31.1
requests==2.22.0
protobuf==3.19.0
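As a quick sanity check on this environment (my own addition, not part of the original issue), one can confirm that the installed tensorflow-gpu 1.13.1 build is CUDA-enabled and can see a GPU at all; a minimal sketch:

```python
# Quick environment sanity check (not from the issue): confirm this is the GPU
# build of TensorFlow and that it can initialize a GPU device.
import tensorflow as tf

print(tf.__version__)                # expect 1.13.1
print(tf.test.is_built_with_cuda())  # True for tensorflow-gpu builds
print(tf.test.is_gpu_available())    # True only if TF can actually initialize the GPU
```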
2. Error, troubleshooting, and workaround

Console output and traceback:

模型加载好啦!🍺Bilibili干杯🍺
现在将你的作文题精简为一个句子,粘贴到这里:⬇️,然后回车
**********************************************作文题目**********************************************
苦练本手,方能妙手随成
**********************************************作文题目**********************************************
正在生成第 1 of 1 篇文章 ......
EssayKiller正在飞速写作中,请稍后......
2022-11-27 19:19:37.206277: E tensorflow/stream_executor/cuda/cuda_blas.cc:428] failed to run cuBLAS routine: CUBLAS_STATUS_EXECUTION_FAILED
2022-11-27 19:19:37.206746: E tensorflow/stream_executor/cuda/cuda_blas.cc:2301] Internal: failed BLAS call, see log for details
Traceback (most recent call last):
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\client\session.py", line 1365, in _do_call
    return fn(*args)
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\client\session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\client\session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found.
  (0) Internal: Blas xGEMMBatched launch failed : a.shape=[24,11,64], b.shape=[24,11,64], m=11, n=11, k=64, batch_size=24
         [[{{node sample_sequence/newslm/layer00/MatMul}}]]
         [[sample_sequence/while/Identity/_1594]]
  (1) Internal: Blas xGEMMBatched launch failed : a.shape=[24,11,64], b.shape=[24,11,64], m=11, n=11, k=64, batch_size=24
         [[{{node sample_sequence/newslm/layer00/MatMul}}]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "d:/ly/EssayKiller_V2-master/LanguageNetwork/GPT2/scripts/demo.py", line 220, in <module>
    p_for_topp: top_p[chunk_i]})
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\client\session.py", line 956, in run
    run_metadata_ptr)
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\client\session.py", line 1180, in _run
    feed_dict_tensor, options, run_metadata)
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\client\session.py", line 1359, in _do_run
    run_metadata)
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\client\session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found.
  (0) Internal: Blas xGEMMBatched launch failed : a.shape=[24,11,64], b.shape=[24,11,64], m=11, n=11, k=64, batch_size=24
         [[node sample_sequence/newslm/layer00/MatMul (defined at C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]
         [[sample_sequence/while/Identity/_1594]]
  (1) Internal: Blas xGEMMBatched launch failed : a.shape=[24,11,64], b.shape=[24,11,64], m=11, n=11, k=64, batch_size=24
         [[node sample_sequence/newslm/layer00/MatMul (defined at C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]
0 successful operations.
0 derived errors ignored.
Original stack trace for 'sample_sequence/newslm/layer00/MatMul':
  File "d:/ly/EssayKiller_V2-master/LanguageNetwork/GPT2/scripts/demo.py", line 188, in <module>
    do_topk=False)
  File "d:\ly\EssayKiller_V2-master\LanguageNetwork\GPT2\scripts\modeling.py", line 768, in sample
    do_topk=do_topk)
  File "d:\ly\EssayKiller_V2-master\LanguageNetwork\GPT2\scripts\modeling.py", line 740, in initialize_from_context
    batch_size=batch_size, p_for_topp=p_for_topp, cache=None, do_topk=do_topk)
  File "d:\ly\EssayKiller_V2-master\LanguageNetwork\GPT2\scripts\modeling.py", line 714, in sample_step
    cache=cache,
  File "d:\ly\EssayKiller_V2-master\LanguageNetwork\GPT2\scripts\modeling.py", line 499, in __init__
    cache=layer_cache,
  File "d:\ly\EssayKiller_V2-master\LanguageNetwork\GPT2\scripts\modeling.py", line 198, in attention_layer
    attention_scores = tf.matmul(query, key, transpose_b=True)
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\util\dispatch.py", line 180, in wrapper
    return target(*args, **kwargs)
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\ops\math_ops.py", line 2716, in matmul
    return batch_mat_mul_fn(a, b, adj_x=adjoint_a, adj_y=adjoint_b, name=name)
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\ops\gen_math_ops.py", line 1712, in batch_mat_mul_v2
    "BatchMatMulV2", x=x, y=y, adj_x=adj_x, adj_y=adj_y, name=name)
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\framework\op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\framework\ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()
2.1 Key information extraction

The key error line is:

(0) Internal: Blas xGEMMBatched launch failed : a.shape=[24,11,64], b.shape=[24,11,64], m=11, n=11, k=64, batch_size=24
2.2 Problem analysis

Searching the error message on Bing suggests that the failure is mainly caused by running out of GPU memory.
2.3 Idea 1

If GPU memory is short, use less of it and let the program allocate it on demand instead of all at once; that should solve the problem. So I added the following statements:
os.environ["CUDA_VISIBLE_DEVICES"] = "0" tf_config = tf.compat.v1.ConfigProto(allow_soft_placement=True) tf_config.gpu_options.allow_growth=True # tf_config.gpu_options.per_process_gpu_memory_fraction = 0.6
...but it still fails with the same error. Is it really a lack of GPU memory?...
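One way to test the out-of-memory theory (my own suggestion, not something from the repo) is to watch actual VRAM usage in a second terminal while demo.py runs. A rough sketch, assuming nvidia-smi is on PATH:

```python
# Rough sketch: poll nvidia-smi once per second and print used/total VRAM.
# If the GPU run errors out while memory.used is nowhere near 8 GB, insufficient
# VRAM is probably not the real cause.
import subprocess
import time

while True:
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader"],
        encoding="utf-8",
    )
    print(out.strip())  # e.g. "7342 MiB, 8192 MiB"
    time.sleep(1)
```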
Alternative 1

Since the GPU run keeps failing, skip the GPU entirely and try the CPU. So I modified the statement as follows:
os.environ["CUDA_VISIBLE_DEVICES"] = " " #将0改为none
Result: the program now runs to completion, but the CPU is of course much slower than the GPU: generating one essay takes roughly 10 minutes, with CPU utilization around 40-50%.