THUDM / ChatGLM3

ChatGLM3 series: Open Bilingual Chat LLMs
Apache License 2.0
13.33k stars 1.55k forks

CUDA crashes after multiple rounds of code correction and execution #573

Closed HuChundong closed 9 months ago

HuChundong commented 9 months ago

System Info

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 546.01                 Driver Version: 546.01       CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4090      WDDM  | 00000000:01:00.0  On |                  Off |
|  0%   42C    P5              37W / 450W |  20450MiB / 24564MiB |      1%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

Who can help?

No response

Information

Reproduction

Prompt: Use Python to plot the temperature curve for January through December. The data are: 2.0, 4.9, 7.0, 23.2, 25.6, 76.7, 135.6, 162.2, 32.6, 20.0, 6.4, 3.3

Execution result:

ValueError                                Traceback (most recent call last)
Cell In[4], line 25
     22     plt.show()
     24 # Call the function
---> 25 draw_temp_chart()

Cell In[4], line 11, in draw_temp_chart()
      9 # Plot the temperature curve
     10 for i in range(len(months)):
---> 11     ax.plot(months[:i+1], temperatures[i], label=f'{months[i]}')
     13 # Set the title and axis labels
     14 ax.set_title('Monthly Temperature Changes')

File ~\.conda\envs\wxbot\lib\site-packages\matplotlib\axes\_axes.py:1721, in Axes.plot(self, scalex, scaley, data, *args, **kwargs)
   1478 """
   1479 Plot y versus x as lines and/or markers.
   1480
   (...)
   1718 ('green') or hex strings ('#008000').
   1719 """
   1720 kwargs = cbook.normalize_kwargs(kwargs, mlines.Line2D)
-> 1721 lines = [*self._get_lines(self, *args, data=data, **kwargs)]
   1722 for line in lines:
   1723     self.add_line(line)

File ~\.conda\envs\wxbot\lib\site-packages\matplotlib\axes\_base.py:303, in _process_plot_var_args.__call__(self, axes, data, *args, **kwargs)
    301     this += args[0],
    302     args = args[1:]
--> 303 yield from self._plot_args(
    304     axes, this, kwargs, ambiguous_fmt_datakey=ambiguous_fmt_datakey)

File ~\.conda\envs\wxbot\lib\site-packages\matplotlib\axes\_base.py:499, in _process_plot_var_args._plot_args(self, axes, tup, kwargs, return_kwargs, ambiguous_fmt_datakey)
    496     axes.yaxis.update_units(y)
    498 if x.shape[0] != y.shape[0]:
--> 499     raise ValueError(f"x and y must have same first dimension, but "
    500                      f"have shapes {x.shape} and {y.shape}")
    501 if x.ndim > 2 or y.ndim > 2:
    502     raise ValueError(f"x and y can be no greater than 2D, but have "
    503                      f"shapes {x.shape} and {y.shape}")

ValueError: x and y must have same first dimension, but have shapes (2,) and (1,)

RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasGemmEx( handle, opa, opb, m, n, k, &falpha, a, CUDA_R_16F, lda, b, CUDA_R_16F, ldb, &fbeta, c, CUDA_R_16F, ldc, CUDA_R_32F, CUBLAS_GEMM_DEFAULT_TENSOR_OP)
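The `ValueError` in the traceback comes from the generated loop passing a growing slice, `months[:i+1]`, as x and a single value, `temperatures[i]`, as y. On the second iteration the shapes are (2,) and (1,), which matplotlib rejects. A minimal sketch of a corrected version follows. The rest of the generated code is not shown in the issue, so `MONTHS`, `TEMPERATURES`, and the function's parameters are assumptions; only `draw_temp_chart` and the chart title come from the traceback.

```python
# Temperature values copied from the prompt in the issue.
TEMPERATURES = [2.0, 4.9, 7.0, 23.2, 25.6, 76.7, 135.6, 162.2, 32.6, 20.0, 6.4, 3.3]
# Assumed x values: month numbers 1..12 (the original `months` list is not shown).
MONTHS = list(range(1, 13))


def draw_temp_chart(months, temperatures):
    """Plot a single line from two equally long sequences."""
    if len(months) != len(temperatures):
        raise ValueError("months and temperatures must have the same length")

    # Imported lazily so the length check above works without matplotlib installed.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    # One call with the full x and y sequences replaces the per-month loop.
    ax.plot(months, temperatures, marker="o", label="Temperature")
    ax.set_title("Monthly Temperature Changes")
    ax.set_xlabel("Month")
    ax.set_ylabel("Temperature")
    ax.set_xticks(months)
    ax.legend()
    plt.show()
```

Calling `draw_temp_chart(MONTHS, TEMPERATURES)` should draw the twelve points as one curve. This addresses only the plotting error; it does not explain the subsequent CUDA failure.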

Expected behavior

The corrected code should run correctly, or at least not crash.

zRzRzRzRzRzRzR commented 9 months ago

Was this using the code-generation feature? How many times did it execute?

HuChundong commented 9 months ago

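> Was this using the code-generation feature? How many times did it execute?

Yes. I set the maximum number of attempts to 5. When the model keeps generating faulty code, the error above usually appears. Another prompt that triggers it:

Use Python to complete this task: draw a line chart of projected photovoltaic generation capacity and expected load. The years are 2023, 2024, 2025, 2026, 2027; the photovoltaic generation capacity data are [1941.2, 2498, 2924, 3454, 4011]; the load forecast data are [6731, 6924, 7165, 7396, 7578]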

zRzRzRzRzRzRzR commented 9 months ago

Understood. This feature does perform poorly at the moment. How much GPU memory was in use before you reached the fifth attempt?

HuChundong commented 9 months ago

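> Understood. This feature does perform poorly at the moment. How much GPU memory was in use before you reached the fifth attempt?

Without quantization, 22 GB of the 24 GB is in use, leaving 2 GB free. I am also running some other services.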
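To measure memory use before each retry rather than estimate it, one option is to poll `nvidia-smi` from Python. The sketch below is an assumption about how one might do this and is not part of ChatGLM3; the helper names `parse_memory_csv` and `gpu_memory_mib` are hypothetical. It relies on the `--query-gpu=memory.used,memory.total` and `--format=csv,noheader,nounits` options of `nvidia-smi`.

```python
import subprocess


def parse_memory_csv(text):
    """Parse lines such as '20450, 24564' into (used_mib, total_mib) tuples, one per GPU."""
    rows = []
    for line in text.strip().splitlines():
        used, total = (int(part.strip()) for part in line.split(","))
        rows.append((used, total))
    return rows


def gpu_memory_mib():
    """Ask nvidia-smi for the used and total memory of every visible GPU."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if nvidia-smi fails
    )
    return parse_memory_csv(result.stdout)
```

Calling `gpu_memory_mib()` before each attempt and logging `total - used` would show whether free memory shrinks across retries. For the figures in the System Info output (20450 MiB used of 24564 MiB), the free amount is 4114 MiB.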

zRzRzRzRzRzRzR commented 9 months ago

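If you copy the code generated before the crash and cannot get it to run yourself, then most likely the model simply cannot generate code of good enough quality.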
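One way to follow this suggestion is to run each generated snippet in a fresh Python process, outside the model's session, and check whether it fails on its own. A minimal sketch, assuming the snippet is available as a string; `run_snippet` is a hypothetical helper and not part of ChatGLM3:

```python
import subprocess
import sys


def run_snippet(code, timeout=30):
    """Run `code` in a separate interpreter and return (succeeded, stderr_text)."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # The child process is killed when the timeout expires.
        return False, f"timed out after {timeout} s"
    return result.returncode == 0, result.stderr
```

A snippet that raises the same `ValueError` here fails regardless of GPU state, which helps separate code-quality problems from the CUDA crash. Snippets that call `plt.show()` may block until the window is closed, which is why the sketch takes a timeout.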