649453932 / Bert-Chinese-Text-Classification-Pytorch

Chinese text classification using BERT and ERNIE
MIT License
4.01k stars 898 forks

Out of memory when the dataset is very large #69

Open xuanxuangg68 opened 3 years ago

xuanxuangg68 commented 3 years ago

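I have roughly 1 billion short texts, about 42 GB in total. Every time I load the data it runs out of memory. I have already tried adjusting batch size and pad size, and neither helps. Does anyone know how to solve this?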

lwhere commented 3 years ago

Spend the money on a better GPU.

AdamFocus commented 3 years ago

Did you manage to solve the problem?

xuanxuangg68 commented 3 years ago

The only way is to feed the data in batches, which means the code has to be modified.
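
A minimal sketch of what batch-wise loading could look like, not the repository's actual code. It assumes the training file holds one `text<TAB>label` sample per line; the names `iter_samples`, `iter_batches`, and `demo` are hypothetical. The point is that the file is read lazily, so only one batch of raw samples is held in memory at a time instead of the whole dataset.

```python
import itertools
import os
import tempfile
from typing import Iterator, List, Tuple

Sample = Tuple[str, int]


def iter_samples(path: str) -> Iterator[Sample]:
    """Lazily yield (text, label) pairs, reading one line at a time."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\r\n")
            if not line:
                continue  # skip blank lines
            # Assumed layout: text<TAB>label (split on the last tab only)
            text, label = line.rsplit("\t", 1)
            yield text, int(label)


def iter_batches(path: str, batch_size: int) -> Iterator[List[Sample]]:
    """Group the lazy sample stream into lists of at most batch_size items."""
    samples = iter_samples(path)
    while True:
        batch = list(itertools.islice(samples, batch_size))
        if not batch:
            return
        yield batch


def demo() -> List[int]:
    """Write a tiny file and return the sizes of the batches read back."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="utf-8"
    ) as f:
        for i in range(5):
            f.write(f"sample text {i}\t{i % 2}\n")
        path = f.name
    try:
        return [len(batch) for batch in iter_batches(path, batch_size=2)]
    finally:
        os.remove(path)
```

With this shape, tokenization and padding would happen per batch inside the training loop rather than up front. Shuffling is not handled here; one option is to shuffle the file on disk beforehand.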

AdamFocus commented 3 years ago

Was the error you got back then an out-of-memory error, or a cuDNN problem?

xuanxuangg68 commented 3 years ago

Mine was OOM.

xuanxuangg68 commented 3 years ago

I have eight V100 32 GB cards and even that is not enough, which is hard to believe.

AdamFocus commented 3 years ago

OK, thanks. On my side it used to run fine, and then it suddenly started failing with cuDNN error: CUDNN_STATUS_NOT_INITIALIZED. I'm baffled too.