CPU training cannot reach full load - Githubissues

luyishisi / Anti-Anti-Spider

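More and more websites have anti-crawler features: some hide key data in images, and some use inhumane CAPTCHAs. This repository collects anti-anti-crawler code, improving technique by taking on websites with different characteristics (without malicious intent). (Submissions of hard-to-scrape websites are welcome.) (The project is paused for work reasons.)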

https://www.urlteam.cn

7.28k stars, 2.17k forks

CPU training cannot reach full load #3

Open ghost opened 7 years ago

ghost commented 7 years ago

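24 cores and 48 threads on the 2695 v2, and it never reaches full load. Utilization stays around 30%. Each training step takes about 1 second.

With the GPU (GTX960), each training step takes about 0.5 seconds.

Ubuntu 16.04, CUDA 8.0, cuDNN 5.1. TensorFlow was compiled locally.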

ghost commented 7 years ago

Is this related to optimization? Is only one thread used to generate the images, making that the bottleneck?

luyishisi commented 7 years ago

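On one hand, there is indeed a thread bottleneck. Beyond that, check whether the generated images are being saved as local files; there may be a disk I/O bottleneck.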

ghost commented 7 years ago

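First, no local files are generated. Second, the disk is an Intel 750 1.2T. According to Intel's documentation, it should deliver several hundred thousand IOPS, with read speeds above 2GBps and write speeds above 1GBps.

Also, iotop shows actual reads and writes of 0. So it must simply be that generating the CAPTCHAs is too slow.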
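One way to check that guess is to time the two stages separately. The following is a minimal sketch, not code from the repository: `generate_batch` and `train_step` are hypothetical stand-ins for the CAPTCHA generator and the TensorFlow training call, and the sleeps only simulate their cost.

```python
import time

def time_stages(generate_batch, train_step, steps=5):
    """Return average seconds per step spent generating data and training."""
    gen_total = 0.0
    train_total = 0.0
    for _ in range(steps):
        t0 = time.perf_counter()
        batch = generate_batch()
        t1 = time.perf_counter()
        train_step(batch)
        t2 = time.perf_counter()
        gen_total += t1 - t0
        train_total += t2 - t1
    return gen_total / steps, train_total / steps

# Hypothetical stand-ins so the sketch runs on its own.
def fake_generate_batch():
    time.sleep(0.02)   # pretend image generation is the slow part
    return [0] * 64

def fake_train_step(batch):
    time.sleep(0.002)  # pretend the training step is fast

gen_s, train_s = time_stages(fake_generate_batch, fake_train_step)
print(f"generate: {gen_s * 1000:.1f} ms, train: {train_s * 1000:.1f} ms")
```

If the generation time dominates, as the later measurements in this thread suggest, the training device is waiting on input rather than on disk.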

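On 2017/03/31 at 11:19, luyishisi notifications@github.com wrote:

On one hand, there is indeed a thread bottleneck. Beyond that, check whether the generated images are being saved as local files; there may be a disk I/O bottleneck.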


leng-yue commented 6 years ago

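Is this related to optimization? Is only one thread used to generate the images, making that the bottleneck?

This is indeed related to optimization. In fact, in normal use you should not generate images and train at the same time.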
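A minimal sketch of that separation: generate the dataset once, save it, and let the training loop only read it back. `make_sample` is a hypothetical placeholder for the CAPTCHA renderer, and the single pickle file is an illustrative choice rather than the repository's format.

```python
import os
import pickle
import random
import tempfile

def make_sample(rng):
    """Hypothetical stand-in for rendering one CAPTCHA: (pixels, label)."""
    label = "".join(rng.choice("0123456789") for _ in range(4))
    pixels = [rng.random() for _ in range(16)]  # placeholder for image data
    return pixels, label

def build_dataset(path, n_samples, seed=0):
    """Generate all samples up front and store them in one file."""
    rng = random.Random(seed)
    samples = [make_sample(rng) for _ in range(n_samples)]
    with open(path, "wb") as f:
        pickle.dump(samples, f)

def iter_batches(path, batch_size):
    """Training side: load the stored samples and yield fixed-size batches."""
    with open(path, "rb") as f:
        samples = pickle.load(f)
    for i in range(0, len(samples), batch_size):
        yield samples[i:i + batch_size]

path = os.path.join(tempfile.mkdtemp(), "captcha_dataset.pkl")
build_dataset(path, n_samples=256)
batches = list(iter_batches(path, batch_size=64))
print(len(batches), "batches of", len(batches[0]), "samples")
```

The generation cost is then paid once, before training starts, instead of on every step.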

kotori2 commented 5 years ago

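I tried it here: generating a batch took 254ms, while GPU training took 13ms... The result is that the GPU sits completely idle, and only one of the CPU's six cores is slowly generating training batches.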
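The two timings above (254ms and 13ms) are enough for a rough estimate of how idle the GPU is. The sketch below is back-of-the-envelope arithmetic; the linear-scaling assumption behind the worker count is an idealization, not something measured in this thread.

```python
import math

gen_ms = 254.0    # time to generate one batch (from the comment above)
train_ms = 13.0   # time for one GPU training step (from the comment above)

# Serial loop: generate a batch, then train on it, then generate again.
serial_util = train_ms / (gen_ms + train_ms)

# One generator perfectly overlapped with training:
# each step takes as long as the slower of the two stages.
overlap_util = train_ms / max(gen_ms, train_ms)

# Idealized number of parallel generators needed to keep up with the GPU,
# assuming generation speed scales linearly with workers (an assumption).
workers_needed = math.ceil(gen_ms / train_ms)

print(f"serial GPU utilization:  {serial_util:.1%}")
print(f"overlap GPU utilization: {overlap_util:.1%}")
print(f"generators needed:       {workers_needed}")
```

Under these assumptions the GPU is busy about 4.9% of the time in a serial loop and about 5.1% with a single overlapped generator, so overlapping alone barely helps; the generation stage itself has to become roughly 20 times faster.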

kotori2 commented 5 years ago

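I tried optimizing it by running 12 threads to generate image data simultaneously, which brought the training-data generation time down to 160ms. Adding more threads does not seem to make much difference.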
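The kind of change described here might look like the following sketch, which renders the samples of one batch on a pool of 12 threads. It is not the commenter's actual code: `make_sample` and `generate_batch` are hypothetical names, and the random numbers stand in for real image rendering.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def make_sample(seed):
    """Hypothetical stand-in for rendering one CAPTCHA image and its label."""
    rng = random.Random(seed)
    label = "".join(rng.choice("0123456789") for _ in range(4))
    pixels = [rng.random() for _ in range(16)]  # placeholder for image data
    return pixels, label

def generate_batch(pool, batch_size, step):
    """Render the samples of one batch in parallel on the given pool."""
    seeds = range(step * batch_size, (step + 1) * batch_size)
    return list(pool.map(make_sample, seeds))  # map preserves input order

with ThreadPoolExecutor(max_workers=12) as pool:  # 12 threads, as in the comment
    batch = generate_batch(pool, batch_size=64, step=0)

print(len(batch), "samples, first label:", batch[0][1])
```

In CPython, threads share the global interpreter lock, so rendering code that spends most of its time in pure Python may not scale with the thread count. That is one possible explanation for the plateau reported above, not a confirmed one; swapping in `concurrent.futures.ProcessPoolExecutor` is a common thing to try in that situation.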