brucefan1983 / CUDA-Programming

Sample codes for my CUDA programming book
GNU General Public License v3.0

Why is the peak double-precision performance of the GeForce RTX 2070 and RTX 2080 series so low? #10

Closed jackyh closed 2 years ago

brucefan1983 commented 3 years ago

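On GeForce-series GPUs, peak double-precision floating-point throughput is only 1/32 of the single-precision peak. This is simply a design choice; there is no deep technical reason behind it. If anything, the reason is probably more about marketing, because it makes the advantage of the Tesla-series GPUs in scientific computing stand out more.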



Luo-Chang commented 2 years ago

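See the footnotes to the Compute Capability 7.x section of NVIDIA's "CUDA C++ Programming Guide": cards of compute capability 7.5, i.e. the RTX 2070 and RTX 2080 series, have only two double-precision units per SM (the comparable Tesla cards have 32). As the author said above, this is most likely a design choice made for market differentiation.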

Original text: 2 FP64 cores for double-precision arithmetic operations for devices of compute capabilities 7.5