jasperzhong / read-papers-and-code

My paper/code reading notes in Chinese

Book reading | Computer Systems: A Programmer's Perspective #64

Open jasperzhong opened 4 years ago

jasperzhong commented 4 years ago

A long-running "someday" project of mine.

jasperzhong commented 4 years ago

Ch. 5: Optimizing Program Performance

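A very useful chapter. It covers a lot of optimization techniques. Since the compiler can already do many optimizations on its own, what the book mostly covers are the subtle details that block the compiler from optimizing.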

memory aliasing

void twiddle1(long *xp, long *yp) {
    *xp += *yp;
    *xp += *yp;
}

void twiddle2(long *xp, long *yp) {
    *xp += 2* *yp;
}

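twiddle2 is more efficient than twiddle1: twiddle1 needs 6 memory references (two reads of *xp, two reads of *yp, two writes of *xp), while twiddle2 needs only 3. Keep in mind that a memory access that misses the cache can take long enough for the CPU to execute on the order of a hundred instructions. At first glance the two look equivalent. But if xp and yp point to the same location, they behave completely differently: twiddle1 leaves *xp at 4 times its original value, while twiddle2 leaves it at 3 times. Because the compiler cannot rule this case out, it will not generate twiddle2-style code for twiddle1.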
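To make the aliasing case concrete, here is a minimal sketch that calls both functions first with distinct pointers and then with the same pointer. The helpers run_distinct, run_aliased, and aliasing_demo are hypothetical names added for illustration; they are not from the book.

```cpp
#include <cassert>
#include <cstdio>

void twiddle1(long *xp, long *yp) {
    *xp += *yp;
    *xp += *yp;
}

void twiddle2(long *xp, long *yp) {
    *xp += 2 * *yp;
}

using TwiddleFn = void (*)(long *, long *);

// Hypothetical helper: xp and yp point to two different variables.
long run_distinct(TwiddleFn f, long x, long y) {
    f(&x, &y);
    return x;
}

// Hypothetical helper: xp and yp both point to the same variable.
long run_aliased(TwiddleFn f, long x) {
    f(&x, &x);
    return x;
}

void aliasing_demo() {
    // Without aliasing, both versions compute x + 2*y.
    assert(run_distinct(twiddle1, 5, 3) == run_distinct(twiddle2, 5, 3));

    // With aliasing, the results diverge.
    std::printf("aliased twiddle1: %ld\n", run_aliased(twiddle1, 5));  // 5 -> 10 -> 20 (4x)
    std::printf("aliased twiddle2: %ld\n", run_aliased(twiddle2, 5));  // 5 + 2*5 = 15 (3x)
}
```

Calling aliasing_demo() from main shows the 4x versus 3x difference described above, which is exactly the case the compiler has to stay safe against.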

 g++ -S -O3 memory_aliasing.cc

[Image: screenshot of the generated assembly]

As the assembly output shows, even with -O3 the compiler does not turn twiddle1 into the twiddle2 form.

Among compilers, GCC is considered adequate, but not exceptional, in terms of its optimization capabilities. It performs basic optimizations, but it does not perform

yzh119 commented 4 years ago

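CUDA has the __restrict__ keyword: https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#restrict It tells the compiler that the pointers do not point to overlapping memory. I don't know whether C++ has something similar.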

jasperzhong commented 4 years ago

CUDA has the __restrict__ keyword: https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#restrict

C actually already has this (the restrict qualifier, since C99); I only just learned that this is what it is for. @yzh119
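As for C++: standard C++ has no restrict keyword, but GCC and Clang accept __restrict__ as an extension. Below is a minimal sketch, assuming one of those compilers. The names twiddle1_restrict and restrict_demo are hypothetical and only for illustration. With the no-alias promise in place, the compiler is allowed to load *yp once and merge the two additions, though whether it actually does depends on the compiler and flags.

```cpp
#include <cassert>

// __restrict__ is a GCC/Clang extension in C++ (an assumption about the
// toolchain). It promises that xp and yp never refer to the same object.
void twiddle1_restrict(long *__restrict__ xp, long *__restrict__ yp) {
    *xp += *yp;
    *xp += *yp;
}

void restrict_demo() {
    long x = 5, y = 3;
    twiddle1_restrict(&x, &y);  // distinct objects, so the promise holds
    assert(x == 11);            // 5 + 3 + 3
    // twiddle1_restrict(&x, &x) would break the promise: undefined behavior.
}
```

The trade-off is that correctness now rests on the caller: passing aliased pointers is undefined behavior, and the compiler will not check it.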

jasperzhong commented 4 years ago

Found an article about __restrict__; it is well written.

http://assemblyrequired.crashworks.org/load-hit-stores-and-the-__restrict-keyword/