gangliao closed this issue 6 years ago.
https://cmake.org/cmake/help/v3.0/module/ExternalProject.html
Paddle's dependencies glog, gflags, gtest, zlib, etc. can all be handled the same way (already tested on Mac and ubuntu): https://github.com/gangliao/CodeCoverageCMake/blob/master/cmake/third_party.cmake
For example:
ExternalProject_Add(gflags
    PREFIX          ${gflags_PREFIX}
    GIT_REPOSITORY  "https://github.com/gflags/gflags.git"
    GIT_TAG         "v2.1.2"
    UPDATE_COMMAND  ""
    INSTALL_DIR     ${gflags_INSTALL}
    CMAKE_ARGS      -DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE}
                    -DCMAKE_INSTALL_PREFIX=${gflags_INSTALL}
                    -DBUILD_SHARED_LIBS=OFF
                    -DBUILD_STATIC_LIBS=ON
                    -DBUILD_PACKAGING=OFF
                    -DBUILD_TESTING=OFF
                    -DBUILD_NC_TESTS=OFF
                    -DBUILD_CONFIG_TESTS=OFF
                    -DINSTALL_HEADERS=ON
                    -DCMAKE_C_FLAGS=${GFLAGS_C_FLAGS}
                    -DCMAKE_CXX_FLAGS=${GFLAGS_CXX_FLAGS}
    LOG_DOWNLOAD    1
    LOG_INSTALL     1
)
Many thanks to @gangliao's exhaustive information collection and comparison between cmake and Bazel.
It looks to me that even if we base Paddle's build rules on top of all those works done by Tensorflow and Bazel team, we would still have a complex configuration and building system.
Our original motivation for considering Bazel as a candidate was that we want fine-grained build rules for Paddle's source code, where fine-grained means that each pair of xxx.h and xxx.cc is built in one rule (a library), and the corresponding xxx_test.cc in another rule (a unit test). Can we do this using cmake?
-- It seems that we can. Could @gangliao please provide some examples of this, so we could confidently abandon Bazel and continue with cmake? Thanks!
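As one possible answer, here is a minimal sketch of how such fine-grained rules could look in CMake. The helper names cc_library/cc_test and the SRCS/DEPS keywords are borrowed from Bazel's vocabulary for illustration, not an existing CMake API, and the ddim targets in the usage comment are hypothetical:

```cmake
# Sketch only: hypothetical helpers mimicking Bazel's fine-grained rules.
include(CMakeParseArguments)

# One xxx.h / xxx.cc pair becomes one library target.
function(cc_library TARGET_NAME)
  cmake_parse_arguments(ARG "" "" "SRCS;DEPS" ${ARGN})
  add_library(${TARGET_NAME} STATIC ${ARG_SRCS})
  if(ARG_DEPS)
    target_link_libraries(${TARGET_NAME} ${ARG_DEPS})
  endif()
endfunction()

# The matching xxx_test.cc becomes one unit-test target.
function(cc_test TARGET_NAME)
  cmake_parse_arguments(ARG "" "" "SRCS;DEPS" ${ARGN})
  add_executable(${TARGET_NAME} ${ARG_SRCS})
  target_link_libraries(${TARGET_NAME} ${ARG_DEPS} gtest gtest_main)
  add_test(NAME ${TARGET_NAME} COMMAND ${TARGET_NAME})
endfunction()

# Usage, one rule per pair (target names hypothetical):
#   cc_library(ddim SRCS ddim.cc)
#   cc_test(ddim_test SRCS ddim_test.cc DEPS ddim)
```

With helpers like these, each source directory's CMakeLists.txt reads much like a Bazel BUILD file.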
I have another question about bazel: when bazel depends on a third-party library, how do we define the rule that downloads the library from the internet? It seems we have to specify the download URL and a sha256... but how do we obtain that information? I assumed there would be a maven-style mirror where we could look these values up and copy them over, but there doesn't seem to be one.
native.new_http_archive(
    name = "highwayhash",
    urls = [
        # I assumed bazel-mirror.storage.googleapis.com was a mirror, but it seems to be inaccessible
        "http://bazel-mirror.storage.googleapis.com/github.com/google/highwayhash/archive/dfcb97ca4fe9277bf9dc1802dd979b071896453b.tar.gz",
        "https://github.com/google/highwayhash/archive/dfcb97ca4fe9277bf9dc1802dd979b071896453b.tar.gz",
    ],
    sha256 = "0f30a15b1566d93f146c8d149878a06e91d9bb7ec2cfd76906df62a82be4aac9",
    strip_prefix = "highwayhash-dfcb97ca4fe9277bf9dc1802dd979b071896453b",
    build_file = str(Label("//third_party:highwayhash.BUILD")),
)
What are your thoughts?
They can all be found in the third-party library's releases or tags.
What about sha256 = "0f30a15b1566d93f146c8d149878a06e91d9bb7ec2cfd76906df62a82be4aac9"?
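For what it's worth, the cmake side offers the same verification: ExternalProject_Add accepts a URL_HASH argument, and CMake can compute the hash of an already-downloaded archive itself. A sketch (the local file path is hypothetical):

```cmake
# ExternalProject_Add verifies a downloaded archive when URL_HASH is given:
#   URL_HASH SHA256=0f30a15b1566d93f146c8d149878a06e91d9bb7ec2cfd76906df62a82be4aac9
# To obtain the value for an archive you downloaded yourself, file(<HASH> ...)
# computes it, e.g. in script mode (run with: cmake -P this_script.cmake):
file(SHA256 "${CMAKE_CURRENT_LIST_DIR}/highwayhash.tar.gz" ARCHIVE_SHA256)  # hypothetical path
message(STATUS "sha256 = ${ARCHIVE_SHA256}")
```

So the sha256 can be produced locally once, then pasted into the build rule, whether that rule is a bazel new_http_archive or a cmake ExternalProject_Add.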
Paddle has chosen cmake as its build tool, since cmake also supports downloading and building dependencies. Closing this issue for now.
I have a different concern about bazel. Recently I built tensorflow via bazel, and it was extremely fast. I have changed my mind about bazel.
I found that many people consult me about bazel vs cmake after finding this issue. Hopefully this post can help them make a technical decision; I don't want to bias it.
TensorFlow total build time:
INFO: Elapsed time: 200.266s, Critical Path: 189.28s
@gangliao What machine did you run the tf build on? Was it also a GPU build? Paddle currently takes about 13 minutes to build on the teamcity machines.
@luotao1 Yes, I built tf with MPI + MKL + GPU + LLVM, etc...
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 40
On-line CPU(s) list: 0-39
Thread(s) per core: 2
Core(s) per socket: 10
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 79
Model name: Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
Stepping: 1
CPU MHz: 2200.056
BogoMIPS: 4405.65
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 25600K
NUMA node0 CPU(s): 0-9,20-29
NUMA node1 CPU(s): 10-19,30-39
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 381.09 Driver Version: 381.09 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 TITAN Xp Off | 0000:04:00.0 Off | N/A |
| 23% 28C P8 8W / 250W | 0MiB / 12189MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 TITAN Xp Off | 0000:06:00.0 Off | N/A |
| 23% 30C P8 8W / 250W | 0MiB / 12189MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 TITAN Xp Off | 0000:07:00.0 Off | N/A |
| 23% 27C P8 9W / 250W | 0MiB / 12189MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 3 TITAN Xp Off | 0000:08:00.0 Off | N/A |
| 23% 30C P8 9W / 250W | 0MiB / 12189MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 4 TITAN Xp Off | 0000:0C:00.0 Off | N/A |
| 23% 24C P8 8W / 250W | 0MiB / 12189MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 5 TITAN Xp Off | 0000:0D:00.0 Off | N/A |
| 23% 24C P8 8W / 250W | 0MiB / 12189MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 6 TITAN Xp Off | 0000:0E:00.0 Off | N/A |
| 23% 26C P8 9W / 250W | 0MiB / 12189MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 7 TITAN Xp Off | 0000:0F:00.0 Off | N/A |
| 23% 29C P8 9W / 250W | 0MiB / 12189MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
@gangliao @luotao1 Build speed also depends on how many physical CPU cores the machine has: @gangliao's machine has 40 CPU cores, while the teamcity machines should have 12 cores. That is roughly 3.3x the compute power, so by that arithmetic Paddle should in theory also be able to build in about 4 minutes on a 40-core machine.
In the embarrassingly parallel case, that would be true.
Regarding "Paddle should in theory build in about 4 minutes on a 40-core machine": Paddle currently cannot achieve that.
@gangliao Is there a fundamental difference in build speed between bazel and cmake? Paddle's CI has been getting slower and slower recently. If we switched entirely to bazel, assuming the dependencies between code modules stay the same, would there be a substantial speedup?
we use cmake now
@PaddlePaddle/paddle
Over the past few days I re-examined bazel and found some issues we may not have considered carefully, as well as the cost of switching to bazel.
Comparing CMake and Bazel:
CMake pros: stable and reliable, very flexible, with a syntax plain enough that you can write it without really learning it (my own case). Its biggest advantage, in my view, is that CMake is the mainstream build tool for open-source C++ projects, with a huge community; for almost any feature you want to implement, you can find plenty of projects on github to reference.
CMake cons: too flexible and too customizable; the dependency relationships between source modules are not expressed very clearly.
Bazel pros: Bazel defines many rules to simplify referencing external dependencies, such as git repository and http archive rules, and those rules express the dependencies between modules clearly.
It also supports distributed builds. For tensorflow and paddle, though, this is largely not an advantage: even a laptop has enough compute power that build speed is entirely acceptable; my Mac builds locally in about 3 minutes, and even the travis ci build takes only 7 minutes.
Bazel cons: the learning cost is high. Most targets can be written with the built-in rules cc_library, cc_binary, new_repository, and new_http_archive. But frameworks like tensorflow and paddle have to support different operating systems, different compilers, different heterogeneous hardware, and even various distributed platforms, and customization at that level is very complex; bazel currently supports it through additional crosstool and bzl files. Reading the official bazel documentation shows that bazel is by now largely driven by tensorflow (recent bazel updates mostly come from tensorflow's needs).
Even so, tensorflow has not managed to do everything with bazel alone: it still contains a large number of cmake files, and tensorflow on windows is in fact built with cmake.
Personally, I think bazel is not yet mature; custom bzl files require writing a lot of python code, and the result is not very readable.
Build code analysis
CMake
The cmake code falls into two parts:
Code under the cmake directory: automatic discovery of Paddle's dependency libraries, build-flag configuration, and so on (most of the auto-discovery code is adapted from other projects).
Code under each source module directory: building the sources into shared or static libraries, binaries, and so on.
Putting the two parts together, we ourselves wrote only about 2,000 lines to complete PaddlePaddle's entire build.
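To make that two-part split concrete, a top-level CMakeLists.txt under such a layout might look like the following sketch (all paths and module names are illustrative, not Paddle's actual layout):

```cmake
# Hypothetical top-level sketch: cmake/ holds dependency discovery and
# ExternalProject_Add calls; each source directory holds its own rules.
cmake_minimum_required(VERSION 3.0)
project(PaddlePaddle C CXX)

include(cmake/third_party.cmake)  # gflags, glog, gtest, zlib, ... via ExternalProject_Add

enable_testing()                  # must precede the add_test calls in subdirectories

add_subdirectory(paddle/math)     # per-module libraries and unit tests
add_subdirectory(paddle/utils)
```

The dependency logic is written once under cmake/, while the per-module files stay short and declarative.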
tensorflow
tensorflow first uses a configure script to export various environment variables for bazel to use, then completes the build mainly through workspace, BUILD, and bzl files. Altogether, Tensorflow has close to 30,000 lines of build-related code.
Some opinions from around the web
As you’ve noticed, there are some platforms that Bazel doesn’t currently serve. For example, my colleague Pete Warden has written Makefiles that help to cross-compile TensorFlow for iOS. Aurélien Géron submitted CMake configuration files, and I’m currently adapting these to build TensorFlow on Windows. These builds could be a starting point for platforms that Bazel doesn’t support.
In the longer term, though, we’d prefer to consolidate these into a single cross-platform build, and the Bazel team are actively adding more features (such as Windows support) that should enable this soon.