kaldi-asr / kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.
http://kaldi-asr.org

Build system #3086

Open danpovey opened 5 years ago

danpovey commented 5 years ago

For the new version of Kaldi, does anyone think we should switch to a different build system, such as cmake? We should probably still have manually-run scripts that check the dependencies; I am just wondering whether the stuff we are doing in the 'configure' script would be better done with cmake, and if so, whether anyone is interested in making a prototype to at least let it compile on Linux.

kkm000 commented 5 years ago

I'll check whether the build is faster with gold. But generally, I've never been troubled by the build speed. 10% performance translates into 1 hour of churn out of 10. The difference in build speed will certainly be less than an hour. And an extra 15 minutes of reading the manual while the whole rig compiles will save more time than poking around.

What would be a setting where build speed is a concern? Maybe I just do not understand what a common scenario is?

danpovey commented 5 years ago

I just mean for people who are new to Kaldi and may give up before it compiles.


danpovey commented 5 years ago

... but I'm primarily talking here about what default setting to put in configure (e.g. --shared), not about changing the whole build system to slightly improve build speed.


kkm000 commented 5 years ago

It is interesting that I am getting no different build times at all, static or shared. Also, both compilations always use -fPIC.

I compiled both runs with dynamic MKL, using mostly defaults (on the MKL branch, currently in a PR), not counting OpenFST (both flavors pre-built) or make depend; so the timings cover only compiling and building the libs and binaries. I cleaned up quite thoroughly between builds to avoid any contamination.

Static:

./configure --cudatk-dir=/opt/nvidia/cuda-10.0 --static --static-math=no
( time make --output-sync=target -j16 ) &> default-static.log

I got 485 executables, size 13051 MB; and real/user time ≈ 5/60 minutes.

Shared:

./configure --cudatk-dir=/opt/nvidia/cuda-10.0 --shared --static-math=no

Same make command, only a different log. Now 485 executables take up 1403 MB, and 22 DLLs 316 MB. Time is also ≈ 5/60 minutes.

So while the total size of the generated final files is significantly smaller (about 7.5 times, 13 vs 1.7 GB), the wall-clock and CPU times are pretty much the same. Maybe my setup is different from that of most people? I'm running 16 jobs on 16 physical cores; the machine is more powerful than average, but I've seen much bigger ones. And the disks in it are not even M.2; they are just plain boring SATA SSDs (I kept them during the last upgrade because ML tasks are rarely disk bound, comparing M.2 vs SATA speeds). So I would say that if it is above average for a good workstation, then not by much.
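For anyone who wants to redo the arithmetic behind the size comparison, here is a minimal sketch. The three sizes are the ones reported in this comment; the `ratio` helper is only an illustrative name, not part of any Kaldi script.

```shell
#!/bin/sh
# Sizes in MB, copied from the two runs reported above.
static_total=13051   # 485 statically linked executables
shared_bins=1403     # 485 dynamically linked executables
shared_libs=316      # 22 shared libraries

# ratio A B: print A/B rounded to one decimal place (hypothetical helper).
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

shared_total=$((shared_bins + shared_libs))
echo "shared total: ${shared_total} MB"
echo "static/shared: $(ratio "$static_total" "$shared_total")x"
```

With these numbers the shared build adds up to 1719 MB, so the static output is a bit over seven and a half times larger, in line with the rough figure quoted above.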

At the same time, I hear in this thread from at least @galv and @danpovey that the shared build is faster. (Speaking of simplicity for a first-time user, rebuilds probably do not count; the point now is ensuring that the initial build is not insanely long and is reasonably efficient.) I am trying to assign at least some numeric values to "insane" and "efficient".

Maybe I have too many cores? I could try -j8 or even -j4. But I think I'll get a less-than-linear increase in build time, from what I see in the CPU load graph.

And why do we use -fPIC in static builds, and also for binaries? We probably should not, or is there a reason?

If either of these two things does not have a ready explanation, I believe I must fork off another thread. I'd prefer to focus on the question of cmake or not, an entirely different topic. And I totally want to kill the threaded math options, too.

danpovey commented 5 years ago

OK, that's interesting. Yeah, we shouldn't be using -fPIC for the static build. Thanks for looking at the build system, it needed some attention.


kkm000 commented 5 years ago

I'll look. Removing the multithreaded math from configure is on my plate too; it must be cleaner w.r.t. build options. It does not add -fpic by itself, but there are varying templates in makefiles/. I can run on a few Linuxes under Docker; the only thing I do not have permanent access to is a Mac.

langep commented 5 years ago

@kkm000 If you want me to run a build on Mac let me know.

kkm000 commented 5 years ago

@langep, actually, we've just checked in a fix to a broken configure, so I would appreciate it if you could test it! I am worried about the syntax of the code I added, as current Macs use bash 3.2. I tested snippets in a bash 3.2 Docker container, but confirming that the updated configure is digestible on a Mac, and that it actually works, would be a big deal. Thank you!

The changeset is in #3216, now on master.

langep commented 5 years ago

@kkm000 Do you want me to run a specific configuration or just the default install?

kkm000 commented 5 years ago

@langep, the default would be fine, thanks. The main concern is that the script does not spit out any syntax errors or such. On Darwin, Accelerate is currently selected by default anyway, AFAIK.

If there are any errors, please give the output of uname -a and bash --version.
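Since the worry here is bash 3.2 on macOS, a script can also check the interpreter version up front. The sketch below is an assumption of how such a guard might look, not what configure actually does; `version_at_least` is a made-up helper, and it is written in plain POSIX sh so that it parses on old shells too.

```shell
#!/bin/sh
# version_at_least VERSION MAJOR MINOR  (hypothetical helper)
# Succeeds if VERSION, e.g. "3.2.57(1)-release", is >= MAJOR.MINOR.
version_at_least() {
    ver_major=${1%%.*}               # text before the first dot
    ver_rest=${1#*.}                 # text after the first dot
    ver_minor=${ver_rest%%[!0-9]*}   # leading digits of the remainder
    [ "$ver_major" -gt "$2" ] ||
        { [ "$ver_major" -eq "$2" ] && [ "$ver_minor" -ge "$3" ]; }
}

# BASH_VERSION is unset under other shells; fall back to 0.0 in that case.
if ! version_at_least "${BASH_VERSION:-0.0}" 4 0; then
    echo "note: shell is older than bash 4.0" >&2
fi
```

A guard like this only reports the version; it does not replace actually running the script on a Mac, which is what is being asked for here.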

langep commented 5 years ago

@kkm000 I ran the ./configure --shared setting and it compiled without errors.

Kernel version: 18.5.0
Bash version: GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin18)

kkm000 commented 5 years ago

@langep, thanks much for checking!

kkm000 commented 5 years ago

On the topic of CMake vs no CMake. I tried the CMake build of OpenBLAS. Their build warns about CMake support being experimental.

CMake Warning at CMakeLists.txt:46 (message):
  CMake support is experimental.  It does not yet support all build options
  and may not produce the same Makefiles that OpenBLAS ships with.

Fair. Also, keep in mind that I am somewhat biased against it; so far my builds of complex projects using CMake were tremendously hard to track when they ran into problems. It is also very likely that I do not know about its debugging/tracing facilities that could make pinpointing the source of the problems easier. In other words, it is not impossible that I am shifting the blame onto CMake when I should have spent more time learning the tool, so that I could use it to its full potential, not work against it.

So I figured out the build options (cmake -LA[H] prints them, in more detail with the H), and attempted a build. It did not go well. One thing I immediately stepped on is how unhelpful it is in case of a command-line error. Take this:

cmake -DCMAKE_BUILD_TYPE=RELWITHDEBINFO -DUSE_THREAD=0  -DGEMM_MULTITHREAD_THRESHOLD=0 -CMAKE_INSTALL_PREFIX=$(pwd)/install
 . . .
-- Copying LAPACKE header files to include/openblas
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.29.1")
-- Configuring incomplete, errors occurred!
See also "/home/kkm/work/kaldi2/tools/OpenBLAS/CMakeFiles/CMakeOutput.log".
$ wc -l /home/kkm/work/kaldi2/tools/OpenBLAS/CMakeFiles/CMakeOutput.log
414 /home/kkm/work/kaldi2/tools/OpenBLAS/CMakeFiles/CMakeOutput.log

Ehm.. Okay. One of these half a thousand lines is going to tell me something, I thought. Probably something closer to the end? This is how the log ends

Determining if the Fortran compiler supports Fortran 90 passed with the following output:
Change Dir: /home/kkm/work/kaldi2/tools/OpenBLAS/CMakeFiles/CMakeTmp

Run Build Command:"/usr/bin/make" "cmTC_11695/fast"
/usr/bin/make -f CMakeFiles/cmTC_11695.dir/build.make CMakeFiles/cmTC_11695.dir/build
make[1]: Entering directory '/home/kkm/work/kaldi2/tools/OpenBLAS/CMakeFiles/CMakeTmp'
Building Fortran object CMakeFiles/cmTC_11695.dir/testFortranCompilerF90.f90.o
/usr/bin/gfortran    -c /home/kkm/work/kaldi2/tools/OpenBLAS/CMakeFiles/CMakeTmp/testFortranCompilerF90.f90 -o CMakeFiles/cmTC_11695.dir/testFortranCom
Linking Fortran executable cmTC_11695
/usr/bin/cmake -E cmake_link_script CMakeFiles/cmTC_11695.dir/link.txt --verbose=1
/usr/bin/gfortran      CMakeFiles/cmTC_11695.dir/testFortranCompilerF90.f90.o  -o cmTC_11695
make[1]: Leaving directory '/home/kkm/work/kaldi2/tools/OpenBLAS/CMakeFiles/CMakeTmp'

Nothing even closely related to the last successful action ("found PkgConfig") in the screen output.

Now, I do not know if my mistake clearly stands out to someone used to the tool, but I missed the -D in the very last assignment in the command line: -CMAKE_INSTALL_PREFIX=$(pwd)/install should have been -DCMAKE_INSTALL_PREFIX=$(pwd)/install. To be fair, -C is a command-line switch, naming a cache file (default CMakeCache.txt). So I looked whether it had created a file with a funny name like MAKE_INSTALL_PREFIX=, or maybe a directory, as the argument expanded to /home/kkm/work/kaldi2/tools/OpenBLAS/install after the =. Nope. "Configuring incomplete, errors occurred!" was all I got.
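A slip like this can be caught before cmake ever runs. The following is a rough sketch of a pre-flight check, assuming (as described above) that an argument of the form -CNAME=value gets taken as the -C switch; `check_cmake_args` is a hypothetical helper, not something cmake or OpenBLAS provides.

```shell
#!/bin/sh
# check_cmake_args ARG...  (hypothetical helper)
# Report arguments that look like a cache definition with the D missing,
# e.g. -CMAKE_INSTALL_PREFIX=/x instead of -DCMAKE_INSTALL_PREFIX=/x.
# Returns non-zero if any such argument was found.
check_cmake_args() {
    status=0
    for arg in "$@"; do
        case $arg in
            -C[A-Z_]*=*)
                echo "suspicious: '$arg' (did you mean '-D${arg#-}'?)" >&2
                status=1
                ;;
        esac
    done
    return $status
}

# The command line from this comment, with the typo left in.
check_cmake_args -DCMAKE_BUILD_TYPE=RELWITHDEBINFO -DUSE_THREAD=0 \
    -DGEMM_MULTITHREAD_THRESHOLD=0 "-CMAKE_INSTALL_PREFIX=$(pwd)/install" ||
    echo "fix the command line before running cmake"
```

The pattern is deliberately narrow (an upper-case name followed by =), so a legitimate -C followed by a lower-case file name would pass through untouched.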

After fixing the command line, I got a lot of errors from g++, all of which essentially meant that it was not passed the correct value for the -march= switch. The CPU was correctly detected by the build (SkylakeX, the architecture with AVX512 support), but every use of an AVX512 builtin was yelled at by the compiler. This is where I tried to figure out how the switch gets its value.

I looked at generated Makefiles. I would say that there is no hope for a human to make sense of them. The Makefile just invokes $(MAKE) -f CMakeFiles/Makefile2 xxx for every target, except the helpfully added target named help, which helpfully prints the list of all automatically generated targets:

$ make help | wc -l
13337

But I wanted to figure out where the wrong -march switch came from. This CMakeFiles/Makefile2 file is more invocations of cmake and make, but finally I could trace it, mostly by grepping, to kernel/CMakeFiles/kernel.dir/build.make, which actually did something, namely invoking the compiler:

kernel/CMakeFiles/kernel.dir/CMakeFiles/ztrsm_iltucopy.c.o: kernel/CMakeFiles/ztrsm_iltucopy.c
        @$(CMAKE_COMMAND) -E cmake_echo_color --switch=$(COLOR) --green --progress-dir=/home/kkm/work/kaldi2/tools/OpenBLAS/CMakeFiles --progress-num=$(CMAKE_PROGRESS_414) "Building C object kernel/CMakeFiles/kernel.dir/CMakeFiles/ztrsm_iltucopy.c.o"
        cd /home/kkm/work/kaldi2/tools/OpenBLAS/kernel && /usr/bin/cc $(C_DEFINES) $(C_INCLUDES) $(C_FLAGS) -o CMakeFiles/kernel.dir/CMakeFiles/ztrsm_iltucopy.c.o   -c /home/kkm/work/kaldi2/tools/OpenBLAS/kernel/CMakeFiles/ztrsm_iltucopy.c

So, it seems that I had to find where the wrong C_FLAGS come from. It looks like many subdirectories got their own generated flags.cmake file, where this is defined, with some set of basic flags and no trace of -march. My next take was to find out whether -march is used anywhere at all. This got me to the cmake/ subdirectory, where apparently most of the build system is scripted. All in all it seems that it is just very incomplete, and is setting -march only once, for the mips64 architecture (https://github.com/xianyi/OpenBLAS/blob/develop/cmake/cc.cmake#L28). Nothing wrong with an early work in progress, it's ok, I just thought at this point how much more will have to be added to this already dense branching. Now, the OpenBLAS build is tremendously complex, no question about this. I am not speaking about the size. But compare this to the line in Makefile.system from which the above is apparently being ported (https://github.com/xianyi/OpenBLAS/blob/develop/Makefile.system#L618). Seems pretty much, well, the same; ifeq in one, STREQUAL in the other.
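The grepping described here can be wrapped in a few lines of shell. This is only a sketch: `flags_missing` is a made-up helper, and the default file pattern `flags.*` is an assumption about how the generated per-directory flags files are named.

```shell
#!/bin/sh
# flags_missing TREE FLAG [PATTERN]  (hypothetical helper)
# Print every generated flags file under TREE that does not mention FLAG.
# PATTERN defaults to 'flags.*', an assumed naming convention.
flags_missing() {
    find "$1" -type f -name "${3:-flags.*}" |
    while IFS= read -r file; do
        # -e keeps grep from treating a FLAG such as -march as an option.
        grep -q -e "$2" "$file" || printf '%s\n' "$file"
    done
}

# Example, run from the top of a configured build tree:
#   flags_missing . -march
```

An empty result would mean every flags file already carries the switch; a long list points at the directories to look into first.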

It's possible they may not be using CMake the best possible way, but this does not look like an obvious improvement to me. Also, the warning at the beginning of this file (https://github.com/xianyi/OpenBLAS/blob/develop/cmake/lapack.cmake#L1) and its entire content got me a bit worried. With make, at least, a Makefile is passive: if it exists in a directory and you do not touch it, it does not touch you. But the indication here is that the existence of CMakeLists.txt in subprojects interfered with their build process, so they had to implement this workaround. At this point I thought to myself, well, if we have CMake files in our tools/, and OpenBLAS is in a subdirectory of it, and some other things we build, too, and we also want to build it from the top of our CMake files... You see the point. Again, maybe they are just not using it right, but this is something to think about. And if our dependency list quoted in https://github.com/kaldi-asr/kaldi/issues/3086#issuecomment-477456441 is considered unwieldy, then that file (and it is not alone; there is also lapacke.cmake, https://github.com/xianyi/OpenBLAS/blob/develop/cmake/lapacke.cmake, en pendant to it) does not score too high on my wieldity scale either.

Neither did I immediately fall in love with the code in this file: https://github.com/xianyi/OpenBLAS/blob/develop/cmake/utils.cmake. No, gmake is not better for writing code either, it is even worse, and I really did appreciate the perseverance of the guy who once coded a solver for the Towers of Hanoi in Postgres SQL, but, I do not know, it just does not look like the tool was used for what it should have been used for. Maybe, again, it's just a bad example, as I picked a project which just happened to sit there; and if GNU make had a regex match operator, there certainly would be makefiles using it, for the better or, rather likely, for the worse.

As for performance, it's pretty much equal to their standard make-based build, which is rather a pleasant surprise, given the complexity, size and the number of generated makefiles and targets.

Standard build

$ make -j16
. . . .
To install the library, you can run "make PREFIX=/path/to/your/installation install".

real    0m50.506s
user    7m45.755s
sys     1m18.449s

cmake build (with -march=native injected manually)

$ cmake -DCMAKE_C_FLAGS=-march=native -DCMAKE_BUILD_TYPE=RELWITHDEBINFO -DUSE_THREAD=0 -DGEMM_MULTITHREAD_THRESHOLD=0 -DCMAKE_INSTALL_PREFIX=$(pwd)/install
$ make -j 16
 . . .
real    0m46.178s
user    7m57.707s
sys     1m35.371s

(The difference is explained by the fact that the cmake port did not run tests, original build did). So performance should not be a concern.
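To put a number on "pretty much equal", the two wall-clock figures can be converted to seconds and subtracted. A small sketch, using only the times printed above; `to_seconds` is an illustrative helper name.

```shell
#!/bin/sh
# to_seconds TIME  (hypothetical helper)
# Convert the "XmY.ZZZs" format printed by time into seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

make_real=$(to_seconds 0m50.506s)    # standard make build
cmake_real=$(to_seconds 0m46.178s)   # cmake-generated build

awk -v a="$make_real" -v b="$cmake_real" \
    'BEGIN { printf "real time difference: %.1f s\n", a - b }'
```

The gap comes out at a little over four seconds on a build of under a minute, which fits the explanation that one build ran the tests and the other did not.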

What I cautiously think I could take home from this experiment: the complex build is complex, but I already noted that, no big news here. There is certainly a learning curve, I cannot estimate how steep; the only problem I traced (missing -march) was not very easy to trace through the generated code, but in the end I could probably fix it without even RTFM by some copypasting without much understanding (we are all quite skillful at that). There is nothing impressive about CMake's syntax; it certainly does not look like a programming language, but (since it is also supposed to replace the configure script) it seems to think that it is. Nothing really wrong with the tool itself, except maybe that cryptic flop with the command line, and the worrisome comments that they had to use a hack to avoid cmake fighting itself. I certainly would not find writing a replacement for the configure script in CMake's language aesthetically pleasant, though; bash seems kind of more natural for scripting (and 75% of our configure is just a fight with ATLAS to make Kaldi compile with it, anyway). So, I dunno, meh?

danpovey commented 5 years ago

You are right about CMake being hard to debug, I have had the same experience. I'm still not really committed to the CMake path, but keeping a somewhat open mind in case @galv generates something nice looking.


galv commented 5 years ago

To be honest, my impression is that there is no need to use OpenBLAS's cmake build. My impression is that it was added because someone needed to build it on Windows. We already have a script in the tools/ directory.

I don't mean to be dismissive of your long comment, @kkm000, but why does using cmake in kaldi require us to use OpenBLAS's cmake build? This is what I'm doing now: https://github.com/kaldi-asr/kaldi/pull/3100/files#diff-af3b638bc2a3e6c650974192a53c7291R19 It depends on OpenBLAS having already been built.

By the way, you may be interested in trying to build OpenBLAS with ninja, if you'd like to make your experiment more complete. In CMake, you'd do this:

cmake -G Ninja -DCMAKE_C_FLAGS=-march=native -DCMAKE_BUILD_TYPE=RELWITHDEBINFO -DUSE_THREAD=0 -DGEMM_MULTITHREAD_THRESHOLD=0 -DCMAKE_INSTALL_PREFIX=$(pwd)/install
cmake --build .

I am not finishing work on #3100 until I finish migrating hmm-utils.cc to the new non-trainable definition of Topology and Transitions. It's the last non-compiling part of kaldi10 (right now, anyway!), excluding the tensor/ directory. It's just too much hassle to try to migrate the build system when there are non-compiling artifacts and tests are failing. I'm sorry...

kkm000 commented 5 years ago

@galv:

To be honest, my impression is that there is no need to use OpenBLAS's cmake build.

Yes, I noted that. I was just working on a build of OpenBLAS (someone's build was broken on a platform where MKL was unavailable), noticed a CMake file in that directory and decided to give it a go. So I did not choose a project to play with. A 0-dimensional sample from an unknown distribution. :)

why does using cmake in kaldi require us to use OpenBLAS's cmake build?

I certainly did not think so, or intended to say that. I was probably not very careful explaining what I did. The extras/Makefile does build OpenBLAS, but it does not have to (and does not) do it using CMake. My comment was only about the OpenBLAS own dealing with a subirectory containing CMakeLists.txt that apparently stood in their way. I do not understand CMake enough to say if it was a real problem or they just did not know how to canonically solve it. I just noted that if/when we use CMake, we'll have extras/CMakeLists and extras/OpenBLAS/CmakeList.txt, the apparently same situation they had to deal with. That what I just wanted someone who really knows the stuff, likely you, to pay attention to. If that's not a problem at all, or their situation is different from ours, great. Think of me a phenologist, I only observe the butterflies and record my field notes, but dissecting them I gladly leave to you! :)) I certainly trust your CMake experience.

I'll try Ninja, too. Again, maybe OpenBLAS is not the kind of project where it would shine, as it's too small and rebuilds fully from make clean in 45 seconds. No, I mean, if it builds with Ninja in, say, five seconds instead of 45, I would be really super impressed, but I do not think it really would :))

There is absolutely no rush, please. Keep in mind that I'll be cleaning up configure, and it seems there are a lot of simplifications coming. So if you start from it, do not treat it as a gold standard.

By the way, what is the difference between cmake .; cmake --build . and cmake .; make (except that the former will probably take into account the -G from the metabuild stage and invoke the matching build command)? This switch is not well documented in the man page.

galv commented 5 years ago

cmake --build will build in the current build directory, regardless of which build system you are generating files for. It is good for scripting, since you don't even need to have your build system executable (xcodebuild, ninja, make, etc.) on your PATH, and you don't need to worry in your scripts about which build system you configured with when it comes time to build.
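
The scripting point can be sketched as a small shell function: the configure step names a generator, and the build step is the same whatever that generator was. This is only an illustrative sketch, not part of any Kaldi script; the function name `configure_and_build` and its arguments are hypothetical, and the `-S`/`-B` options assume CMake 3.13 or newer.

```shell
#!/bin/sh
# Sketch: configure with any generator, then build without naming the tool.
# configure_and_build is a hypothetical helper, not an existing script.

configure_and_build() {
    generator="$1"    # e.g. "Ninja" or "Unix Makefiles"
    source_dir="$2"   # directory containing the top-level CMakeLists.txt
    build_dir="$3"    # directory that receives the generated build files

    # Configure step: the generator choice is recorded in the build directory.
    cmake -G "$generator" -S "$source_dir" -B "$build_dir" || return 1

    # Build step: identical for every generator; cmake runs the native tool.
    cmake --build "$build_dir"
}
```

For example, `configure_and_build Ninja . build` and `configure_and_build "Unix Makefiles" . build-make` differ only in the configure step.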

kkm000 commented 5 years ago

I see, thanks. I usually also pass -j16 to make and a couple of other switches, but I can just invoke make, no big deal. cmake --build . invoked make in sequential mode for me. Maybe I need more -D... magic on the command line for that, or some .cmakerc, or however it is configured with user preferences (I really must RTFM, but do not have time now).

I guess CMake should also support something like that out of the box, i.e. run the build system in the most sensible parallel mode? I'm using -j16 for Kaldi as I have 16 physical cores (and 16 logical cores, HT off).

Just make -j, which I suppose stands for unlimited parallelism, overwhelms the system. Memory is strained but not exhausted; it's mostly the context switching that makes it slower, I think, as I compile with -O2 and that's CPU-intensive. I do not have a swapfile, so this cannot possibly be thrashing, just the processes fighting to be scheduled.
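
A bounded job count can be derived from the machine rather than typed by hand. The following is a minimal sketch under the assumption that one job per processing unit is a sensible default; the function names `pick_jobs` and `run_make` are hypothetical.

```shell
#!/bin/sh
# Sketch: pick a bounded job count instead of an unlimited `make -j`.
# pick_jobs and run_make are hypothetical helper names.

pick_jobs() {
    # nproc (GNU coreutils) prints the number of available processing units.
    # These are logical CPUs, so with hyper-threading on the value is larger
    # than the physical core count. Fall back to getconf, then to 1.
    n=$(nproc 2>/dev/null) || n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=1
    echo "$n"
}

run_make() {
    # One job per processing unit; extra arguments are passed through to make.
    make -j"$(pick_jobs)" "$@"
}
```

GNU make also accepts `-l <load>` (`--load-average`), which holds back new jobs while the system load average is above the given value; that may help when even a bounded `-j` oversubscribes the machine.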

cloudhan commented 5 years ago

I usually also pass -j16 to make and a couple of other switches, but I can just invoke make, no big deal. cmake --build . invoked make in sequential mode for me. Maybe I need more -D... magic in the command line for that,

With cmake --build:

cmake --build . --target install -- -j16  # for Makefile build system
# or
cmake --build . --target install -- /m:16 # for MSBuild

The complete build and install commands should be:

cmake .. -G <...> -DCMAKE_INSTALL_PREFIX=... -D... # configure
cmake --build . --target install

Those should be all the commands users need to invoke for a well-organized CMake project.
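
As a side note to the commands above: since CMake 3.12, `cmake --build` accepts `--parallel <jobs>` (short form `-j <jobs>`) itself, so the job count does not have to be passed to the native tool after `--`. A minimal sketch, in which the function name and its arguments are hypothetical:

```shell
#!/bin/sh
# Sketch: parallel build and install without generator-specific flags.
# build_and_install is a hypothetical helper; requires CMake 3.12 or newer.

build_and_install() {
    build_dir="$1"   # directory that was configured earlier
    jobs="$2"        # maximum number of parallel jobs

    # cmake passes the job count on to whichever native build tool is in use.
    cmake --build "$build_dir" --target install --parallel "$jobs"
}
```

`build_and_install . 16` would then replace the `-- -j16` and `-- /m:16` variants. Setting the `CMAKE_BUILD_PARALLEL_LEVEL` environment variable (also CMake 3.12 or newer) provides a default job count when `--parallel` is not given.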

kkm000 commented 5 years ago

@cloudhan:

cmake --build . --target install -- -j16 # for Makefile build system

Thanks. I assume make -j16 would still save me a few keystrokes tho :)

galv commented 5 years ago

@cloudhan thanks

langep commented 5 years ago

@galv what is the current state of cmake support for kaldi? is this still being worked on?

galv commented 5 years ago

I'm not actively working on it. I stopped because, at the time, kaldi10's hmm and tree subprojects were not compiling. I tried to get those working, but it was rather involved, and I didn't complete it. However, kaldi10 is now compiling, so I could certainly pick this back up, but it's not a great priority for me until I have a free moment away from my regular job (which could be this weekend, but who knows?)

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

davidlin409 commented 3 years ago

The existing problem with the CMake build is that it does not run successfully with shared libraries. After some investigation, I found some bugs in the CMake build script that need to be fixed. With those fixes, I can now successfully install Kaldi using the CMake project, and THCHS-30 training/decoding/alignment work perfectly.

Also, for OpenFST, a Python script is added to install OpenFST to a designated location.

If needed, I could submit the CMake fix as a patch here, if the changes are adequate. My changes are at

https://github.com/davidlin409/kaldi

The starting point of the changes is at the label "status/start_point".
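
For reference, shared-library builds in CMake are conventionally requested through the standard `BUILD_SHARED_LIBS` variable, which makes `add_library()` calls without an explicit type produce shared libraries. The sketch below assumes the Kaldi CMake project honors that variable, which is not confirmed in this thread; the function name, directories, and install prefix are hypothetical, and `-S`/`-B` need CMake 3.13 or newer.

```shell
#!/bin/sh
# Sketch: configure a shared-library build with an explicit install prefix.
# configure_shared is a hypothetical helper; it assumes the project respects
# the standard BUILD_SHARED_LIBS variable.

configure_shared() {
    source_dir="$1"   # directory containing the top-level CMakeLists.txt
    build_dir="$2"    # out-of-source build directory
    prefix="$3"       # where `--target install` will place the files

    cmake -S "$source_dir" -B "$build_dir" \
        -DBUILD_SHARED_LIBS=ON \
        -DCMAKE_INSTALL_PREFIX="$prefix"
}
```

After configuring, `cmake --build <build_dir> --target install` would build and install into the chosen prefix.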

danpovey commented 3 years ago

It would be great if you could make a PR with those changes so we can more easily see the diff.
