amkozlov / raxml-ng

RAxML Next Generation: faster, easier-to-use and more flexible
GNU Affero General Public License v3.0

building in parallel fails for large number of cores #108

Closed boegel closed 3 years ago

boegel commented 3 years ago

I ran into a build failure that took me a while to figure out:

$ make -j 96
make[2]: Leaving directory '/tmp//RAxMLNG/0.9.0/GCC-8.3.0/easybuild_obj'
make[1]: *** [CMakeFiles/Makefile2:822: test/src/CMakeFiles/raxml_test_module.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
make[2]: Leaving directory '/tmp//RAxMLNG/0.9.0/GCC-8.3.0/easybuild_obj'
make[1]: *** [CMakeFiles/Makefile2:771: src/CMakeFiles/raxml_module.dir/all] Error 2
make[1]: Leaving directory '/tmp//RAxMLNG/0.9.0/GCC-8.3.0/easybuild_obj'
make: *** [Makefile:144: all] Error 2

This was on our dual-socket 48-core AMD Epyc systems (our installation tool automatically uses all cores it sees).

At first sight, there was no actual error higher up, until I noticed this (which is easy to overlook when looking for the usual Error 1 or error: message):

make[2]: *** No rule to make target 'localdeps/lib/libterraces.a', needed by '/tmp/RAxMLNG/0.9.0/GCC-8.3.0/raxml-ng-0.9.0/bin/raxml-ng'.  Stop.
make[2]: *** Waiting for unfinished jobs....
make[2]: *** No rule to make target 'localdeps/lib/libterraces.a', needed by '/tmp/RAxMLNG/0.9.0/GCC-8.3.0/raxml-ng-0.9.0/test/bin/raxml_test'.  Stop.
make[2]: *** Waiting for unfinished jobs....

Looks like a missing dependency declaration in the Makefile?

Running with fewer cores, like make -j 10, works fine.
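
Since the "No rule to make target" lines are easy to miss among the "Error 2" noise, here is a minimal sketch of a log scanner that pulls them out of captured make output. The function name `find_missing_targets` and the `SAMPLE_LOG` constant are hypothetical helpers for illustration only; they are not part of raxml-ng or EasyBuild, and the regular expression assumes the GNU make message format shown in the output above.

```python
import re

# Matches GNU make's "No rule to make target" message, with the optional
# "needed by" clause (format assumed from the log lines quoted in this issue).
NO_RULE = re.compile(
    r"\*\*\* No rule to make target '([^']+)'"
    r"(?:, needed by '([^']+)')?\.\s+Stop\."
)

# Two lines taken from the build output reported in this thread.
SAMPLE_LOG = """\
make[2]: *** No rule to make target 'localdeps/lib/libterraces.a', needed by '/tmp/raxml-ng-1.0.1/test/bin/raxml_test'.  Stop.
make[2]: *** Waiting for unfinished jobs....
"""


def find_missing_targets(log_text):
    """Return unique (missing_target, needed_by) pairs in order of appearance.

    needed_by is None when make did not print a "needed by" clause.
    """
    found = []
    for line in log_text.splitlines():
        match = NO_RULE.search(line)
        if match:
            pair = (match.group(1), match.group(2))
            if pair not in found:
                found.append(pair)
    return found


if __name__ == "__main__":
    for target, needed_by in find_missing_targets(SAMPLE_LOG):
        print(f"missing: {target} (needed by: {needed_by})")
```

Running this over the full build log would list each missing prerequisite once, which makes it quicker to spot an undeclared dependency than searching for "Error 1" or "error:".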

amkozlov commented 3 years ago

This should be fixed already: https://github.com/amkozlov/raxml-ng/issues/77

Could you please try to reproduce with the latest version?

amkozlov commented 3 years ago

@boegel closing for now, please feel free to reopen if this is still failing in 1.0.1.

boegel commented 3 years ago

The build using make -j 96 still fails for me, but only for test/bin/raxml_test (no longer for bin/raxml-ng):

make[2]: *** No rule to make target 'localdeps/lib/libterraces.a', needed by '/tmp/raxml-ng-1.0.1/test/bin/raxml_test'.  Stop.

boegel commented 3 years ago

@amkozlov I can't reopen this issue myself, but the problem persists...

amkozlov commented 3 years ago

@boegel oh sorry I forgot to update CMakeLists for the tests, this should be fixed now. Thanks for reporting!