Compiling HOOMD-blue 2.1.6 from source

joaander commented 7 years ago

Original report by Adam J Stewart (Bitbucket: adamjstewart, GitHub: adamjstewart).

I'm trying to compile the latest HOOMD-blue release but it crashes with the following error message:

-- Updating git submodules
fatal: Not a git repository (or any of the parent directories): .git
CMake Error at CMakeLists.txt:68 (message):
  Libgetar was not found in hoomd/extern/libgetar.  Please pull the libgetar
  source, i.e.  via `git submodule update`.

It looks like the latest tarball does not come with a .git directory. I've tried installing libgetar externally with both:

$ python setup.py install

and

$ cmake .
$ make
$ make install

and setting libgetar_DIR to the installation directory, but that doesn't seem to work. Any suggestions?

joaander commented 7 years ago

Original comment by Matthew Spellings (Bitbucket: mspells, GitHub: klarh).

Since we added several external packages as git submodules, downloading source tarballs doesn't give you all of the paths in the right places. If you check out the source code from the git repository as in http://hoomd-blue.readthedocs.io/en/stable/compiling.html#compile-hoomd-blue , that should fix this problem for you.

joaander commented 7 years ago

To add to what Matthew said: the documentation explicitly mentions that the source must be checked out with git (http://hoomd-blue.readthedocs.io/en/stable/compiling.html#compile-hoomd-blue). The same goes for the web site (http://glotzerlab.engin.umich.edu/hoomd-blue/download.html).

There are no links to the tarball downloads anywhere except those automatically generated by bitbucket, and I have no control to remove those. Bitbucket chooses not to put git submodules in their tarballs.

joaander commented 7 years ago

Original comment by Adam J Stewart (Bitbucket: adamjstewart, GitHub: adamjstewart).

Just to give some more background, I'm trying to add a newer version of HOOMD-blue to our package manager: https://github.com/LLNL/spack/blob/develop/var/spack/repos/builtin/packages/hoomd-blue/package.py

I was hoping to install something a little more stable than "whatever happened to be committed yesterday", so I looked for the latest release. I guess I could check out the specific commit that corresponds to the latest release, but that raises the question: why do you have releases at all if you know they don't work? I understand the desire to vendor your dependencies as git submodules, but I'm surprised that you can't use externally installed versions instead. Does the conda package also install a random commit off of the master branch, or does it use the stable release tarballs?

joaander commented 7 years ago

Original comment by Mike Henry (Bitbucket: mikemhenry, GitHub: mikemhenry).

I think you are looking for the tagged releases (https://confluence.atlassian.com/bitbucket/use-repository-tags-321860179.html). The conda releases correspond to the same tag.

The best way to compile hoomd is to clone the repo, then check out the commit for the tag you want to build. Something like:

$ git checkout tags/v2.1.6

If git is not available on the cluster, clone the repo on a computer that can use git (make sure to get the submodules with the --recursive option), check out the release you want, then create the tarball.

If spack supports using git, I would recommend using git. The tarballs listed on bitbucket are not something maintained by the hoomd developers.
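A minimal sketch of the workflow described above (clone with submodules, check out the tag, pack a tarball), written as a shell function. The function name, its arguments, and the repository URL in the usage comment are assumptions for illustration, not something the hoomd developers ship.

```shell
# Sketch: build a self-contained source tarball for a tagged release.
# Arguments (hypothetical names chosen for this example):
#   $1  repository URL or path to clone from
#   $2  release tag to check out, e.g. v2.1.6
#   $3  directory name for the checkout (also the tarball's base name)
make_release_tarball() {
    repo_url=$1
    tag=$2
    name=$3

    # --recursive also fetches the git submodules (libgetar, cereal, ...).
    git clone --recursive "$repo_url" "$name" || return 1

    # Check out the tagged release, then sync the submodules to the
    # commits recorded by that tag (they may differ from the default branch).
    (
        cd "$name" &&
        git checkout "tags/$tag" &&
        git submodule update --init --recursive
    ) || return 1

    # Pack the whole tree so it can be copied to a machine without git.
    tar -czf "$name.tar.gz" "$name"
}

# Example (the URL is an assumption; use the one given in the HOOMD docs):
# make_release_tarball https://github.com/glotzerlab/hoomd-blue v2.1.6 hoomd-blue-2.1.6
```

The second `git submodule update` matters because the clone initially populates submodules for the default branch, which may pin different commits than the tag does.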

joaander commented 7 years ago

I never recommend that general hoomd users build random commits of master (that doesn't stop people from doing it, though). This is why I have maint set as the default branch - maint only includes bugfixes from the last feature release.

I tag releases so that: 1) there is a clear change log from release to release; 2) there are well-defined releases that the community can trust as well tested; 3) releases are numbered with semantic versioning, so it is clear which releases fix bugs and which introduce new features; 4) the community can report bugs found in a specific release.

I never knowingly tag a release that does not work. The fact that bitbucket offers download links to tags that do not work when there are submodules present is completely out of my control. If I could disable those download links, I would - there is a steady stream of users that download those files and attempt to use them. There should be a single source of truth to get hoomd, the git repository itself - maintaining multiple sources of truth is always too time consuming.

The conda recipes that I maintain always build tagged releases. The specific tag to build is encoded in the recipe's meta.yaml following standard community practices. Gentoo, pacman, and other packaging tools I am familiar with also follow this convention. I am not familiar with Spack, so I cannot comment on that tool specifically or on what the standard community practices are for packages in it.

As far as I know, the vast majority of hoomd users fall into one of two categories: They either install and use the conda packages, or they build from source by hand (or use helper scripts to automate the build). Our current use of submodules is convenient for these use-cases. conda and pacman package builders both handle submodules fine. Those few users I know of outside this realm (i.e. the gentoo package builder) have all handled the situation by their own methods. For example, the gentoo package explicitly downloads and unpacks libgetar using multiple sources:

    inherit vcs-snapshot
    GETTAR_VER=0.5.0
    SRC_URI="https://bitbucket.org/glotzer/${PN}/get/v${PV}.tar.bz2 -> ${P}.tar.bz2
        https://bitbucket.org/glotzer/libgetar/get/v${GETTAR_VER}.tar.bz2 -> libgetar-${GETTAR_VER}.tar.bz2"

    ...

    mv ../libgetar-${GETTAR_VER}/* hoomd/extern/libgetar || die
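Outside of Gentoo, the same idea can be sketched as a plain shell function that copies a separately unpacked libgetar tree into the location the error message names (hoomd/extern/libgetar). The function and argument names are hypothetical, and this assumes that populating that directory is all the CMake check needs, which the thread does not confirm.

```shell
# Sketch: place an already-unpacked libgetar source tree where HOOMD's
# CMake scripts look for it.
#   $1  root of the unpacked HOOMD source (hypothetical argument name)
#   $2  root of the unpacked libgetar source
vendor_libgetar() {
    hoomd_src=$1
    libgetar_src=$2
    dest="$hoomd_src/hoomd/extern/libgetar"

    # The directory exists but is empty in tarballs without submodules;
    # create it anyway in case it is missing.
    mkdir -p "$dest" || return 1

    # Copy the contents (including dotfiles) rather than the directory itself.
    cp -R "$libgetar_src/." "$dest/"
}

# Example, mirroring the ebuild's layout:
# vendor_libgetar ./hoomd-blue-2.1.6 ../libgetar-0.5.0
```

The libgetar version must match what the HOOMD release expects; the ebuild above pins 0.5.0 for that reason.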

If you desire a different use-case for finding the submodule provided dependencies, then that is a feature request and not a bug. cereal, cub, nano-signal-slot, pybind, and upp11 are all header-file only libraries. Only a handful of lines of cmake code are needed in hoomd to allow them to be found as external dependencies (and upp11 is only needed for unit tests). libgetar might take a bit more effort.

I am in favor of improving HOOMD's cmake scripts so that these dependencies may be made external. However, I have no time to implement it myself in the foreseeable future (HOOMD is a community code, and I accept pull requests). The natural thing to do is to add a USE_EMBEDDED_LIBS option that defaults to ON (the current behavior). When it is OFF, execute the necessary find* commands to locate the different header-only libs and add their paths to the include_directories list. Similarly, the way libgetar is linked into hoomd.so can change based on the value of USE_EMBEDDED_LIBS.

Opening up methods for users to provide their own dependencies means we will also need to add validation checks to ensure that the versions provided are suitable (e.g., pybind11 2.x is not yet supported - and when it is, 1.x will no longer be supported due to API differences). Submodules are tagged to specific commits and currently provide this validation implicitly.

joaander commented 7 years ago

Original comment by Matthew Spellings (Bitbucket: mspells, GitHub: klarh).

To corroborate what Mike said, in our internal package management scripts we also just check out particular tagged versions using git.

Something that would probably not be difficult to do is to add a script that sets up the git submodules, so that tarball releases would just need one extra command to run that script before the rest of hoomd setup. This doesn't solve any problems for systems that don't have git, though, and would probably be easy to forget to update when submodule versions are updated.

joaander commented 7 years ago

There are also scripts that can export with submodules (http://stackoverflow.com/questions/14783127/git-archive-export-with-submodules-git-archive-all-recursive). However, using them adds extra steps to the release cycle, does nothing to fix the tarballs that bitbucket provides, and adds yet another source of truth - so I am not particularly inclined to go that route.

joaander commented 7 years ago

Original comment by Adam J Stewart (Bitbucket: adamjstewart, GitHub: adamjstewart).

Thanks everyone! I think I'll go with cloning the repository and checking out the latest tagged release. Spack supports that use case fairly well: http://spack.readthedocs.io/en/latest/packaging_guide.html#git

At this point I've run into a new problem. I'm building my own CUDA, and unfortunately NVIDIA doesn't care about its users enough to support any compiler newer than GCC 4.4.7 on CentOS 6: http://docs.nvidia.com/cuda/cuda-installation-guide-linux/#system-requirements

The problem is that HOOMD-blue doesn't seem to build with GCC 4.4.7:

cc1plus: error: unrecognized command line option "-std=c++11"

This is really NVIDIA's fault, not your own, but it does make for an interesting problem for a package manager like Spack. Any suggestions as to how I can work around this for now?

joaander commented 7 years ago

Original comment by Adam J Stewart (Bitbucket: adamjstewart, GitHub: adamjstewart).

It looks like GCC added experimental support for C++11 features in GCC 4.3+ with the -std=c++0x flag, and stable support in GCC 4.7+ with the -std=c++11 flag: https://gcc.gnu.org/projects/cxx-status.html

Perhaps your CMake build scripts could be made to detect the GCC version and use -std=c++0x or -std=c++11 accordingly?
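The version check could look roughly like the following, written here in shell rather than CMake so it can be tried outside the build system. The function name is hypothetical, and the thresholds are the ones quoted above (4.3+ for -std=c++0x, 4.7+ for -std=c++11). Whether HOOMD actually compiles against the partial C++11 support in GCC 4.4 is not established here.

```shell
# Sketch: choose a C++11 flag from a GCC version string such as "4.4.7".
cxx11_flag_for_gcc() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # second version component

    if [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 7 ]; }; then
        echo "-std=c++11"
    elif [ "$major" -eq 4 ] && [ "$minor" -ge 3 ]; then
        echo "-std=c++0x"
    else
        echo "unsupported GCC version: $1" >&2
        return 1
    fi
}

# Usage: cxx11_flag_for_gcc "$(gcc -dumpversion)"
```

With the versions mentioned in this thread, 4.4.7 maps to -std=c++0x, while 4.8.5 and 6.1.0 map to -std=c++11.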

joaander commented 7 years ago

What errors are you running into building hoomd on CentOS 6 with a newer gcc and CUDA? I use this configuration to build the conda packages - a CentOS 6 VM, gcc 4.8.5 (provided by conda package prereq), and CUDA 7.5. I didn't need to do anything special to get this to work. I haven't tested anything newer than gcc 4.9, though.

nvcc used to complain if you used a newer gcc in line with those requirements you link to, but I haven't observed it recently. At one point, I was overriding and using the system gcc for nvcc, but the newer gcc for building CPU code. That doesn't work now because of the C++11 requirement.

joaander commented 7 years ago

There is only one line (75 in HOOMDCFlagsSetup.cmake) where c++11 is set. It should be possible to put that in an if block conditional on the gcc version.

joaander commented 7 years ago

Original comment by Adam J Stewart (Bitbucket: adamjstewart, GitHub: adamjstewart).

Hmm, let me try building a newer version of GCC 4 and see if CUDA still complains. Thanks for the suggestion!

joaander commented 7 years ago

Original comment by Adam J Stewart (Bitbucket: adamjstewart, GitHub: adamjstewart).

Alright, CUDA seems to be happy now. Although it might be secretly using my system compilers anyway.

Now I'm trying to install HOOMD-blue with GCC 6.1.0, but it's crashing at 95% of the way through:

/blues/gpfs/home/ajstewart/spack/var/spack/stage/hoomd-blue-2.1.6-ueka4rynb5yfabpcaesqw4jmq62ycxzq/hoomd-blue/hoomd/hpmc/test/test_simple_polygon.cc:473:14: error: exponent has no digits
     r_ij.x = 0x1.27978ff361599p+0;
              ^~~~~~~~~~~~~~~~~~

Any idea what might be causing this error message? This seems to be the only file with problems.

joaander commented 7 years ago

Original comment by Adam J Stewart (Bitbucket: adamjstewart, GitHub: adamjstewart).

Reproduced the same problem in maint.

joaander commented 7 years ago

This format of floats is apparently technically not supported in c++11 (https://bugzilla.redhat.com/show_bug.cgi?id=1321986) - and gcc 6 is just being more strict than earlier versions.

Since this is in a unit test and you are attempting to build an installed package, you can build with -DBUILD_TESTING=OFF to prevent the tests from building.
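As a rough sketch, an out-of-source build with the tests disabled could be wrapped like this. The function and argument names are hypothetical; only the -DBUILD_TESTING=OFF option comes from the suggestion above.

```shell
# Sketch: configure and build HOOMD without compiling the unit tests.
#   $1  absolute path to the HOOMD source tree (hypothetical argument name)
#   $2  build directory to create
build_without_tests() {
    src=$1
    build=$2

    mkdir -p "$build" || return 1
    (
        cd "$build" &&
        # BUILD_TESTING=OFF skips the test sources, including
        # hoomd/hpmc/test/test_simple_polygon.cc, which fails under GCC 6.
        cmake "$src" -DBUILD_TESTING=OFF &&
        make
    )
}

# Example:
# build_without_tests "$PWD/hoomd-blue" "$PWD/hoomd-blue/build"
```

This avoids the compile error but also means make test has nothing to run.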

joaander commented 7 years ago

Original comment by Adam J Stewart (Bitbucket: adamjstewart, GitHub: adamjstewart).

Ah, I see. I was hoping to get make test to pass though. If you can provide me with a patch that converts the hexadecimal floats to a valid format, I can add that to the Spack package. Otherwise, maybe you should use -std=gnu++11 instead of -std=c++11.

joaander commented 7 years ago

Thanks for the suggestion, but gnu++11 is not an option. We also support clang and potentially intel compilers as well.

I opened issue #239 to fix the float specification. This is a non-trivial update, not something where I can just "provide you a patch". To officially support compilation with gcc 6.x, I need to build a local environment with gcc 6, add it to the nightly unit test rig, and make the necessary changes to the source code to get it to compile. I'm not aware of any mainstream linux distros running gcc 6 by default, nor am I aware of any supercomputing centers that have gcc 6 as a default gcc compiler. So the number of users impacted by this bug is very small.

joaander commented 7 years ago

Original comment by Adam J Stewart (Bitbucket: adamjstewart, GitHub: adamjstewart).

> I'm not aware of any mainstream linux distros running gcc 6 by default

Fedora 25 comes with GCC 6.3.1.

> nor am I aware of any supercomputing centers that have gcc 6 as a default gcc compiler

Not the default, just trying to use the latest and greatest. I'll try building everything with GCC 5.3.0 and see if that works. Hopefully a future release will have full GCC 6 support.

glotzerlab / hoomd-blue

Compiling HOOMD-blue 2.1.6 from source #238