Rdatatable / data.table

R's data.table package extends data.frame:
http://r-datatable.com
Mozilla Public License 2.0

Using gcc on macOS as the default compiler? #4016

Closed pat-s closed 3 years ago

pat-s commented 5 years ago

After reading and trying the compiler instructions for macOS, I was wondering if gcc from Homebrew would be a better alternative to llvm.

As stated in the wiki, an OpenMP-enabled compiler causes issues with other packages. It is quite cumbersome to comment out the llvm compiler settings every time one wants to compile other packages.

Currently I am using the following setup with gcc installed via brew install gcc. No issues so far for any packages.

CXX_STD = CXX14

CC=ccache /usr/local/bin/gcc-9
CC11=ccache /usr/local/bin/gcc-9
CC14=ccache /usr/local/bin/gcc-9
CXX=ccache /usr/local/bin/g++-9
CXX11=ccache /usr/local/bin/g++-9
CXX14=ccache /usr/local/bin/g++-9
## -O3 should be faster than -O2 (default) level optimisation ..
CFLAGS=-g -O3 -Wall -pedantic -std=gnu99 -mtune=native -pipe
CXXFLAGS=-g -O3 -Wall -pedantic -std=c++11 -mtune=native -pipe

In addition, I am using ccache as you can see. Combined with the ~/.ccache/ccache.conf settings below, I have a robust C compiler setup which supports caching.

max_size = 5.0G
# important for R CMD INSTALL *.tar.gz as tarballs are expanded freshly -> fresh ctime
sloppiness = include_file_ctime
# also important as the (temp.) directory name will differ
hash_dir = false

(The ccache part is taken from this post by Dirk.)
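
The toggling problem mentioned above (commenting compiler lines in and out of ~/.R/Makevars) can possibly be avoided by keeping the gcc settings in a separate file and pointing R at it for a single install through the R_MAKEVARS_USER environment variable, which R reads in place of ~/.R/Makevars. The sketch below is an illustration under assumptions, not something proposed in this thread: the file name Makevars.gcc and the temporary directory are hypothetical, and the compiler paths are copied from the setup above. It only prints the install command rather than running it.

```shell
#!/bin/sh
# Write a minimal gcc Makevars to the path given as $1.
# Paths assume Homebrew's gcc-9 under /usr/local/bin, as in the setup above.
write_gcc_makevars() {
  cat > "$1" <<'EOF'
CC=/usr/local/bin/gcc-9
CXX=/usr/local/bin/g++-9
CXX11=/usr/local/bin/g++-9
CXX14=/usr/local/bin/g++-9
EOF
}

# Hypothetical location; a real setup might use ~/.R/Makevars.gcc instead.
demo_dir=$(mktemp -d)
write_gcc_makevars "$demo_dir/Makevars.gcc"

# Build the one-off install command. It is printed, not executed,
# so the sketch has no side effects beyond the temporary file.
install_cmd="R_MAKEVARS_USER=$demo_dir/Makevars.gcc R CMD INSTALL data.table_*.tar.gz"
echo "$install_cmd"
```

With this layout ~/.R/Makevars can stay untouched for all other packages, and the gcc settings apply only to the command that sets the variable.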

renkun-ken commented 5 years ago

I'm using the approach suggested at https://github.com/rmacoslib/r-macos-rtools. I use the clang7 provided at https://cloud.r-project.org/bin/macosx/tools and only have to specify the following in ~/.R/Makevars:

CC=/usr/local/clang7/bin/clang
CXX=/usr/local/clang7/bin/clang++
CXX1X=/usr/local/clang7/bin/clang++
CXX11=/usr/local/clang7/bin/clang++
CXX14=/usr/local/clang7/bin/clang++
CXX17=/usr/local/clang7/bin/clang++
LDFLAGS=-L/usr/local/clang7/lib

And everything works smoothly and consistently.

I'm not sure why the macOS instructions look so complicated. They can easily give beginners on macOS a hard time even installing the latest data.table from source.
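
Since a Makevars like the ones in this thread hard-codes compiler paths, a quick sanity check before installing from source may save beginners some confusion. The following is a minimal sketch, not part of any official instructions: check_makevars is a hypothetical helper, it only looks at CC, CXX and their numbered variants, and the demo file uses /bin/sh as a stand-in for a real compiler so that it runs anywhere.

```shell
#!/bin/sh
# check_makevars FILE: report whether each compiler named in FILE can be found.
# Only CC, CXX and variants such as CC11 or CXX1X are inspected;
# a leading "ccache" wrapper in the value is skipped.
# Returns 0 if every compiler was found, 1 otherwise.
check_makevars() {
  file=$1
  missing=0
  while IFS='=' read -r name value; do
    name=$(printf '%s' "$name" | tr -d ' ')
    case "$name" in
      CC|CC[0-9]*|CXX|CXX[0-9]*) ;;
      *) continue ;;
    esac
    # Split the value into words; the compiler is the first word after "ccache".
    set -- $value
    if [ "$1" = "ccache" ]; then shift; fi
    compiler=$1
    if [ -n "$compiler" ] && command -v "$compiler" >/dev/null 2>&1; then
      echo "ok $name -> $compiler"
    else
      echo "MISSING $name -> $compiler"
      missing=$((missing + 1))
    fi
  done < "$file"
  [ "$missing" -eq 0 ]
}

# Demo with a throwaway file: /bin/sh stands in for a compiler that exists.
demo=$(mktemp)
cat > "$demo" <<'EOF'
CC=ccache /bin/sh
CXX=/no/such/compiler
CXXFLAGS=-O2
EOF

if check_makevars "$demo"; then
  echo "all compilers found"
else
  echo "some compilers are missing"
fi
```

For a real setup the function would be called as check_makevars ~/.R/Makevars; it only tests that the paths resolve to something executable, not that the compiler supports OpenMP.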

pat-s commented 4 years ago

For me, the gcc approach is the fastest and most stable one so far. I am not sure though what other caveats this might introduce on macOS. 🤔

@jangorecki Who is in charge of the C-related part of {data.table}, or who might have more in-depth knowledge about what is going on behind the scenes?

gcc is the compiler used on all major Linux distros. Is there anything preventing it from being set as the default on macOS?

jangorecki commented 4 years ago

There is no single person in charge of the C part of DT. Matt and Arun wrote most of the C code. AFAIR @arunsrinivasan is on macOS, so his views on this issue could be very helpful.

jangorecki commented 3 years ago

I prefer to use gcc myself as well, but clang happens to be the default on macOS. Trying to change the default compiler globally for Mac users would cause more issues than it resolves, so I don't think changing the default is the way to go. Now that https://github.com/Rdatatable/data.table/pull/4735 is merged (and already published to CRAN as 1.13.2), users should be able to customize their build of data.table more easily, which should address your use case. If it doesn't, please let us know.

mattdowle commented 3 years ago

Hi @pat-s. Thanks for your input here. I would just add that if the macOS instructions on our Installation page can be simplified, that would be very welcome. Please go ahead and make the changes to the wiki directly yourself. No permissions are needed to change the wiki, which is the very reason we made it a wiki.