microsoft / SynapseML

Simple and Distributed Machine Learning
http://aka.ms/spark
MIT License

Running mmlspark on Centos 7 #859

Open davinro opened 4 years ago

davinro commented 4 years ago

Is your feature request related to a problem? Please describe.

My data nodes are running CentOS 7. Although mmlspark does not support that operating system, here is what I did to get the examples for 1.0.0-rc1 to run.

Describe the solution you'd like

  1. Download the source code for gcc-5.5.0 and openmpi-1.10.7 onto a data node.
  2. Build and install the libraries. You will likely need to install development tools first.
  3. Copy the binaries to all data nodes, making sure not to overwrite any versions that the operating system is currently using. The operating system itself does not need to use the new libstdc++.so and related libraries.
  4. In spark-defaults.conf, add the line spark.executor.extraLibraryPath /usr/local/lib64:/usr/lib64/openmpi/ (substitute the directories the libraries were actually copied to).
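
As a rough illustration of step 4, here is a minimal Python sketch that sets the spark.executor.extraLibraryPath entry in the text of a spark-defaults.conf file. The function name set_extra_library_path is hypothetical, and the sketch assumes the file uses whitespace-separated "key value" lines; it is not part of mmlspark or Spark.

```python
from typing import Sequence

# Property named in step 4 above.
KEY = "spark.executor.extraLibraryPath"


def set_extra_library_path(conf_text: str, lib_dirs: Sequence[str]) -> str:
    """Return conf_text with KEY set to lib_dirs joined by ':'.

    Hypothetical helper: assumes whitespace-separated "key value" lines.
    An existing entry for KEY is replaced; otherwise one is appended.
    """
    new_line = f"{KEY} {':'.join(lib_dirs)}"
    out = []
    replaced = False
    for line in conf_text.splitlines():
        parts = line.split(None, 1)
        if parts and parts[0] == KEY:
            # Keep only one entry for the key, in its original position.
            if not replaced:
                out.append(new_line)
                replaced = True
        else:
            out.append(line)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Directories taken from the post; adjust to where the libraries live.
    existing = "spark.master yarn\n"
    print(set_extra_library_path(
        existing, ["/usr/local/lib64", "/usr/lib64/openmpi/"]))
```

Working on the file's text rather than the file itself keeps the sketch easy to try out before touching a real configuration.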

Additional context

Use caution when deploying the gcc-5.5.0 libraries: overwriting libraries currently in use by the operating system may make the system unbootable.
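
One way to respect that caution when copying files by script is to refuse to replace anything that already exists at the destination. The sketch below is only an illustration: copy_without_overwrite is a hypothetical helper, it handles a single flat directory, and the check-then-copy sequence is not safe against another process writing to the destination at the same time.

```python
import shutil
from pathlib import Path


def copy_without_overwrite(src_dir, dest_dir):
    """Copy files from src_dir into dest_dir, never replacing existing names.

    Hypothetical helper. Returns (copied, skipped) as lists of file names.
    Subdirectories are ignored; symlinks are copied as symlinks.
    """
    src = Path(src_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    copied, skipped = [], []
    for item in sorted(src.iterdir()):
        target = dest / item.name
        # is_symlink() also catches dangling links that exists() misses.
        if target.exists() or target.is_symlink():
            skipped.append(item.name)
            continue
        if item.is_symlink() or item.is_file():
            # follow_symlinks=False recreates a link instead of its target.
            shutil.copy2(item, target, follow_symlinks=False)
            copied.append(item.name)
    return copied, skipped
```

Reviewing the returned skipped list before going further shows which library names were already present on the node.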
