Closed riga closed 7 years ago
A new Issue was created by @riga Marcel R..
@davidlange6, @Dr15Jones, @smuzaffar can you please review it and eventually sign/assign? Thanks.
cms-bot commands are listed here
I don't think we ever expected people to build Python packages at low-level within CMSSW. This is way our toolfiles only contain information only needed to use Python packages.
I would probably suggest adding another tool into py2-numpy-toolfile.spec
maybe py2-numpy-devel.xml
or py2-numpy-c-api.xml
, which would define libraries and include directories. That one could be used for building packages which depend on these low-level APIs.
@davidlange6 I think, you were working on C++ API for tensorflow.
On May 2, 2017, at 3:19 PM, davidlt notifications@github.com wrote:
I don't think we ever expected people to build Python packages at low-level within CMSSW. This is way our toolfiles only contain information only needed to use Python packages.
I would probably suggest adding another tool into py2-numpy-toolfile.spec maybe py2-numpy-devel.xml or py2-numpy-c-api.xml, which would define libraries and include directories. That one could be used for building packages which depend on these low-level APIs.
@davidlange6 I think, you were working on C++ API for tensorflow.
not me, though its on my to do list - @mharrend was (and succeeded I believe). Have a look here: https://github.com/cms-sw/cmsdist/pull/2824
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.
My guess is py2-tensorflow.spec
contains tensorflow + Python bindings. Do you know why we didn't build tensorflow.spec
and then py2-tensorflow.spec
(only Python bindings)?
tensorflow is coming via a wheel, so indeed, maybe the c++ bindings are there too and I just didn't manage to access them..
On May 2, 2017, at 3:41 PM, davidlt notifications@github.com wrote:
My guess is py2-tensorflow.spec contains tensorflow + Python bindings. Do you know why we didn't build tensorflow.spec and then py2-tensorflow.spec (only Python bindings)?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.
Thanks for your comments!
I would probably suggest adding another tool into py2-numpy-toolfile.spec maybe py2-numpy-devel.xml or py2-numpy-c-api.xml, which would define libraries and include directories. That one could be used for building packages which depend on these low-level APIs.
@davidlt Sounds good. Just tell me If I can provide or test something, I'm glad to help.
not me, though its on my to do list - @mharrend was (and succeeded I believe). Have a look here:
2824
@davidlange6 Yep, I'm currently using his bundle for 80X ;) The wheel comes with C++ headers and libraries, however, I haven't yet managed to access them via editing my BuildFile. But this should be possible via a tool file, right?
I looked at https://www.tensorflow.org/api_docs/cc/class/tensorflow/ops/stage and tried to check for headers -- found none to be part of our py2-tensorflow package. Could you double check?
Btw, could you provide some information why you need C or C++ API (e.g. some run-time integration for CMSSW) or just because you want to use C++?
Yeah, apparently model building is not yet supported by the C++ API, which is why there are no headers containing the ops. As far as I see, the graph evaluation should already work in C++, which requires the headers in tensorflow/include/tensorflow/core/public
, e.g. session.h
.
On analysis level and for network training we use the Python API, and in general, there is no reason not to ;) But for model evaluation on an event-by-event basis right within a CMSSW module (e.g. for electron MVAs, b-tagging, etc), the C++ API would be very helpful.
That's exactly what I wanted to hear. If we reached the point where we want to integrate model evaluation run-time into CMSSW that needs C or/and C++ API.
For that we would need to build a tensorflow as a normal package inside CMSSW.
For that we would need to build a tensorflow as a normal package inside CMSSW.
That would be awesome. I know many groups within CMS that would really appreciate a working TF interface (including us, of course).
In the meantime, in combination with the approach mentioned above, this additional py2-numpy-devel tool file would already do the trick.
The following should work for you: https://github.com/cms-sw/cmsdist/pull/2998
IIRC, there are no explicit libraries we need to link to.
Thanks @davidlt for providing this so fast!
I'm not an expert...how can I use that file? Or will it be available at some point in the scram tool list
?
Once merged it will be available in IBs, which are built twice a day. If you want to use it now:
py2-numpy-c-api.xml
<tool name="py2-numpy-c-api" version="1.12.1-cms3">
<client>
<environment name="PY2_NUMPY_C_API_BASE" default="/mnt/build/davidlt/numpy/a/slc6_amd64_gcc530/external/py2-numpy/1.12.1-cms3"/>
<environment name="INCLUDE" default="$PY2_NUMPY_C_API_BASE/lib/python2.7/site-packages/numpy-1.12.1-py2.7-linux-x86_64.egg/numpy/core/include"/>
</client>
<use name="python"/>
</tool>
Note, you probably will need to adjust the paths.
scram setup /path/to/py2-numpy-c-api.xml
+ cmsenv
You can now use py2-numpy-c-api
in BuildFile.xml
.
Works like a charm, thanks!
I was thinking about tensorflow integration into CMSSW. I think, we could do a dirty and quick integration similar to what we do for Oracle Instant Client and CUDA.
We can grab official binary builds and put that into CMSSW.
$ tar xvzf libtensorflow-cpu-linux-x86_64-1.1.0.tar.gz
./
./include/
./include/tensorflow/
./include/tensorflow/c/
./include/tensorflow/c/c_api.h
./lib/
./lib/libtensorflow.so
./include/tensorflow/c/LICENSE
$ ldd ./lib/libtensorflow.so
linux-vdso.so.1 (0x00007fff0f9ca000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f489ea2e000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f489e82a000)
libm.so.6 => /lib64/libm.so.6 (0x00007f489e521000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f489e199000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f489df82000)
libc.so.6 => /lib64/libc.so.6 (0x00007f489dbba000)
/lib64/ld-linux-x86-64.so.2 (0x000055dd55c99000)
This would provide you C API, but not C++.
The only issue I see is that it's a fat binary (incl. all needed externals at specific version), thus 90MB. Symbols are not hidden/internal, e.g. from jemalloc: je_arena_cleanup. Only C (and later C++ API) should be globally visible, the rest should be hidden.
I looked into tensorflow repository and found 2 issues already created in the last 10 days about symbol visibility. This is actually causing issues in real life. They think it's easy to resolve, but someone needs to do it.
I also asked about binary distributions for C++ API, which they don't provide at this point.
The only issue could be that C++11 ABI.
Hi @davidlt - This seems ok until we have a tensorflow tool inside a bigger application (which will be one week after having a tensorflow tool in cmssw). or is it easy for us to hide these symbols?
as I mentioned to you yesterday I was looking at how to plug in our own externals, but that is so far non trivial.
On May 3, 2017, at 11:15 PM, davidlt notifications@github.com wrote:
I was thinking about tensorflow integration into CMSSW. I think, we could do a dirty and quick integration similar to what we do for Oracle Instant Client and CUDA.
We can grab official binary builds and put that into CMSSW.
$ tar xvzf libtensorflow-cpu-linux-x86_64-1.1.0.tar.gz ./ ./include/ ./include/tensorflow/ ./include/tensorflow/c/ ./include/tensorflow/c/c_api.h ./lib/ ./lib/libtensorflow.so ./include/tensorflow/c/LICENSE $ ldd ./lib/libtensorflow.so linux-vdso.so.1 (0x00007fff0f9ca000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f489ea2e000) libdl.so.2 => /lib64/libdl.so.2 (0x00007f489e82a000) libm.so.6 => /lib64/libm.so.6 (0x00007f489e521000) libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f489e199000) libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f489df82000) libc.so.6 => /lib64/libc.so.6 (0x00007f489dbba000) /lib64/ld-linux-x86-64.so.2 (0x000055dd55c99000)
This would provide you C API, but not C++.
The only issue I see is that it's a fat binary (incl. all needed externals at specific version), thus 90MB. Symbols are not hidden/internal, e.g. from jemalloc: je_arena_cleanup. Only C (and later C++ API) should be globally visible, the rest should be hidden.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.
It is an easy one and someone is already working on this (seems unsuccessfully for now). There are a couple of ways to do it. The easiest one and it's good for C libraries are symbol version scripts. That's what is being worked on. This also brings ability to add version symbols to tensorflow API.
It works less good in C++ application where compiler can generate gazillions of symbols (e.g. via templates). For that it would be better to use explicit attributes in code.
I think, it's much more easier to work with upstream community / Google to ensure that binary distributions are properly built and just take them.
IIUC, you don't want to use Eigen, jemalloc, etc from CMSSW while building tensorflow as it requires particular versions of these packages. It seems to be tightly integrated. Thus inside SPEC file you would be building the same binary distribution as you could get from upstream community / Google.
That binary distributions also by default works for everything (CMS, ATLAS, Ubuntu, etc.)
I found tensorflow SPEC in copr repository (third-party, random user): http://copr-dist-git.fedorainfracloud.org/cgit/alonid/tensorflow/tensorflow.git/tree/tensorflow.spec?id=89062de403c2264c08c3559597a2dd878516e456
On May 4, 2017, at 1:18 PM, davidlt notifications@github.com wrote:
It is an easy one and someone is already working on this (seems unsuccessfully for now). There are a couple of ways to do it. The easiest one and it's good for C libraries are symbol version scripts. That's what is being worked on. This also brings ability to add version symbols to tensorflow API.
It works less good in C++ application where compiler can generate gazillions of symbols (e.g. via templates). For that it would be better to use explicit attributes in code.
I think, it's much more easier to work with upstream community / Google to ensure that binary distributions are properly built and just take them.
IIUC, you don't want to use Eigen, jemalloc, etc from CMSSW while building tensorflow as it requires particular versions of these packages. It seems to be tightly integrated. Thus inside SPEC file you would be building the same binary distribution as you could get from upstream community / Google.
That binary distributions also by default works for everything (CMS, ATLAS, Ubuntu, etc.)
I found tensorflow SPEC in copr repository (third-party, random user): http://copr-dist-git.fedorainfracloud.org/cgit/alonid/tensorflow/tensorflow.git/tree/tensorflow.spec?id=89062de403c2264c08c3559597a2dd878516e456
Thanks - I have basically this same recipe aside from having to change where tmp files go (defaulting to afs where quota fills quickly) but have not gotten to the end.
so ok, I was going the wrong direction with regards to the externals inside tensorflow. I'll stop heading in the direction of removing them.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.
Hi @davidlt - does libtensorflow.so end up being the same thing as
/cvmfs/cms.cern.ch/slc7_amd64_gcc630/external/py2-pippkgs_depscipy/3.0-mlhled3/lib/python2.7/site-packages/tensorflow/python/_pywrap_tensorflow.so
? (minus a python api perhaps)
On May 3, 2017, at 11:15 PM, davidlt notifications@github.com wrote:
I was thinking about tensorflow integration into CMSSW. I think, we could do a dirty and quick integration similar to what we do for Oracle Instant Client and CUDA.
We can grab official binary builds and put that into CMSSW.
$ tar xvzf libtensorflow-cpu-linux-x86_64-1.1.0.tar.gz ./ ./include/ ./include/tensorflow/ ./include/tensorflow/c/ ./include/tensorflow/c/c_api.h ./lib/ ./lib/libtensorflow.so ./include/tensorflow/c/LICENSE $ ldd ./lib/libtensorflow.so linux-vdso.so.1 (0x00007fff0f9ca000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f489ea2e000) libdl.so.2 => /lib64/libdl.so.2 (0x00007f489e82a000) libm.so.6 => /lib64/libm.so.6 (0x00007f489e521000) libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f489e199000) libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f489df82000) libc.so.6 => /lib64/libc.so.6 (0x00007f489dbba000) /lib64/ld-linux-x86-64.so.2 (0x000055dd55c99000)
This would provide you C API, but not C++.
The only issue I see is that it's a fat binary (incl. all needed externals at specific version), thus 90MB. Symbols are not hidden/internal, e.g. from jemalloc: je_arena_cleanup. Only C (and later C++ API) should be globally visible, the rest should be hidden.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.
I found tensorflow SPEC in copr repository (third-party, random user): http://copr-dist-git.fedorainfracloud.org/cgit/alonid/tensorflow/tensorflow.git/tree/tensorflow.spec?id=89062de403c2264c08c3559597a2dd878516e456
Thanks - I have basically this same recipe aside from having to change where tmp files go (defaulting to afs where quota fills quickly) but have not gotten to the end.
but to clarify, this is a spec for tensorflow+python api. we already have the equivalent in CMSSW (just not built with our compiler/python) - this doesn't bring the C (or C++) interface. But maybe that amounts to one file.
By the way, here is the recipe how I built the C++ Tensorflow API for CMSSW8, just in case you would like to build the C++ API as a CMSSW module.
export VO_CMS_SW_DIR=/cvmfs/cms.cern.ch
source $VO_CMS_SW_DIR/cmsset_default.sh
# setup environment
export SCRAM_ARCH="slc6_amd64_gcc530"
export CMSSW_VERSION="CMSSW_8_0_26_patch1"
# Variables
export JENKINSCMSSWFOLDER=$CMSSW_VERSION
echo $JENKINSCMSSWFOLDER
export JENKINSCMSSWINSTALLDIR=$PWD"/"$JENKINSCMSSWFOLDER
echo $JENKINSCMSSWINSTALLDIR
export JENKINSCMSSWSRCDIR=$JENKINSCMSSWINSTALLDIR"/src"
echo $JENKINSCMSSWSRCDIR
# create new CMSSW environment and delete old one
echo "Deleting old CMSSW install dir: " $JENKINSCMSSWINSTALLDIR
rm -rf $JENKINSCMSSWINSTALLDIR
echo "Creating new CMSSW install dir"
scram project $JENKINSCMSSWFOLDER
cd $JENKINSCMSSWSRCDIR
eval `scramv1 runtime -sh`
# Setup c++ library paths and settings
scram setup gcc-cxxcompiler
export PATH="/cvmfs/cms.cern.ch/share/overrides/bin:/cvmfs/cms.cern.ch/slc6_amd64_gcc530/cms/cmssw-patch/CMSSW_8_0_26_patch1/bin/slc6_amd64_gcc530:/cvmfs/cms.cern.ch/slc6_amd64_gcc530/cms/cmssw-patch/CMSSW_8_0_26_patch1/external/slc6_amd64_gcc530/bin:/cvmfs/cms.cern.ch/slc6_amd64_gcc530/cms/cmssw/CMSSW_8_0_26/bin/slc6_amd64_gcc530:/cvmfs/cms.cern.ch/slc6_amd64_gcc530/external/llvm/3.8.0-ikhhed/bin:/cvmfs/cms.cern.ch/slc6_amd64_gcc530/external/gcc/5.3.0/bin:/afs/cern.ch/cms/caf/scripts:/cvmfs/cms.cern.ch/common:/cvmfs/cms.cern.ch/bin:/usr/sue/sbin:/usr/sue/bin:/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin":$PATH
export LD_LIBRARY_PATH=/cvmfs/cms.cern.ch/slc6_amd64_gcc530/cms/cmssw-patch/CMSSW_8_0_26_patch1/biglib/slc6_amd64_gcc530:/cvmfs/cms.cern.ch/slc6_amd64_gcc530/cms/cmssw-patch/CMSSW_8_0_26_patch1/lib/slc6_amd64_gcc530:/cvmfs/cms.cern.ch/slc6_amd64_gcc530/cms/cmssw-patch/CMSSW_8_0_26_patch1/external/slc6_amd64_gcc530/lib:/cvmfs/cms.cern.ch/slc6_amd64_gcc530/cms/cmssw/CMSSW_8_0_26/biglib/slc6_amd64_gcc530:/cvmfs/cms.cern.ch/slc6_amd64_gcc530/cms/cmssw/CMSSW_8_0_26/lib/slc6_amd64_gcc530:/cvmfs/cms.cern.ch/slc6_amd64_gcc530/external/llvm/3.8.0-ikhhed/lib64:/cvmfs/cms.cern.ch/slc6_amd64_gcc530/external/gcc/5.3.0/lib64:/cvmfs/cms.cern.ch/slc6_amd64_gcc530/external/gcc/5.3.0/lib:$LD_LIBRARY_PATH
export LIBRARY_PATH=$LD_LIBRARY_PATH:$LIBRARY_PATH
export LDFLAGS="-Wl,-rpath,/cvmfs/cms.cern.ch/slc6_amd64_gcc530/external/gcc/5.3.0/lib64 -L /cvmfs/cms.cern.ch/slc6_amd64_gcc530/external/gcc/5.3.0/lib64 -Wl,-rpath,/cvmfs/cms.cern.ch/slc6_amd64_gcc530/external/gcc/5.3.0/lib -L /cvmfs/cms.cern.ch/slc6_amd64_gcc530/external/gcc/5.3.0/lib"
export CXX=/cvmfs/cms.cern.ch/slc6_amd64_gcc530/external/gcc/5.3.0/bin/g++
# Compiling bazel
echo "Start compiling bazel"
wget https://github.com/bazelbuild/bazel/releases/download/0.4.5/bazel-0.4.5-dist.zip
unzip bazel-0.4.5-dist.zip
chmod +x ./compile.sh
./compile.sh
echo "Finished compiling bazel"
# Compiling Tensorflow
echo "Start compiling Tensorflow"
git clone https://github.com/tensorflow/tensorflow
cd tensorflow
# Make sure Tensorflow configure runs non-interactive by setting variables
export PYTHON_BIN_PATH=/cvmfs/cms.cern.ch/slc6_amd64_gcc530/cms/cmssw-patch/CMSSW_8_0_26_patch1/external/slc6_amd64_gcc530/bin/python
export TF_NEED_MKL=0
export CC_OPT_FLAGS="-march=native"
export TF_NEED_JEMALLOC=0
export TF_NEED_GCP=0
export TF_NEED_HDFS=0
export TF_ENABLE_XLA=0
export TF_NEED_OPENCL=0
export TF_NEED_CUDA=0
# Configure Tensorflow
echo "Configure Tensorflow"
./configure
# Patch Protobuf in tensorflow to make sure that non-default gcc installation is used
echo "Patching protobuf"
wget https://raw.githubusercontent.com/mharrend/tensorflow-c--api/master/protobuf-patch.bzl
patch -p1 -i protobuf-patch.bzl
# Compile Tensorflow C++ library
echo "Compiling Tensorflow C++ library"
$JENKINSCMSSWSRCDIR/output/bazel build -s -c opt //tensorflow:libtensorflow_cc.so
cp bazel-bin/tensorflow/libtensorflow_cc.so $JENKINSCMSSWSRCDIR/libtensorflow_cc.so
# Compile Tensorflow PIP package
echo "Compiling Tensorflow PIP package"
$JENKINSCMSSWSRCDIR/output/bazel build -s -c opt //tensorflow/tools/pip_package:build_pip_package
echo "Install PIP dependencies"
wget https://bootstrap.pypa.io/get-pip.py
python get-pip.py --user
echo "Build PIP package"
bazel-bin/tensorflow/tools/pip_package/build_pip_package $JENKINSCMSSWSRCDIR/tensorflow_pkg
Using libtensorflow_cc.so
(i.e. linking, loading into CMSSW) might not be secure (and it's not according to reports in tensorflow repository) until someone writes a proper version script, or adds visibility attributes.
If, at some point, you want to test a possible solution, I'm glad to help out.
At this point I am not sure if I want to jump into it. Seems that community already identified the same (or similar issues) and will try to resolve them, but it's not a priority for them at this point. I did start compiling tensorflow and playing around, but for now stopped. There are 3 main tasks:
It is interesting, but would cost some time.
@riga, can we close this issue?
@smuzaffar Yep!
Hi!
I'm currently trying to write a CMSSW module (slc6_amd64_gcc530, CMSSW_8_0_26_patch2) that uses the Python and NumPy C API's to load and evaluate Tensorflow graphs (https://gitlab.cern.ch/mrieger/CMSSW-DNN). The "Python.h" header is included properly, but it seems like "numpy/arrayobject.h" could not be found at compile time.
I added the tools "python" and "py2-numpy" to my BuildFile.xml and could debug with "scram b -d" that python is used (
-I/cvmfs/cms.cern.ch/slc6_amd64_gcc530/external/python/2.7.11-ikhhed2/include/python2.7
) but numpy is missing. After digging deeper I saw that the py2-numpy-toolfile.spec does not define any<environment name="INCLUDE" default="..." />
tag, unlike (e.g.) the python tool.The numpy header files are located within the python module directory, e.g. here:
/cvmfs/cms.cern.ch/slc6_amd64_gcc530/external/py2-numpy/1.11.1-ikhhed2/lib/python2.7/site-packages/numpy-1.11.1-py2.7-linux-x86_64.egg/numpy/core/include
The libraries seem to be at the same location.
I can try to provide a PR with an update tool file, but I'm not sure which branch to chose to start with. Is it possible to update the configuration of a tool in the BuildFile.xml of my module for testing purposes?
Thanks for your help! I'm really looking forward to use our DNNs in CMSSW =)
PS: As it would be much simpler to directly use the tensorflow C++ API, would it also be possible to provide the tensorflow include files / libraries?