marian-nmt / marian-dev

Fast Neural Machine Translation in C++ - development repository
https://marian-nmt.github.io

Which protobuf version does USE_ONNX require/support? #943

Open alvations opened 2 years ago

alvations commented 2 years ago

When I tried to build Marian with -DUSE_ONNX=ON, compilation failed.

Dependencies Installation

On a fresh Ubuntu instance:

sudo apt update

sudo apt install git cmake build-essential libboost-system-dev libprotobuf17 protobuf-compiler libprotobuf-dev openssl libssl-dev libgoogle-perftools-dev

pip install onnxruntime

git clone https://github.com/marian-nmt/marian
mkdir marian/build
cd marian/build

Ubuntu version:

~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.4 LTS
Release:    20.04
Codename:   focal

Context

cmake .. -DUSE_STATIC_LIBS=on -DUSE_SENTENCEPIECE=on -DUSE_FBGEMM=on -DCOMPILE_CUDA=off -DCOMPILE_CPU=on -DUSE_ONNX=ON
[ 80%] Building CXX object src/CMakeFiles/marian.dir/3rd_party/ExceptionWithCallStack.cpp.o
[ 81%] Building CXX object src/CMakeFiles/marian.dir/3rd_party/onnx/protobuf/onnx-ml.pb-wrapper.cpp.o
In file included from /home/ubuntu/marian/src/3rd_party/onnx/protobuf/onnx-ml.pb.cc:4,
                 from /home/ubuntu/marian/src/3rd_party/onnx/protobuf/onnx-ml.pb-wrapper.cpp:29:
/home/ubuntu/marian/src/3rd_party/onnx/protobuf/onnx-ml.pb.h:12:2: error: #error This file was generated by a newer version of protoc which is
   12 | #error This file was generated by a newer version of protoc which is
      |  ^~~~~
/home/ubuntu/marian/src/3rd_party/onnx/protobuf/onnx-ml.pb.h:13:2: error: #error incompatible with your Protocol Buffer headers. Please update
   13 | #error incompatible with your Protocol Buffer headers. Please update
      |  ^~~~~
/home/ubuntu/marian/src/3rd_party/onnx/protobuf/onnx-ml.pb.h:14:2: error: #error your headers.
   14 | #error your headers.
      |  ^~~~~
In file included from /home/ubuntu/marian/src/3rd_party/sentencepiece/third_party/protobuf-lite/google/protobuf/io/coded_stream.h:135,
                 from /home/ubuntu/marian/src/3rd_party/onnx/protobuf/onnx-ml.pb.h:23,
                 from /home/ubuntu/marian/src/3rd_party/onnx/protobuf/onnx-ml.pb.cc:4,
                 from /home/ubuntu/marian/src/3rd_party/onnx/protobuf/onnx-ml.pb-wrapper.cpp:29:
/home/ubuntu/marian/src/3rd_party/sentencepiece/third_party/protobuf-lite/google/protobuf/stubs/common.h:111: error: "GOOGLE_PROTOBUF_MIN_LIBRARY_VERSION" redefined [-Werror]
  111 | #define GOOGLE_PROTOBUF_MIN_LIBRARY_VERSION 3006001
      | 
In file included from /home/ubuntu/marian/src/3rd_party/onnx/protobuf/onnx-ml.pb.h:10,
                 from /home/ubuntu/marian/src/3rd_party/onnx/protobuf/onnx-ml.pb.cc:4,
                 from /home/ubuntu/marian/src/3rd_party/onnx/protobuf/onnx-ml.pb-wrapper.cpp:29:
/usr/include/google/protobuf/port_def.inc:307: note: this is the location of the previous definition
  307 | #define GOOGLE_PROTOBUF_MIN_LIBRARY_VERSION 3011000
      | 
In file included from /usr/include/google/protobuf/descriptor.h:65,
                 from /usr/include/google/protobuf/generated_message_reflection.h:47,
                 from /home/ubuntu/marian/src/3rd_party/onnx/protobuf/onnx-ml.pb.h:30,
                 from /home/ubuntu/marian/src/3rd_party/onnx/protobuf/onnx-ml.pb.cc:4,
                 from /home/ubuntu/marian/src/3rd_party/onnx/protobuf/onnx-ml.pb-wrapper.cpp:29:
/usr/include/google/protobuf/port_def.inc:74:2: error: #error PROTOBUF_DEPRECATED was previously defined
   74 | #error PROTOBUF_DEPRECATED was previously defined
      |  ^~~~~
In file included from /usr/include/google/protobuf/metadata.h:42,
                 from /usr/include/google/protobuf/generated_message_reflection.h:49,
                 from /home/ubuntu/marian/src/3rd_party/onnx/protobuf/onnx-ml.pb.h:30,
                 from /home/ubuntu/marian/src/3rd_party/onnx/protobuf/onnx-ml.pb.cc:4,
                 from /home/ubuntu/marian/src/3rd_party/onnx/protobuf/onnx-ml.pb-wrapper.cpp:29:
/usr/include/google/protobuf/unknown_field_set.h:289:26: error: ‘google::protobuf::io::EpsCopyOutputStream’ has not been declared
  289 |       uint8* target, io::EpsCopyOutputStream* stream) const;
      |                        
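The first error in the log comes from a version guard that protoc writes near the top of each generated header. Here is a minimal sketch for reading which protobuf version the checked-in onnx-ml.pb.h expects. It assumes the guard has the usual `#if ...PROTOBUF_VERSION < N` shape; the function name `generated_min_version` and the `${HOME}/marian` checkout location are illustrative assumptions.

```shell
#!/bin/bash
# Hypothetical helper: print N from the first "#if ...PROTOBUF_VERSION < N"
# guard in a protoc-generated header. The guard shape is an assumption and
# has not been checked against every protoc release.
generated_min_version() {
    sed -n 's/^#if .*PROTOBUF_VERSION < \([0-9][0-9]*\).*$/\1/p' "$1" | head -n 1
}

# Assumed checkout location (the log above uses /home/ubuntu/marian).
pb_header="${HOME}/marian/src/3rd_party/onnx/protobuf/onnx-ml.pb.h"
if [ -f "${pb_header}" ]; then
    generated_min_version "${pb_header}"
fi
```

If the printed number is larger than the version of the protobuf headers the compiler picks up, the "generated by a newer version of protoc" error is the expected result.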

snukky commented 2 years ago

I don't know which versions are supported, but this may be helpful: https://github.com/marian-nmt/marian-dev/issues/900#issuecomment-1015127934

gregtatum commented 4 months ago

I did some investigation into this. It requires protobuf v3.6.x to match the copy bundled with sentencepiece; the 3006001 in the build error corresponds to v3.6.1.

https://github.com/protocolbuffers/protobuf/releases/tag/v3.6.0
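Protobuf encodes its version as a single integer, major * 1000000 + minor * 1000 + patch, which is how the numbers in the build log relate to release tags. A small sketch that decodes the two values from the log; `decode_protobuf_version` is an illustrative name, not part of any tool.

```shell
#!/bin/bash
# Hypothetical helper: turn protobuf's integer version into major.minor.patch.
decode_protobuf_version() {
    local v="$1"
    echo "$(( v / 1000000 )).$(( v / 1000 % 1000 )).$(( v % 1000 ))"
}

decode_protobuf_version 3006001   # sentencepiece's bundled protobuf-lite: 3.6.1
decode_protobuf_version 3011000   # headers under /usr/include in the log: 3.11.0
```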

I added this to my docker config to deal with it:

# Get the header files for the protobuf version that matches sentencepiece in Marian.
# e.g. if the Marian build complains about the C++ define GOOGLE_PROTOBUF_MIN_LIBRARY_VERSION 3006001,
# that 3006001 maps to v3.6.1. Change this checkout if the version Marian expects changes.
RUN git clone --branch v3.6.1.3 --single-branch https://github.com/protocolbuffers/protobuf.git && \
    rm -rf /usr/include/google && \
    ln -s "$(pwd)/protobuf/src/google" /usr/include

# Get the correct version of the Protobuf Compiler.
#
# protoc-3.6.1-linux-x86_64.zip
# ├── bin
# │   └── protoc
# └── include
RUN wget https://github.com/protocolbuffers/protobuf/releases/download/v3.6.1/protoc-3.6.1-linux-x86_64.zip && \
    unzip protoc-3.6.1-linux-x86_64.zip -d protoc && \
    rm protoc-3.6.1-linux-x86_64.zip && \
    mv protoc/bin/protoc $BIN

RUN git clone https://github.com/marian-nmt/marian-dev.git marian-dev

# Re-build the onnx model definitions with the correct version of protoc.
RUN cd marian-dev/src/3rd_party/onnx/protobuf && \
    ${BIN}/protoc onnx-ml.proto --cpp_out .
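Before regenerating onnx-ml.pb.*, it can help to confirm that the headers and the protoc binary agree on a version. This is a rough sketch under two assumptions: protobuf 3.6.x keeps `#define GOOGLE_PROTOBUF_VERSION` in google/protobuf/stubs/common.h, and `BIN` is the directory holding protoc as in the Dockerfile steps. The function names are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: dotted protobuf version of the headers under an include dir.
# Assumes "#define GOOGLE_PROTOBUF_VERSION N" lives in stubs/common.h (3.6.x layout).
header_protobuf_version() {
    local header="$1/google/protobuf/stubs/common.h"
    local n
    n="$(sed -n 's/^#define GOOGLE_PROTOBUF_VERSION \([0-9][0-9]*\).*$/\1/p' "$header" | head -n 1)"
    [ -n "$n" ] || return 1
    echo "$(( n / 1000000 )).$(( n / 1000 % 1000 )).$(( n % 1000 ))"
}

# Hypothetical helper: "protoc --version" prints e.g. "libprotoc 3.6.1"; keep field 2.
protoc_version() {
    "$1" --version | awk '{print $2}'
}

# Only runs when BIN points at a directory that contains protoc.
if [ -n "${BIN:-}" ] && [ -x "${BIN}/protoc" ]; then
    echo "headers: $(header_protobuf_version /usr/include)"
    echo "protoc:  $(protoc_version "${BIN}/protoc")"
fi
```

If the two lines differ, the regenerated sources are likely to hit the same "generated by a newer version of protoc" guard as in the original report.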

gregtatum commented 3 months ago

Actually, this needs the library built and linked as well.

Something like:

RUN git clone --branch v3.6.1.3 --single-branch https://github.com/protocolbuffers/protobuf.git && \
    ./compile-protoc.sh ./protobuf

compile-protoc.sh

#!/bin/bash
set -ex

##
# Usage: ./compile-protoc.sh ./protobuf
#

# Fail early (via set -e) if BIN, the directory that receives the protoc binary, is not set.
test -v BIN

# This follows the instructions from: https://github.com/protocolbuffers/protobuf/blob/v3.14.0/src/README.md
cd "$1"
git submodule update --init --recursive
./autogen.sh

./configure
make
make check
make install
ldconfig # refresh shared library cache.

# Now that we're done, copy the bin.
cp ./src/protoc "${BIN}"

# Link in the includes.
ln -s "$(pwd)/src/google" /usr/include/google
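Once the matching protobuf is built and installed, the remaining steps are to regenerate the ONNX sources and reconfigure Marian. This sketch combines the protoc call from the earlier Dockerfile step with the CMake flags from the original report. `rebuild_onnx_protos`, `run`, and `DRY_RUN` are hypothetical names, and the default `/usr/local/bin` for `BIN` is an assumption. As written, the example only prints the commands.

```shell
#!/bin/bash
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: regenerate onnx-ml.pb.{h,cc}, then configure the build.
# $1 = Marian checkout, $2 = directory that contains the matching protoc.
rebuild_onnx_protos() {
    local marian_dir="$1" bin_dir="$2"
    local proto_dir="${marian_dir}/src/3rd_party/onnx/protobuf"
    # Same protoc step as in the Dockerfile, with explicit paths instead of cd.
    run "${bin_dir}/protoc" --proto_path="${proto_dir}" --cpp_out="${proto_dir}" \
        "${proto_dir}/onnx-ml.proto"
    # Same flags as the cmake call in the original report (-S/-B need CMake 3.13+).
    run cmake -S "${marian_dir}" -B "${marian_dir}/build" \
        -DUSE_STATIC_LIBS=on -DUSE_SENTENCEPIECE=on -DUSE_FBGEMM=on \
        -DCOMPILE_CUDA=off -DCOMPILE_CPU=on -DUSE_ONNX=ON
}

# Dry run: show what would be executed without touching the checkout.
DRY_RUN=1 rebuild_onnx_protos ./marian-dev "${BIN:-/usr/local/bin}"
```

Dropping `DRY_RUN=1` runs the commands for real; `make` in the build directory would follow as usual.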