dmlc / gluon-nlp

NLP made easy
https://nlp.gluon.ai/
Apache License 2.0

[Numpy] Fix AWS Batch + Add Docker Support #1302

Closed by sxjscience 4 years ago

sxjscience commented 4 years ago
docker pull gluonai/gluon-nlp:v1.0.0
docker run --gpus all --rm -it -p 8888:8888 -p 8787:8787 -p 8786:8786 gluonai/gluon-nlp:v1.0.0

@dmlc/gluon-nlp-committers

Should solve https://github.com/dmlc/gluon-nlp/issues/1243 and https://github.com/dmlc/gluon-nlp/issues/1139

codecov[bot] commented 4 years ago

Codecov Report

Merging #1302 into master will increase coverage by 0.15%. The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master    #1302      +/-   ##
==========================================
+ Coverage   84.14%   84.30%   +0.15%     
==========================================
  Files          42       42              
  Lines        6397     6397              
==========================================
+ Hits         5383     5393      +10     
+ Misses       1014     1004      -10     
Impacted Files                 Coverage Δ
src/gluonnlp/__init__.py       100.00% <100.00%> (ø)
src/gluonnlp/utils/misc.py     50.63% <0.00%> (-1.27%) ↓
src/gluonnlp/data/loading.py   83.39% <0.00%> (+5.28%) ↑

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 32e87d4...dbc34be.

sxjscience commented 4 years ago

@leezu I tried to compile with the latest MXNet and install horovod via Haibin's branch. However, I'm seeing this error message:

/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/crti.o: In function `_init':
(.init+0x7): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against undefined symbol `__gmon_start__'
CMakeFiles/nnvm.dir/3rdparty/tvm/nnvm/src/pass/print_graph_ir.cc.o: In function `std::_Function_base::_Base_manager<nnvm::pass::PrintGraphIR_(nnvm::Graph, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::ostream&)::{lambda(unsigned int, std::ostream&)#2}>::_M_manager(std::_Any_data&, std::_Function_base::_Base_manager<nnvm::pass::PrintGraphIR_(nnvm::Graph, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::ostream&)::{lambda(unsigned int, std::ostream&)#2}> const&, std::_Manager_operation)':
print_graph_ir.cc:(.text+0x3bb): relocation truncated to fit: R_X86_64_PC32 against `.data.rel.ro'
CMakeFiles/nnvm.dir/3rdparty/tvm/nnvm/src/pass/print_graph_ir.cc.o: In function `std::_Function_base::_Base_manager<nnvm::pass::PrintGraphIR_(nnvm::Graph, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::ostream&)::{lambda(unsigned int, std::ostream&)#1}>::_M_manager(std::_Any_data&, std::_Function_base::_Base_manager<nnvm::pass::PrintGraphIR_(nnvm::Graph, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::ostream&)::{lambda(unsigned int, std::ostream&)#1}> const&, std::_Manager_operation)':
print_graph_ir.cc:(.text+0x59b): relocation truncated to fit: R_X86_64_PC32 against `.data.rel.ro'
CMakeFiles/nnvm.dir/3rdparty/tvm/nnvm/src/pass/print_graph_ir.cc.o: In function `nnvm::pass::GetVectorPrinter(nnvm::Graph const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)':
print_graph_ir.cc:(.text+0x6d8): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `typeinfo name for std::vector<nnvm::TShape, std::allocator<nnvm::TShape> >' defined in .rodata._ZTSSt6vectorIN4nnvm6TShapeESaIS1_EE[_ZTSSt6vectorIN4nnvm6TShapeESaIS1_EE] section in CMakeFiles/nnvm.dir/3rdparty/tvm/nnvm/src/pass/infer_shape_type.cc.o
print_graph_ir.cc:(.text+0x708): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `typeinfo name for std::vector<int, std::allocator<int> >' defined in .rodata._ZTSSt6vectorIiSaIiEE[_ZTSSt6vectorIiSaIiEE] section in CMakeFiles/nnvm.dir/3rdparty/tvm/nnvm/src/pass/infer_shape_type.cc.o
print_graph_ir.cc:(.text+0x72b): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `typeinfo name for std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >' defined in .rodata._ZTSSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS5_EE[_ZTSSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS5_EE] section in CMakeFiles/nnvm.dir/3rdparty/tvm/nnvm/src/pass/print_graph_ir.cc.o
print_graph_ir.cc:(.text+0x75b): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `vtable for std::basic_ios<char, std::char_traits<char> >@@GLIBCXX_3.4' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/7/libstdc++.so
print_graph_ir.cc:(.text+0x7a0): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `VTT for std::__cxx11::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> >@@GLIBCXX_3.4.21' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/7/libstdc++.so
print_graph_ir.cc:(.text+0x7c5): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `vtable for std::__cxx11::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> >@@GLIBCXX_3.4.21' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/7/libstdc++.so
print_graph_ir.cc:(.text+0x7e3): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `vtable for std::basic_streambuf<char, std::char_traits<char> >@@GLIBCXX_3.4' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/7/libstdc++.so
print_graph_ir.cc:(.text+0x816): additional relocation overflows omitted from the output

So I reverted to using the MXNet wheel package and commented out the Horovod-related code. Could you approve this if you think it's appropriate?
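For scripts that need to run both with and without Horovod in the image, one option is to detect the package at runtime rather than assume it is installed. The following is only a minimal sketch of that idea, not code from this PR: `has_module` and `pick_backend` are hypothetical helper names, and the only thing checked is whether a top-level package called `horovod` is importable.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if the top-level module `name` can be found in this environment."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for broken or partially initialised packages.
        return False


def pick_backend() -> str:
    """Hypothetical helper: choose a training mode based on what is installed."""
    # Assumption: a missing `horovod` package means we fall back to one process.
    return "horovod" if has_module("horovod") else "single-process"


if __name__ == "__main__":
    print("mxnet available:", has_module("mxnet"))
    print("training backend:", pick_backend())
```

Running this inside the container reports whether the MXNet wheel and Horovod are visible to Python, without importing either package.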

sxjscience commented 4 years ago

Horovod should now be included in the Dockerfile.