kentonl / e2e-coref

End-to-end Neural Coreference Resolution
Apache License 2.0

How to allocate GPU with E2E method #86

Open ghost opened 4 years ago

ghost commented 4 years ago

Hi, I'm trying to run the E2E method on a GPU, but I noticed that the code requires TensorFlow 1.0.0. How can I run it on a GPU? I have already set the environment variable GPU=0, but there seems to be no tensorflow-gpu installation for it to use.

Damcy commented 4 years ago

@Byron309 I can run the code with tensorflow-gpu 1.14. Before running the training script, I run export GPU=0 and export CUDA_VISIBLE_DEVICES=0.
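A minimal sketch of that setup in a shell session. The two variable names and values come from the comment above; the training invocation is left as a comment because its exact form is an assumption here.

```shell
# Select GPU 0 via the variable mentioned in this thread and via the
# standard CUDA variable that limits which devices a process can see.
export GPU=0
export CUDA_VISIBLE_DEVICES=0

# Confirm what the training process will inherit from this shell.
echo "GPU=${GPU} CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES}"

# Then start training from the same shell, for example (assumed invocation):
#   python train.py <experiment>
```

Both exports only affect processes started from the same shell afterwards, so they need to be run before the training script, not in a separate terminal.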

ghost commented 4 years ago

Hi @Damcy, thank you for your reply.

I tried tensorflow-gpu==1.14 and ran setup_all.sh, but I get the error "tensorflow.python.framework.errors_impl.NotFoundError: ./coref_kernels.so: undefined symbol: _ZTIN10tensorflow8OpKernelE". Have you run into this problem?

I'm trying the E2E method and not the high-order method.
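For reference, the mangled name in that error can be decoded with c++filt (part of GNU binutils, assumed to be installed). It is the typeinfo for tensorflow::OpKernel, which suggests, though does not prove, that the compiled .so cannot resolve a symbol from the TensorFlow framework library, for instance because of a linker-flag or C++ ABI mismatch.

```shell
# Demangle the symbol reported in the NotFoundError.
demangled="$(echo _ZTIN10tensorflow8OpKernelE | c++filt)"
echo "$demangled"   # prints: typeinfo for tensorflow::OpKernel
```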

Damcy commented 4 years ago

The command in setup_all.sh may be out of date. You can change

g++ -std=c++11 -shared coref_kernels.cc -o coref_kernels.so -fPIC ${TF_CFLAGS[@]} ${TF_LFLAGS[@]} -O2 -D_GLIBCXX_USE_CXX11_ABI=0

to

g++ -std=c++11 -shared coref_kernels.cc -o coref_kernels.so -fPIC ${TF_CFLAGS[@]} ${TF_LFLAGS[@]} -O2

if your gcc version is higher than 4.8.3 (I guess). It should then generate a correct .so file.

ghost commented 4 years ago

Hi @Damcy, thanks for your help. But I couldn't find the command g++ -std=c++11 -shared coref_kernels.cc -o coref_kernels.so -fPIC ${TF_CFLAGS[@]} ${TF_LFLAGS[@]} -O2 -D_GLIBCXX_USE_CXX11_ABI=0 in setup_all.sh.

I also tried the command g++ -std=c++11 -shared coref_kernels.cc -o coref_kernels.so -fPIC ${TF_CFLAGS[@]} ${TF_LFLAGS[@]} -O2, but got another error:

coref_kernels.cc:4:10: fatal error: tensorflow/core/framework/op.h: No such file or directory
 #include "tensorflow/core/framework/op.h"
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.

Have you met this error before?

Damcy commented 4 years ago

g++ -std=c++11 -shared coref_kernels.cc -o coref_kernels.so -fPIC ${TF_CFLAGS[@]} ${TF_LFLAGS[@]} -O2 -D_GLIBCXX_USE_CXX11_ABI=0 is line 13 of setup_all.sh. I haven't seen this error before. I think you can adapt setup_all.sh from the high-order repo.
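One possible cause of the missing-header error, assuming the g++ command was typed directly into a fresh shell: TF_CFLAGS and TF_LFLAGS are shell variables that have to be filled in first, and if they are empty g++ receives no TensorFlow include path. The sketch below fills them from the installed TensorFlow using tf.sysconfig.get_compile_flags() and tf.sysconfig.get_link_flags(), following the pattern in TensorFlow's custom-op guide. The function name build_coref_kernels is hypothetical and not part of the repo, and plain strings are used instead of the bash arrays seen in this thread so that the sketch also runs under plain sh.

```shell
# Sketch: build coref_kernels.so with flags taken from the installed TensorFlow.
# build_coref_kernels is a hypothetical helper name, not part of the repo.
build_coref_kernels() {
  # Ask TensorFlow for its include paths / defines and its linker flags.
  TF_CFLAGS="$(python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_compile_flags()))')" || return 1
  TF_LFLAGS="$(python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_link_flags()))')" || return 1

  # Without include flags, g++ cannot find tensorflow/core/framework/op.h.
  if [ -z "$TF_CFLAGS" ]; then
    echo "TF_CFLAGS is empty; g++ would not find the TensorFlow headers" >&2
    return 1
  fi

  # The flag variables are left unquoted on purpose so that each flag
  # becomes its own argument to g++.
  g++ -std=c++11 -shared coref_kernels.cc -o coref_kernels.so -fPIC $TF_CFLAGS $TF_LFLAGS -O2
}
```

Call build_coref_kernels from the directory that contains coref_kernels.cc, in the same Python environment where tensorflow-gpu is installed. With TensorFlow 1.14 the compile flags should already carry the matching -D_GLIBCXX_USE_CXX11_ABI value, which would be consistent with dropping the hard-coded flag as suggested earlier in the thread.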

ghost commented 4 years ago

It works! Thank you!

Somehow the setup_all.sh files we were talking about are different. I tried the file you shared and it works.