microsoft / Deep3DFaceReconstruction

Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set (CVPRW 2019)
MIT License

Some problems about running train.py #151

Open seven-sent opened 3 years ago

seven-sent commented 3 years ago

Hello! Thank you for sharing your great project!

I tried to run the training code on Windows, but I ran into the following error:

TypeError: Input 'filename' of 'ReadFile' Op has type float32 that does not match expected type of string.

Can I run train.py on Windows?

I also tried to run the code on another computer running Linux. However, that computer doesn't seem to support CUDA 9.0, so I cannot run the code with tensorflow 1.12.

Are there any other higher versions of tensorflow available? Or could you give me some advice?

YuDeng commented 3 years ago

Hi, our training code is not supported on Windows. We have tested our code under tensorflow 1.13 and 1.14. You can select one of these versions according to your own environment.
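For readers choosing between these versions, here is a minimal sketch of a lookup that maps a tensorflow version to the CUDA version its official tensorflow-gpu pip wheel was built against. The table covers only the versions mentioned in this thread, and the helper names (`PIP_WHEEL_CUDA`, `cuda_for_tensorflow`) are made up for illustration; they are not part of this repository.

```python
from typing import Optional

# CUDA versions the official tensorflow-gpu pip wheels were built against.
# Only the versions mentioned in this thread are listed (assumption: you
# install from pip rather than building tensorflow from source).
PIP_WHEEL_CUDA = {
    "1.12": "9.0",
    "1.13": "10.0",
    "1.14": "10.0",
}


def minor_version(version: str) -> str:
    """Reduce a full version such as '1.13.1' to 'major.minor' ('1.13')."""
    return ".".join(version.split(".")[:2])


def cuda_for_tensorflow(version: str) -> Optional[str]:
    """Return the CUDA version for a tensorflow release, or None if unlisted."""
    return PIP_WHEEL_CUDA.get(minor_version(version))


if __name__ == "__main__":
    for tf_version in ("1.12.0", "1.13.1", "1.14.0"):
        print(tf_version, "->", cuda_for_tensorflow(tf_version))
```

So a machine that cannot use CUDA 9.0 but has CUDA 10.0 available would pair with 1.13 or 1.14 under these assumptions.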

seven-sent commented 3 years ago

Thank you for your reply! I'll try again. But I still have some questions.

If I install tensorflow 1.13 or 1.14 using pip, does that mean I need to recompile the binary file (rasterize_triangles_kernel.so)? Is the procedure the same as when using conda?

git clone https://github.com/google/tf_mesh_renderer.git
cd tf_mesh_renderer
git checkout ba27ea1798
git checkout master WORKSPACE
bazel test ...

YuDeng commented 3 years ago

Yes, you have to recompile the binary file. It should follow the same way as shown in the readme.

seven-sent commented 3 years ago

Thanks for your reply. I created an environment with CUDA 10.0 and cuDNN 7.6.5, and installed tensorflow 1.13.1. But some problems came up when I followed the readme to compile the binary file rasterize_triangles_kernel.so.

When I ran bazel test ... (bazel version 1.2.1), the first test, //mesh_renderer/kernels:rasterize_triangles_impl_test, passed and the others failed. I found the compiled file bazel-bin/mesh_renderer/kernels/rasterize_triangles_kernel.so and copied it to renderer/. But when I ran demo.py, an error came up.

tensorflow.python.framework.errors_impl.NotFoundError: ./renderer/rasterize_triangles_kernel.so: undefined symbol: _ZN10tensorflow12OpDefBuilder4AttrENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE

Following the readme, I changed -D_GLIBCXX_USE_CXX11_ABI=0 to -D_GLIBCXX_USE_CXX11_ABI=1. How can I solve this?

YuDeng commented 3 years ago

Well, can you try resetting it to -D_GLIBCXX_USE_CXX11_ABI=0 and rebuilding the Ops?
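Some background on why the flag matters, with a minimal sketch. With -D_GLIBCXX_USE_CXX11_ABI=1, libstdc++ puts std::string in the inline namespace std::__cxx11, which shows up in mangled symbol names as `St7__cxx11`. If the op library asks for such a symbol and the installed tensorflow does not export it, the two were most likely built with different ABI settings. The helper names below are hypothetical. The flag list that `abi_from_compile_flags` parses can be obtained from `tf.sysconfig.get_compile_flags()`, which is not called here so that the sketch runs without tensorflow installed.

```python
from typing import Iterable, Optional

ABI_FLAG_PREFIX = "-D_GLIBCXX_USE_CXX11_ABI="


def symbol_uses_cxx11_abi(mangled_symbol: str) -> bool:
    """True if an Itanium-mangled name refers to a std::__cxx11 type."""
    # "St7__cxx11" is the mangled form of the namespace std::__cxx11.
    return "St7__cxx11" in mangled_symbol


def abi_from_compile_flags(flags: Iterable[str]) -> Optional[int]:
    """Extract the _GLIBCXX_USE_CXX11_ABI value (0 or 1) from compiler flags.

    Returns None if the flag is absent from the list.
    """
    for flag in flags:
        if flag.startswith(ABI_FLAG_PREFIX):
            return int(flag[len(ABI_FLAG_PREFIX):])
    return None


if __name__ == "__main__":
    # The undefined symbol reported in the error message above.
    missing = (
        "_ZN10tensorflow12OpDefBuilder4AttrENSt7__cxx11"
        "12basic_stringIcSt11char_traitsIcESaIcEEE"
    )
    # Example flag list (illustrative values, not copied from a real install).
    flags = ["-I/path/to/tensorflow/include", "-D_GLIBCXX_USE_CXX11_ABI=0"]

    if symbol_uses_cxx11_abi(missing) and abi_from_compile_flags(flags) == 0:
        print("Op library was built with ABI=1 but tensorflow reports ABI=0.")
```

In an actual environment, replacing the illustrative `flags` list with the output of `tf.sysconfig.get_compile_flags()` shows which value the op should be compiled with.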

seven-sent commented 3 years ago

Thank you very much for your reply, and sorry for bothering you again.

I reset the flag to -D_GLIBCXX_USE_CXX11_ABI=0 and rebuilt the file. That error was solved, but another error occurred.

2021-04-20 12:50:46.278339: E tensorflow/stream_executor/cuda/cuda_blas.cc:698] failed to run cuBLAS routine cublasSgemm_v2: CUBLAS_STATUS_EXECUTION_FAILED

Maybe I should try another version of tensorflow.