Ice-Panda commented 5 years ago

Preface

Good news: I have worked out how to run demo.py on Windows with a GPU. Let's take a look at the results. First, thanks to the author @eragonruan, #43 zhao181, #73 ZhuMingmin9123 and amberblade/gpu_nms_windows.

my hardware is: NVIDIA GTX970M

my environment is: Windows 10, Python 3.6, tensorflow-gpu 1.10, CUDA 9.0, cuDNN for CUDA 9.0, vs2017

This combination is proven to work. You can also download and install the CUDA and cuDNN versions that match your own hardware. Because not every tensorflow version is compatible with every CUDA version, check which versions work together before you download and install. The installation process itself is not covered here. We need to compile with the following tools:

Let's do it

Some steps are the same as #73

step 1:make some change

cython_nms.pyx

Change "np.int_t" to "np.intp_t" in line 25 of the file lib\utils\cython_nms.pyx. Change only the one on this line; the other "np.int_t" occurrences do not need to be changed. Otherwise " ValueError: Buffer dtype mismatch, expected 'int_t' but got 'long long' " appears in step 6.
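Some background on why this one-word change matters, as a hedged sketch: assuming NumPy 1.x, Cython's "np.int_t" corresponds to C long and "np.intp_t" to a pointer-sized integer. On 64-bit Windows a C long is only 4 bytes, while array indices are 8 bytes, which would explain the 'long long' in the error message. The helper name below is only for illustration; it uses ctypes to print the sizes on your own machine.

```python
import ctypes


def c_integer_sizes():
    """Return byte sizes of the C types assumed to back np.int_t and np.intp_t.

    Assumption: NumPy 1.x, where np.int_t is C `long` and np.intp_t is a
    pointer-sized signed integer (same width as ssize_t).
    """
    return {
        "long": ctypes.sizeof(ctypes.c_long),
        "ssize_t": ctypes.sizeof(ctypes.c_ssize_t),
        "pointer": ctypes.sizeof(ctypes.c_void_p),
    }


if __name__ == "__main__":
    for name, size in c_integer_sizes().items():
        print(f"{name}: {size} bytes")
```

If "long" is smaller than "ssize_t" on your machine, a buffer declared as np.int_t cannot accept index arrays, and np.intp_t is the matching type.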

gpu_nms.cpp

Change the generated file "gpu_nms.cpp" in line 2150:

_nms((&(*__Pyx_BufPtrStrided1d(__pyx_t_5numpy_int32_t *, __pyx_pybuffernd_keep.rcbuffer->pybuffer.buf, __pyx_t_10, __pyx_pybuffernd_keep.diminfo[0].strides))), (&__pyx_v_num_out), (&(*__Pyx_BufPtrStrided2d(__pyx_t_5numpy_float32_t *, __pyx_pybuffernd_sorted_dets.rcbuffer->pybuffer.buf, __pyx_t_12, __pyx_pybuffernd_sorted_dets.diminfo[0].strides, __pyx_t_13, __pyx_pybuffernd_sorted_dets.diminfo[1].strides))), __pyx_v_boxes_num, __pyx_v_boxes_dim, __pyx_t_14, __pyx_v_device_id);

to:

_nms((&(*__Pyx_BufPtrStrided1d(int *, __pyx_pybuffernd_keep.rcbuffer->pybuffer.buf, __pyx_t_10, __pyx_pybuffernd_keep.diminfo[0].strides))), (&__pyx_v_num_out), (&(*__Pyx_BufPtrStrided2d(__pyx_t_5numpy_float32_t *, __pyx_pybuffernd_sorted_dets.rcbuffer->pybuffer.buf, __pyx_t_12, __pyx_pybuffernd_sorted_dets.diminfo[0].strides, __pyx_t_13, __pyx_pybuffernd_sorted_dets.diminfo[1].strides))), __pyx_v_boxes_num, __pyx_v_boxes_dim, __pyx_t_14, __pyx_v_device_id);

Otherwise the following error occurs in step 4:

gpu_nms.cpp(2147): error C2664: 'void _nms(int *,int *,const float *,int,int,float,int)': cannot convert argument 1 from '__pyx_t_5numpy_int32_t *' to 'int *'
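The line number in the generated file can differ between Cython versions, so editing by hand may be error-prone. The sketch below applies the same cast change by text matching instead of by line number. It is only an illustration: the function names are hypothetical, and it assumes the generated call starts with "_nms(" and contains exactly the cast shown above.

```python
OLD_CAST = "__Pyx_BufPtrStrided1d(__pyx_t_5numpy_int32_t *, __pyx_pybuffernd_keep"
NEW_CAST = "__Pyx_BufPtrStrided1d(int *, __pyx_pybuffernd_keep"


def patch_nms_call(source):
    """Replace the first-argument cast on lines that call _nms(...).

    Returns (patched_source, number_of_patched_lines).
    """
    patched_lines = []
    count = 0
    for line in source.splitlines(keepends=True):
        if line.lstrip().startswith("_nms(") and OLD_CAST in line:
            line = line.replace(OLD_CAST, NEW_CAST)
            count += 1
        patched_lines.append(line)
    return "".join(patched_lines), count


def patch_file(path="gpu_nms.cpp"):
    """Patch the generated file in place; returns how many lines changed."""
    with open(path, encoding="utf-8") as fh:
        patched, count = patch_nms_call(fh.read())
    if count:
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(patched)
    return count
```

Running patch_file() a second time changes nothing, so it is safe to repeat after regenerating the file.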

step 2:update c file

execute:cd your_dir\text-detection-ctpn-master\lib\utils
execute:cython bbox.pyx
execute:cython cython_nms.pyx
execute:cython gpu_nms.pyx

step 3:create a new file named setup_new.py

import numpy as np
from distutils.core import setup
from Cython.Build import cythonize

# Build both Cython modules in one setup() call.
# The NumPy include directory is passed so the NumPy headers can be found.
setup(ext_modules=cythonize(["bbox.pyx", "cython_nms.pyx"]),
      include_dirs=[np.get_include()])

step 4:build .pyd file

setup_new.py

The Python version used to compile must match the Python version used at runtime.
execute:python setup_new.py install
copy "bbox.cp36-win_amd64.pyd" and "cython_nms.cp36-win_amd64.pyd" to "your_dir\text-detection-ctpn-master\lib\utils"
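To check that a built file matches the interpreter you will run the demo with, a small sketch like the following may help. It relies on the standard-library list importlib.machinery.EXTENSION_SUFFIXES; the function name is hypothetical.

```python
from importlib.machinery import EXTENSION_SUFFIXES


def matches_interpreter(filename, module_name):
    """True if `filename` is a name this interpreter would import as `module_name`."""
    return filename in {module_name + suffix for suffix in EXTENSION_SUFFIXES}


if __name__ == "__main__":
    print("Suffixes accepted by this interpreter:", EXTENSION_SUFFIXES)
    for module in ("bbox", "cython_nms"):
        built = module + ".cp36-win_amd64.pyd"
        print(built, "->", matches_interpreter(built, module))
```

If this prints False for the copied files, they were most likely compiled with a different Python version or on a different platform.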

setup.py

Before proceeding with this step, be sure to add the "bin" folder that contains the compiler tools such as "cl.exe" to the system PATH environment variable.
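A quick way to confirm the compiler tools are visible before building is to look them up on PATH. This is a minimal sketch using the standard-library shutil.which; the function name and the default tool list are assumptions for illustration.

```python
import shutil


def locate_tools(names=("cl.exe", "nvcc")):
    """Map each tool name to the full path found on PATH, or None."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in locate_tools().items():
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
```

Run it in the same command prompt you will build from; if "cl.exe" is reported as not found there, the build will not find it either.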

rewrite setup.py

Rewrite the file "setup.py" as follows. The only things you need to modify are the directories in cuda_libs and include_dirs.

#!/usr/bin/env python

import numpy as np
import os
# on Windows, we need the original PATH without Anaconda's compiler in it:
PATH = os.environ.get('PATH')
print(PATH)
from distutils.spawn import spawn, find_executable
from setuptools import setup, find_packages, Extension
from setuptools.command.build_ext import build_ext
import sys

# CUDA specific config
# nvcc is assumed to be in user's PATH
nvcc_compile_args = ['-O', '--ptxas-options=-v', '-arch=sm_35', '-c', '--compiler-options', '-fPIC', '--shared']
nvcc_compile_args = os.environ.get('NVCCFLAGS', '').split() + nvcc_compile_args
cuda_libs = [r'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\lib\x64\cublas', r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\lib\x64\cudart"]

# Obtain the numpy include directory.  This logic works across numpy versions.
try:
    numpy_include = np.get_include()
except AttributeError:
    numpy_include = np.get_numpy_include()

cudamat_ext = Extension('gpu_nms',
                        sources=['gpu_nms.pyx', 'nms_kernel.cu'],
                        language='c++',
                        libraries=cuda_libs,
                        runtime_library_dirs=[],
                        extra_compile_args=nvcc_compile_args,
                        include_dirs = [numpy_include, r'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\include'])

class CUDA_build_ext(build_ext):
    """
    Custom build_ext command that compiles CUDA files.
    Note that all extension source files will be processed with this compiler.
    """
    def build_extensions(self):
        self.compiler.src_extensions.append('.cu')
        self.compiler.set_executable('compiler_so', 'nvcc')
        self.compiler.set_executable('linker_so', 'nvcc --shared')
        if hasattr(self.compiler, '_c_extensions'):
            self.compiler._c_extensions.append('.cu')  # needed for Windows
        self.compiler.spawn = self.spawn
        build_ext.build_extensions(self)

    def spawn(self, cmd, search_path=1, verbose=0, dry_run=0):
        """
        Perform any CUDA specific customizations before actually launching
        compile/link etc. commands.
        """
        if (sys.platform == 'darwin' and len(cmd) >= 2 and cmd[0] == 'nvcc' and
                cmd[1] == '--shared' and cmd.count('-arch') > 0):
            # Versions of distutils on OSX earlier than 2.7.9 inject
            # '-arch x86_64' which we need to strip while using nvcc for
            # linking
            while True:
                try:
                    index = cmd.index('-arch')
                    del cmd[index:index+2]
                except ValueError:
                    break
        elif self.compiler.compiler_type == 'msvc':
            # There are several things we need to do to change the commands
            # issued by MSVCCompiler into one that works with nvcc. In the end,
            # it might have been easier to write our own CCompiler class for
            # nvcc, as we're only interested in creating a shared library to
            # load with ctypes, not in creating an importable Python extension.
            # - First, we replace the cl.exe or link.exe call with an nvcc
            #   call. In case we're running Anaconda, we search cl.exe in the
            #   original search path we captured further above -- Anaconda
            #   inserts a MSVC version into PATH that is too old for nvcc.

            is_cu_file = False
            for t in cmd:
                if t[-3:] == '.cu':
                    is_cu_file = True
                    break
            if is_cu_file:
                cmd[:1] = ['nvcc', '--compiler-bindir',
                           os.path.dirname(find_executable("cl.exe", PATH))
                           or cmd[0]]
                # - Secondly, we fix a bunch of command line arguments.
                for idx, c in enumerate(cmd):
                    # create .dll instead of .pyd files
                    #if '.pyd' in c: cmd[idx] = c = c.replace('.pyd', '.dll')  #20160601, by MrX
                    # replace /c by -c
                    if c == '/c': cmd[idx] = '-c'
                    # replace /DLL by --shared
                    elif c == '/DLL': cmd[idx] = '--shared'
                    # remove --compiler-options=-fPIC
                    elif '-fPIC' in c: del cmd[idx]
                    # replace /Tc... by ...
                    elif c.startswith('/Tc'): cmd[idx] = c[3:]
                    # replace /Fo... by -o ...
                    elif c.startswith('/Fo'): cmd[idx:idx+1] = ['-o', c[3:]]
                    # replace /LIBPATH:... by -L...
                    elif c.startswith('/LIBPATH:'): cmd[idx] = '-L' + c[9:]
                    # replace /OUT:... by -o ...
                    elif c.startswith('/OUT:'): cmd[idx:idx+1] = ['-o', c[5:]]
                    # remove /EXPORT:initlibcudamat or /EXPORT:initlibcudalearn
                    elif c.startswith('/EXPORT:'): del cmd[idx]
                    # replace cublas.lib by -lcublas
                    elif c == 'cublas.lib': cmd[idx] = '-lcublas -lcudart.lib'
                # - Finally, we pass on all arguments starting with a '/' to the
                #   compiler or linker, and have nvcc handle all other arguments
                if '--shared' in cmd:
                    pass_on = '--linker-options='
                    # we only need MSVCRT for a .dll, remove CMT if it sneaks in:
                    cmd.append('/NODEFAULTLIB:libcmt.lib')
                else:
                    pass_on = '--compiler-options='

                cmd.append('/NODEFAULTLIB:libcmt.lib')
                cmd = ([c for c in cmd if c[0] != '/'] +
                       [pass_on + ','.join(c for c in cmd if c[0] == '/')])
                # For the future: Apart from the wrongly set PATH by Anaconda, it
                # would suffice to run the following for compilation on Windows:
                # nvcc -c -O -o <file>.obj <file>.cu
                # And the following for linking:
                # nvcc --shared -o <file>.dll <file1>.obj <file2>.obj -lcublas
                # This could be done by a NVCCCompiler class for all platforms.
            else:
                cmd.append('/MT')
                pass
        spawn(cmd, search_path, verbose, dry_run)

setup(name="gpu_nms",
      description="Performs NMS computation on the GPU via CUDA",
      ext_modules=[cudamat_ext],
      cmdclass={'build_ext': CUDA_build_ext},
)
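Instead of hard-coding the CUDA directories, they could be derived from one root folder. The sketch below assumes the CUDA installer has set the CUDA_PATH environment variable on Windows, and falls back to the v9.0 location used in the setup.py above; the function name is hypothetical.

```python
import ntpath
import os


def cuda_paths(cuda_home):
    """Build the include and 64-bit library directories under a CUDA root."""
    return {
        "include": ntpath.join(cuda_home, "include"),
        "lib64": ntpath.join(cuda_home, "lib", "x64"),
    }


if __name__ == "__main__":
    # Assumption: CUDA_PATH is set by the CUDA installer; otherwise use the
    # default v9.0 install location.
    default_root = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0"
    paths = cuda_paths(os.environ.get("CUDA_PATH", default_root))
    print(paths["include"])
    print(ntpath.join(paths["lib64"], "cudart"))
```

The printed values are what you would put into include_dirs and cuda_libs for your own CUDA version.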

build gpu_nms.cp36-win_amd64.pyd

Open the "VS2015 x64 Native Tools Command Prompt" and activate your virtual environment.
execute:cd your_dir\text-detection-ctpn-master\lib\utils
execute:python setup.py build_ext --inplace
If you see the picture as follows, that means you've succeeded. Enjoy it~

step 5:make some change

change "base_name = image_name.split('/')[-1]" to "base_name = image_name.split('\\')[-1]" in line 24 of the file "ctpn\demo.py"
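As an alternative to hard-coding the backslash, the file name can be extracted in a way that accepts both separators. This is a sketch, not the repository's code: it uses the standard-library ntpath.basename, and the function name is only for illustration.

```python
import ntpath


def base_name(image_name):
    """Return the file name, accepting both '/' and '\\' as separators."""
    return ntpath.basename(image_name)


if __name__ == "__main__":
    print(base_name("data\\demo\\006.jpg"))
    print(base_name("data/demo/006.jpg"))
```

With this, the same line works whether the image list was built with Windows-style or forward-slash paths.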

step 6:run demo

execute:cd your_dir\text-detection-ctpn-master
execute:python ./ctpn/demo.py

krsad commented 5 years ago

First of all, thank you for writing such a helpful installation guide. I ran into an error and would appreciate some help: step 4 ended with an error for me. Here is the screenshot: screenshot_3

I have set the path like this: Please help me with it.

Ice-Panda commented 5 years ago

@krsad Hi, I think you can change the "cl.exe" directory path like this. Also, please make sure the "cl.exe" directory has priority in PATH~

opentld commented 5 years ago

@krsad Do not run Python under Anaconda; just run it from the Visual Studio Command Prompt.

a429367172 commented 5 years ago

Thanks for the advice! I tried many things and eventually got stuck at step 4. When I open the "VS2015 x64 Native Tools Command Prompt", this problem arises: Please help me out! Thanks a lot!

ghost commented 5 years ago

@krsad Hi, have you figured it out? I have that problem too. I have tried the approaches that @Ice-Panda and @opentld mentioned, but the problem is still there.

qwarmq commented 5 years ago

Hello, is VS2015 required? Would VS2013 work?

Ice-Panda commented 5 years ago

@qwarmq Yes, you have to use vs2015.

qwarmq commented 5 years ago

QQ图片20190528110951 Hello, I'd like to ask: how should I deal with this problem?

qwarmq commented 5 years ago

Hello, I'd like to ask: how should I deal with this problem?

First of all, thank you for writing such a helpful installation guide. I ran into an error and need your help urgently! Thank you~

qwarmq commented 5 years ago

QQ图片20190528141510 I get this error and need help! @Ice-Panda

ZenCodeXY commented 5 years ago

Hello, is VS2015 required? Would VS2017 work? TIM图片20191105164109

Ice-Panda commented 5 years ago

@ZenCodeXY I installed VS2019 and also selected the 2015 build environment; compiling with the 2015 toolset actually succeeds.

Ice-Panda commented 5 years ago

@qwarmq Please make sure that CUDA is installed correctly.

Ice-Panda commented 5 years ago

@qwarmq The environment variables are not configured correctly. Check the PATH entry for cl.exe; I was stuck at this step before too.

ZenCodeXY commented 5 years ago

@Ice-Panda Thank you very much, your method solved that problem, but the build now ends with this error: TIM图片20191106105305 I tried copying the two files rc.exe and rcdll.dll from Windows Kits/8.1/x64 into the Microsoft Visual Studio 14.0\VC\bin directory, but that did not solve it. Is there any other solution? Thanks again!

eragonruan / text-detection-ctpn

Run demo.py with GPU on the Platform of Windows #264
