nagadomi / waifu2x

Image Super-Resolution for Anime-Style Art
http://waifu2x.udp.jp/
MIT License

cc1plus not found, luarocks failed to install, and can't find CUDA #372

Open dark-swordsman opened 3 years ago

dark-swordsman commented 3 years ago

Hello,

I know it's probably a terrible idea, but I am trying to get Waifu2X working on WSL 2 with Ubuntu 20.04. I once tried to install CUDA on WSL, but it kept failing and I gave up quickly.

This time, I tried installing CUDA 10 to Windows 10 and am currently in the process of running ./install_lua_modules.sh. When I run it, it shows the following:

CMake Error at /usr/share/cmake-3.16/Modules/CMakeTestCXXCompiler.cmake:53 (message):
  The C++ compiler

    "/usr/bin/c++"

  is not able to compile a simple test program.

  It fails with the following output:

    Change Dir: /tmp/luarocks_cutorch-scm-1-3303/cutorch/build/CMakeFiles/CMakeTmp

    Run Build Command(s):/usr/bin/make cmTC_19317/fast && /usr/bin/make -f CMakeFiles/cmTC_19317.dir/build.make CMakeFiles/cmTC_19317.dir/build
    make[1]: Entering directory '/tmp/luarocks_cutorch-scm-1-3303/cutorch/build/CMakeFiles/CMakeTmp'
    Building CXX object CMakeFiles/cmTC_19317.dir/testCXXCompiler.cxx.o
    /usr/bin/c++     -o CMakeFiles/cmTC_19317.dir/testCXXCompiler.cxx.o -c /tmp/luarocks_cutorch-scm-1-3303/cutorch/build/CMakeFiles/CMakeTmp/testCXXCompiler.cxx
    c++: error trying to exec 'cc1plus': execvp: No such file or directory
    make[1]: *** [CMakeFiles/cmTC_19317.dir/build.make:66: CMakeFiles/cmTC_19317.dir/testCXXCompiler.cxx.o] Error 1
    make[1]: Leaving directory '/tmp/luarocks_cutorch-scm-1-3303/cutorch/build/CMakeFiles/CMakeTmp'
    make: *** [Makefile:121: cmTC_19317/fast] Error 2

  CMake will not be able to correctly generate this project.

-- Configuring incomplete, errors occurred!
See also "/tmp/luarocks_cutorch-scm-1-3303/cutorch/build/CMakeFiles/CMakeOutput.log".
See also "/tmp/luarocks_cutorch-scm-1-3303/cutorch/build/CMakeFiles/CMakeError.log".

Error: Failed installing dependency: https://raw.githubusercontent.com/torch/rocks/master/cutorch-scm-1.rockspec - Build error: Failed building.
Installing https://raw.githubusercontent.com/rocks-moonscript-org/moonrocks-mirror/master/turbo-2.1-2.rockspec...
Using https://raw.githubusercontent.com/rocks-moonscript-org/moonrocks-mirror/master/turbo-2.1-2.rockspec... switching to 'build' mode
Cloning into 'turbo'...

So the main two things:

Running th waifu2x.lua causes the following:

/home/dark/torch/install/bin/luajit: /home/dark/torch/install/share/lua/5.1/trepl/init.lua:389: lib/w2nn.lua:52: Failed to load CUDA modules. Please check the CUDA Settings.
---
/home/dark/torch/install/share/lua/5.1/trepl/init.lua:389: module 'cutorch' not found:No LuaRocks module found for cutorch
        no field package.preload['cutorch']
        no file 'lib/cutorch.lua'
        no file '/home/dark/.luarocks/share/lua/5.1/cutorch.lua'
        no file '/home/dark/.luarocks/share/lua/5.1/cutorch/init.lua'
        no file '/home/dark/torch/install/share/lua/5.1/cutorch.lua'
        no file '/home/dark/torch/install/share/lua/5.1/cutorch/init.lua'
        no file './cutorch.lua'
        no file '/home/dark/torch/install/share/luajit-2.1.0-beta1/cutorch.lua'
        no file '/usr/local/share/lua/5.1/cutorch.lua'
        no file '/usr/local/share/lua/5.1/cutorch/init.lua'
        no file '/home/dark/.luarocks/lib/lua/5.1/cutorch.so'
        no file '/home/dark/torch/install/lib/lua/5.1/cutorch.so'
        no file '/home/dark/torch/install/lib/cutorch.so'
        no file './cutorch.so'
        no file '/usr/local/lib/lua/5.1/cutorch.so'
        no file '/usr/local/lib/lua/5.1/loadall.so'
stack traceback:
        [C]: in function 'error'
        /home/dark/torch/install/share/lua/5.1/trepl/init.lua:389: in function 'require'
        waifu2x.lua:5: in main chunk
        [C]: in function 'dofile'
        ...dark/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
        [C]: at 0x559933a23f50

Here it seems that it can't find CUDA even though I installed 10.0 via this since the original install through sudo apt install cuda was downloading 11.2.

I figure some of this may be due to it being WSL 2, but I was able to solve the majority of the other issues I had. Just got stuck on this, and running the following to solve cc1plus doesn't work.

sudo apt-get update
sudo apt-get install --reinstall build-essential  # works, but doesn't fix the issue

or

sudo apt-get install --reinstall g++-4.6  # results in "Unable to locate package g++-4.6"
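
One way to see whether the compiler driver can locate cc1plus at all is to ask it directly. The following is a diagnostic sketch, not a confirmed fix for this issue: `has_cc1plus` is a hypothetical helper name, and it relies on GCC's `-print-prog-name` option, which prints a full path when the subprogram is found and only the bare program name otherwise.

```shell
# Hypothetical helper: succeed only if the given compiler driver can
# locate an executable cc1plus (the C++ front end that c++ tried to exec).
has_cc1plus() {
    cxx="$1"
    # Prints an absolute path when cc1plus is found, the bare name otherwise.
    prog="$("$cxx" -print-prog-name=cc1plus 2>/dev/null)" || return 1
    case "$prog" in
        /*) [ -x "$prog" ] ;;
        *)  return 1 ;;
    esac
}

# Check the same driver that CMake used in the log above.
if has_cc1plus "${CXX:-/usr/bin/c++}"; then
    echo "cc1plus found"
else
    echo "cc1plus missing for ${CXX:-/usr/bin/c++}"
fi
```

If this reports cc1plus as missing, the driver and the installed C++ front end probably belong to different GCC versions, which on Ubuntu would typically mean that the g++ package matching the active gcc version is not installed.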

I was able to switch gcc to version 7.5.0 because of an error about it being version 8 or higher (it was 9). Since there are both g++ and gcc, I set both of them to:

g++ (Ubuntu 7.5.0-6ubuntu2) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
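
The version constraint described above can be checked before starting a build. This is a minimal sketch: `gcc_major_ok` is a hypothetical helper, and the limit of 7 is an assumption taken from the error reported in this thread (gcc 8 or higher was rejected with CUDA 10.0); other CUDA releases may accept a different maximum.

```shell
# Hypothetical helper: succeed if a gcc version string such as "7.5.0"
# or "7" has a major version no greater than the given maximum.
gcc_major_ok() {
    version="$1"
    max_major="$2"
    major="${version%%.*}"   # strip everything after the first dot
    [ "$major" -le "$max_major" ]
}

# Report on the gcc that is currently first in PATH, if there is one.
if command -v gcc >/dev/null 2>&1; then
    v="$(gcc -dumpversion)"
    if gcc_major_ok "$v" 7; then
        echo "gcc $v: within the assumed limit"
    else
        echo "gcc $v: newer than the assumed limit of 7"
    fi
fi
```

Running the same check against `g++ -dumpversion` would confirm that both drivers were switched together.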

This is running on Ubuntu 20.04 on WSL 2.

Let me know if I need to provide any other information.

nagadomi commented 3 years ago

You need to install torch7 first: https://github.com/nagadomi/distro

There are reports that waifu2x worked on WSL2, but I haven't confirmed it: https://github.com/nagadomi/waifu2x/issues/354

dark-swordsman commented 3 years ago

I already have torch7 installed. [image]

I installed it through this method because the method listed on the readme was not working.

I guess the difference is that I just did ./install.sh instead of ./clean.sh and ./update.sh.

I will try the recommendations in that issue you listed and report back.

nagadomi commented 3 years ago

The above error occurs while trying to install the required cutorch library as a dependency. cutorch should be installed along with torch7, but it is not. CUDA was probably not detected when torch7 was built, so cutorch was not built. It's the same problem as #354.

You can check it with the following command.

th -e "require 'cutorch'"

If cutorch is installed successfully, it will exit without displaying any message.
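
The check above can be wrapped so that the result is printed explicitly instead of being inferred from silence. This is only a sketch: `check_lua_module` is a hypothetical helper, and it assumes that `th` exits with a non-zero status when `require` fails.

```shell
# Hypothetical helper: run the require check with the given th binary
# and print whether the module could be loaded.
check_lua_module() {
    th_bin="$1"
    module="$2"
    if "$th_bin" -e "require '$module'" >/dev/null 2>&1; then
        echo "$module: ok"
    else
        echo "$module: missing"
        return 1
    fi
}

# TH can be overridden if th is not on PATH; "|| true" keeps the script going.
check_lua_module "${TH:-th}" cutorch || true
```

A "missing" result here corresponds to the "module 'cutorch' not found" error shown earlier in this issue.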

dark-swordsman commented 3 years ago

So I made a little progress, but my patience is growing thin. I feel like my WSL 2 instance is nothing like any other instance that anyone else has. Ever since I upgraded to WSL 2, it feels like everything broke and all I can do now is run Minecraft servers or use git.

Running th -e "require 'cutorch'" revealed that it, in fact, was not found. I am not exactly sure of the steps I took to fix it, but I eventually fixed it and NVCC was able to compile. It was something along the lines of:

[image]

It now shows:

THCudaCheck FAIL file=/home/dark/torch/extra/cutorch/lib/THC/THCGeneral.c line=70 error=38 : no CUDA-capable device is detected
/home/dark/torch/install/share/lua/5.1/trepl/init.lua:389: cuda runtime error (38) : no CUDA-capable device is detected at /home/dark/torch/extra/cutorch/lib/THC/THCGeneral.c:70

I tried the following guides, to no avail:

The only other issue I noticed is that I was unable to install ipython or something when doing ./install-deps, but no errors came from the build process of torch.

I think I will give up on this again and pick up on it some other time, as at this point, it's yet another WSL issue that I've encountered since upgrading to WSL 2 that is literally unfixable.

nagadomi commented 3 years ago

no CUDA-capable device is detected

The GPU driver doesn't seem to detect the GPU device. There are reports that the driver installed by a recent Windows update does not work. https://github.com/microsoft/WSL/issues/6014#issuecomment-730526767
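
Before rebuilding anything, it may help to check whether the Windows driver exposes the GPU inside WSL2 at all. The sketch below rests on assumptions about the WSL2 layout that are not stated in this thread: that GPU support appears as the device node `/dev/dxg` and that the driver's `libcuda.so` is mapped into `/usr/lib/wsl/lib`. Both paths may differ between Windows builds, and `wsl_gpu_visible` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the (assumed) WSL2 GPU device node and
# a libcuda library are both present. Paths can be overridden for testing.
wsl_gpu_visible() {
    dev="${1:-/dev/dxg}"
    libdir="${2:-/usr/lib/wsl/lib}"
    if [ ! -e "$dev" ]; then
        echo "missing device node: $dev"
        return 1
    fi
    if ! ls "$libdir"/libcuda.so* >/dev/null 2>&1; then
        echo "no libcuda in $libdir"
        return 1
    fi
    echo "GPU passthrough files present"
}

wsl_gpu_visible || true
```

If the device node is missing, the problem lies on the Windows side (build or driver) rather than in the CUDA toolkit installed inside the distribution.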

CarbonPool commented 3 years ago

WSL2 is virtualized and shared, and it looks like I can export my working image to share it.

This is my current environment:

windows10:

 · Version: Windows 10 Insider Preview 20279
 · Nvidia driver: 465.12

windows subsystem for linux:

 · Version: Ubuntu 18.04
 · Kernel: 5.4.72
 · cmake: 3.18.0
 · gcc: 7.5.0
 · cuda: 10.2 (package cuda-toolkit-10-2)

This setup has been working very well. Microsoft recently released build 21277 and I tried upgrading to it; it works, but waifu2x runs less efficiently than on the previous build. 20279 is the more stable FE dev branch build, and it is the one I currently use.

I suggest mounting a fresh image and starting over. My image is from https://cloud-images.ubuntu.com/bionic/current/, and I mounted that rootfs manually.

CarbonPool commented 3 years ago

In addition, please install CUDA according to NVIDIA's official instructions. In WSL2 you don't need to install any additional drivers; just install cuda-toolkit-10-2. Before that, you may need to update the package sources. Please refer to https://docs.nvidia.com/cuda/wsl-user-guide/index.html#installing-wsl2

dark-swordsman commented 3 years ago

I had done the installation as intended with just the toolkit, but it wasn't working. Nothing I tried worked, so I gave up on it.

I ended up completely reinstalling Windows since I was having other bugs and issues with it. I also installed Ubuntu 20.04 alongside it so I can dual boot them and not deal with the slew of headaches and nuances that WSL provides.

I have finally finished the installation on my Ubuntu install, and when restarting, all I did was install graphicsmagick and it seems that it works now:

kyle@dark-linux-desktop:~/waifu2x$ th waifu2x.lua 
images/miku_small_noise_scale.png: 0.22278690338135 sec

Perhaps one day I will try WSL again, but it's given me so many problems that it just makes more sense to dual boot linux. I guess you can close this if you want since I will no longer be providing updates or testing with WSL.

Edit: Also, to note for anyone else that may find this: Getting it working on Ubuntu bare metal didn't require the cuda-toolkit, just needed to install the graphics driver at the start when I installed Ubuntu, and then follow the CUDA install steps in the readme.

CarbonPool commented 3 years ago

That's a good idea. I want to use it as a reference for the performance loss. What GPU are you using? I use an RTX 2070 Super under WSL2, and the time for the default output is 0.41919994354248 sec.

dark-swordsman commented 3 years ago

I have an RTX 2070. Specifically the Zotac Mini which has the 400-A1 chip and is limited to 200W, instead of the more powerful 400A-A1.

As you can see, it did it in 0.22278690338135 seconds. I can run the default test a few more times tomorrow to get a larger dataset if you want.

After seeing another image take 0.4-1.2 seconds depending on the settings, it made me worried about total performance and if the native batching method improves performance. I was hoping to try to upscale/refresh anime, but that would put the render time of all the frames at about 4-12 hours per 24 minute episode. That's only upscaling/denoising and doesn't account for additional encoding time.

CarbonPool commented 3 years ago

Wow, thanks. I did not expect the performance loss to be that large. I currently train the model under WSL2.

CarbonPool commented 3 years ago

I have also set up a dual-boot system. In this comparative test, I found that WSL2 has great potential. Both systems run on the same hardware configuration. Here is some test information:

(Non-standard test)

·Ubuntu20.04 test2 MEM: 15GB PC POWER: ~330w

·windows10 + wsl2 test1 MEM: ~24GB PC POWER: ~290w

WSL2 did not make full use of the GPU, but the results are enough to show that it has great potential.

autonomous1 commented 3 years ago

After encountering some install issues, torch and waifu2x were up and running in Ubuntu 20.04 with the following configuration:

Install torch-cuda-10:

git clone https://github.com/nagadomi/distro.git torch-cuda-10 --recursive

Switch to version 8 of gcc and g++:

sudo ln -fs /usr/bin/g++-8 /usr/bin/g++
sudo ln -fs /usr/bin/gcc-8 /usr/bin/gcc

Build torch-cuda-10:

cd torch-cuda-10
./clean.sh
./update.sh

Activate torch-cuda (and optionally place in ~/.bashrc):

./install/bin/torch-activate

Run tests:

./test.sh