ctongfei / nexus

Experimental tensor-typed deep learning
https://tongfei.me/nexus/
MIT License

torch backend build failed #24

Closed: doofin closed this issue 5 years ago

doofin commented 5 years ago

When building the torch backend, I get the following output:

Preprocessing C headers...
In file included from torch.h:1:
TH/TH.h:4:10: fatal error: TH/THGeneral.h: No such file or directory
 #include "TH/THGeneral.h"
          ^~~~~~~~~~~~~~~~
compilation terminated.
Generating SWIG bindings...
Language subdirectory: java
Search paths:
   ./
   ./swig_lib/java/
   /usr/share/swig/3.0.12/java/
   ./swig_lib/
   /usr/share/swig/3.0.12/
Preprocessing...
Starting language-specific parse...
Processing types...
C++ analysis...
Processing nested classes...
Generating wrappers...
Compiling SWIG generated JNI wrapper code...
In file included from /usr/lib/python3.7/site-packages/torch-1.0.0-py3.7-linux-x86_64.egg/torch/lib/include/THC/THCGeneral.h:12,
                 from /usr/lib/python3.7/site-packages/torch-1.0.0-py3.7-linux-x86_64.egg/torch/lib/include/THC/THC.h:4,
                 from torch_wrap_fixed.cxx:238:
/usr/lib/python3.7/site-packages/torch-1.0.0-py3.7-linux-x86_64.egg/torch/lib/include/ATen/cuda/CUDAStream.h:6:10: fatal error: cuda_runtime_api.h: No such file or directory
 #include "cuda_runtime_api.h"
          ^~~~~~~~~~~~~~~~~~~~
compilation terminated.

Torch is installed at: /usr/lib/python3.7/site-packages/torch-1.0.0-py3.7-linux-x86_64.egg/

ctongfei commented 5 years ago

Can you print out which files you have in your /usr/lib/python3.7/site-packages/torch-1.0.0-py3.7-linux-x86_64.egg/lib?

Additionally, there is a bug in SWIG's Java support on x64 machines. You should apply this patch to your SWIG before attempting the build: https://github.com/ctongfei/nexus/blob/master/torch/swig-patch/fix-long.patch

And it seems that you don't have CUDA installed.

This torch backend is currently a work in progress: many methods are still implemented as ???, so please be patient until I update it.

ctongfei commented 5 years ago

When everything is complete, you won't need to build it yourself: the bindings will be packaged into the jar, and you will only need to add a dependency.

doofin commented 5 years ago
 ls /usr/lib/python3.7/site-packages/torch-1.0.0-py3.7-linux-x86_64.egg/torch/lib/include

ATen  c10  caffe2  pybind11  TH  THC  THCUNN  torch

I don't have a GPU or CUDA. Maybe the directory layout has changed in the new torch version? Thanks for your efforts! It seems there is currently no usable autodiff framework in the Scala ecosystem (I'm not sure whether dl4j implements autodiff, and a project called scorch requires manual differentiation).

doofin commented 5 years ago

BTW, for the facade generation, JNA and JNAerator are also an option.

ctongfei commented 5 years ago

If you don't have CUDA, you can try modifying the build.sh script to make it work (by removing the dependencies on CUDA). This script works in my current environment (Ubuntu 18.04, CUDA 10.0, PyTorch 1.0).
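
Before editing build.sh, it may help to confirm which headers are actually present. The following is a minimal sketch, not part of the project's build.sh (whose contents are not shown in this thread). The helper name has_header, the TORCH_INCLUDE and CUDA_INCLUDE variables, and the default CUDA path /usr/local/cuda/include are all assumptions made for illustration; the torch include path is the one from the ls output above.

```shell
# Hypothetical helper (not from build.sh): succeeds if the header $2
# exists as a regular file under the include root $1.
has_header() {
  [ -f "$1/$2" ]
}

# Include root taken from the `ls` output earlier in this thread;
# override it through the environment if your install differs.
TORCH_INCLUDE="${TORCH_INCLUDE:-/usr/lib/python3.7/site-packages/torch-1.0.0-py3.7-linux-x86_64.egg/torch/lib/include}"
# Assumed default CUDA toolkit location; adjust for your system.
CUDA_INCLUDE="${CUDA_INCLUDE:-/usr/local/cuda/include}"

# The first error in the log was a missing TH/THGeneral.h.
if has_header "$TORCH_INCLUDE" "TH/THGeneral.h"; then
  echo "TH headers: found"
else
  echo "TH headers: missing under $TORCH_INCLUDE"
fi

# The second error was a missing cuda_runtime_api.h, pulled in via THC.
if has_header "$CUDA_INCLUDE" "cuda_runtime_api.h"; then
  echo "CUDA headers: found"
else
  echo "CUDA headers: missing; THC includes will likely fail as in the log above"
fi
```

If the CUDA check reports the header as missing, that is consistent with the second error in the log, and the CUDA-dependent parts of the build are the ones to remove.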

In the meantime, you can play with the nexus-diff autodiff package. It is already usable with the nexus-jvm-backend backend, although it is not complete (many ops are unimplemented) and it is very slow. See the example XOR code.

JNA and JNAerator are much slower than JNI (generated by SWIG). We'll use JNI in this project.

Contributions are welcome!

ctongfei commented 5 years ago

Identical to #27. Should be fixed now.