eaplatanios / tensorflow_scala

TensorFlow API for the Scala Programming Language
http://platanios.org/tensorflow_scala/
Apache License 2.0

Add Examples #23

Closed: Mageswaran1989 closed 7 years ago

Mageswaran1989 commented 7 years ago

Hi,

I am working on adding some examples in line with the Python API.

While doing so, I thought of using a Jupyter notebook backed by Toree (https://toree.apache.org/). In my exploration, this is one of the kernels that supports both Scala and Python, e.g. https://github.com/apache/incubator-toree/blob/master/etc/examples/notebooks/magic-tutorial.ipynb

However, I am facing some challenges:

Notebook is @ https://github.com/iaja/tensorflow_scala/blob/add_examples/examples/Basics.ipynb

Environment:

Also, could you give an overview of the architecture? Something like: https://github.com/tensorflow/tensorflow/issues/9150

I can see how this will be helpful to the Scala community. I have fair experience with Scala, Java JNI, and C++.

Mageswaran1989 commented 7 years ago

@eaplatanios On further investigation, I found that only the "api" project cannot be assembled. The error I mentioned is a linker error caused by the library order while linking. Something like https://stackoverflow.com/questions/17322885/jni-using-symbol-lookup-error.

I think sbt needs to be told to include libtensorflow.so explicitly so that Java considers it while assembling.

Do you have any idea how to do that in SBT?
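One possible workaround, offered as an assumption rather than a confirmed fix for this build: leave `libtensorflow.so` outside the assembled jar and let the dynamic linker find it through `LD_LIBRARY_PATH`. The helper name `use_native_lib_dir` and the directory it searches are hypothetical.

```shell
# Hypothetical helper: locate libtensorflow.so under a directory and prepend
# its folder to LD_LIBRARY_PATH so the dynamic linker can resolve it.
use_native_lib_dir() {
  local search_root="$1"
  local lib
  local dir
  # Take the first match only.
  lib="$(find "$search_root" -name 'libtensorflow.so' 2>/dev/null | head -n 1)"
  if [ -z "$lib" ]; then
    echo "libtensorflow.so not found under $search_root" >&2
    return 1
  fi
  dir="$(dirname "$lib")"
  # Prepend, keeping any existing entries after the new directory.
  export LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}
```

Calling `use_native_lib_dir <directory containing the native build>` before starting the JVM would make the library visible to the linker without changing how the jar is assembled.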

eaplatanios commented 7 years ago

@Mageswaran1989 I'm really sorry I didn't get to this earlier. Thank you very much for the work you've done so far. The reason I didn't get to it was that I was traveling and busy with work. However, I've also been working on some big changes in the library, and I wanted to look into the example code you provided after those changes were in. A lot of them are now in, so I can answer your questions and provide some potentially helpful direction.

Regarding your SBT issues, I haven't used the assembly plugin but I've been working on the packaging and release process of the library. Now, you can optionally load pre-compiled versions of the TensorFlow native library using SBT. The instructions for that are provided here. This may help with your assembly problem, but can you give it a try and let me know what you find? I'll be more responsive this time. :)

Regarding the examples you wrote, would you like to check whether they are compatible with the API changes I've made over the past month? When they're in good shape, I can look into integrating them into the documentation webpage that I'm starting to work on.

How does that sound? In any case, thanks for giving this library a try and contributing to its development. :)

Mageswaran1989 commented 7 years ago

@eaplatanios You are doing a great job! I couldn't find much time to explore your work.

I actually wanted to use a notebook for the examples and compare the TensorFlow Python API with the Scala API.

I was able to use https://github.com/jupyter-scala/jupyter-scala with bundled jar files. I am still searching for a way to use Toree somehow, since it supports both Python and Scala under one roof.

I happened to see your link above just now :( I think that should speed things up a little. I will revisit Toree, try using it with the Sonatype repository for downloading, and keep you updated.

Mageswaran1989 commented 7 years ago

The current version of Toree depends on Scala 2.11.8 only, whereas I see support only for 2.12 here.

Will you be able to cross-compile it for 2.11 and upload it to the Sonatype repo?
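For context on why a 2.12 build cannot simply be dropped into a 2.11.8 kernel: Scala 2.x artifacts are published per binary version, and sbt's `%%` operator appends that version as a suffix to the artifact name. A minimal sketch of that mapping follows; the helper names and the `tensorflow` module name are used purely for illustration.

```shell
# Hypothetical helper: derive the Scala binary version (e.g. 2.11) from a
# full 2.x version string (e.g. 2.11.8).
scala_binary_version() {
  echo "$1" | cut -d. -f1,2
}

# Hypothetical helper: build the suffixed artifact name that a dependency
# declared with sbt's %% operator resolves to for a given Scala version.
artifact_name() {
  echo "$1_$(scala_binary_version "$2")"
}

artifact_name tensorflow 2.11.8   # name a Scala 2.11.8 kernel would look for
artifact_name tensorflow 2.12.3   # name provided by a 2.12-only release
```

The two names differ, so a release published only for 2.12 is invisible to a 2.11 kernel until a 2.11 build is published as well.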

Mageswaran1989 commented 7 years ago

Update: I was able to use the bundled jar, built locally for 2.11, with Toree alongside TensorFlow Python.

If you add support for 2.11, the process will be easier; however, Toree will force people to install Spark even though they don't need it.

Options:

  1. Plain Scala files
  2. Using jupyter-scala
  3. Toree (Python + Scala)

For complex models we have to stick to option 1; I leave the other two to your opinion.

The Toree steps are as follows:


```shell
# Option 1: download a Spark binary, extract it to a known location, and point SPARK_HOME at it
# Link: https://spark.apache.org/downloads.html
# Option 2: use the following commands to build from source (not recommended)
cd /opt/
wget http://d3kbcqa49mib13.cloudfront.net/spark-2.1.0.tgz
tar -xvzf ./spark-2.1.0.tgz
cd spark-2.1.0/
build/mvn -DskipTests clean package # I am skeptical about this step :)

# Set the appropriate path here
export SPARK_HOME=/opt/spark-2.1.0/
export PYTHONPATH=$PYTHONPATH:$SPARK_HOME/python:$SPARK_HOME/python/lib
pip install https://dist.apache.org/repos/dist/dev/incubator/toree/0.2.0/snapshots/dev1/toree-pip/toree-0.2.0.dev1.tar.gz
jupyter toree install

# For the time being, export the following before starting Jupyter
export SPARK_OPTS="--master local[*] --jars /opt/iaja/tensorflow_scala/examples/examples-assembly-0.1.jar"

jupyter notebook
```
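Before launching the notebook, the setup above can be sanity-checked with a small script. This is only a sketch: `check_toree_env` is a hypothetical helper, not part of Toree or Spark, and it takes the jar path from the steps above as its argument.

```shell
# Hypothetical pre-flight check: verify that SPARK_HOME points at a directory
# and that the assembled jar passed via --jars exists, then set SPARK_OPTS.
check_toree_env() {
  local jar="$1"
  if [ -z "${SPARK_HOME:-}" ] || [ ! -d "$SPARK_HOME" ]; then
    echo "SPARK_HOME is unset or not a directory" >&2
    return 1
  fi
  if [ ! -f "$jar" ]; then
    echo "jar not found: $jar" >&2
    return 1
  fi
  # Same option string as in the steps above, built from the checked path.
  export SPARK_OPTS="--master local[*] --jars $jar"
}
```

For example, `check_toree_env /opt/iaja/tensorflow_scala/examples/examples-assembly-0.1.jar && jupyter notebook` starts Jupyter only when both paths are in place.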