RayTsui opened 7 years ago
Actually, I am not sure how to use "sparkdl$ SPARK_HOME=/usr/local/lib/spark-2.1.1-bin-hadoop2.7 PYSPARK_PYTHON=python2 SCALA_VERSION=2.11.8 SPARK_VERSION=2.1.1 ./python/run-tests.sh". Can it be executed at the command line? It gives me "sparkdl$: command not found".
sparkdl$ means your current directory is the spark-deep-learning project. SPARK_HOME is needed by pyspark; SCALA_VERSION and SPARK_VERSION are used to locate the spark-deep-learning-assembly*.jar.
./python/run-tests.sh will set up the environment, find all the .py files in python/tests, and run them one by one.
You should first run build/sbt assembly to make sure the assembly jar is ready, then run SPARK_HOME=/usr/local/lib/spark-2.1.1-bin-hadoop2.7 PYSPARK_PYTHON=python2 SCALA_VERSION=2.11.8 SPARK_VERSION=2.1.1 ./python/run-tests.sh
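The discovery step described above ("find all .py files in python/tests and run them one by one") can be sketched roughly like this. This is purely illustrative — the real run-tests.sh differs in detail, and the file names below are made up for the demo:

```shell
# Illustrative sketch of the test discovery that run-tests.sh performs;
# the actual script's logic differs in detail.
mkdir -p python/tests
touch python/tests/test_builder.py python/tests/test_image.py

# Walk python/tests and process each .py file one by one.
for f in $(find python/tests -name '*.py' | sort); do
  echo "would run: $f"
done
```

The point is only that the script drives everything from the python/tests directory, which is why the working directory has to be the project root.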
@RayTsui thank you for reporting the issue!
@allwefantasy thank you for helping out!
We also have some scripts and sbt plugins that facilitate the development process, which we put in https://github.com/databricks/spark-deep-learning/pull/59.
You can try running SPARK_HOME="path/to/your/spark/home/directory" ./bin/totgen.sh, which will generate pyspark (.py2.spark.shell, .py3.spark.shell) and spark-shell (.spark.shell) REPLs.
@allwefantasy Thanks a lot for your answer. Regarding the command "SPARK_HOME=/usr/local/lib/spark-2.1.1-bin-hadoop2.7 PYSPARK_PYTHON=python2 SCALA_VERSION=2.11.8 SPARK_VERSION=2.1.1 ./python/run-tests.sh", I have a few doubts:
1) Is the value of each setting fixed and common to all environments, or do I need to set the values based on my own environment? I installed Spark via "brew install apache-spark" instead of downloading the Spark distribution bundled with Hadoop (e.g., spark-2.1.1-bin-hadoop2.7). Likewise, should the Scala and Spark version numbers match my environment?
2) Do I need to set the environment variables "SPARK_HOME=/usr/local/lib/spark-2.1.1-bin-hadoop2.7 PYSPARK_PYTHON=python2 SCALA_VERSION=2.11.8 SPARK_VERSION=2.1.1" in ~/.bash_profile, or do I run them inline with the command "SPARK_HOME=/usr/local/lib/spark-2.1.1-bin-hadoop2.7 PYSPARK_PYTHON=python2 SCALA_VERSION=2.11.8 SPARK_VERSION=2.1.1 ./python/run-tests.sh" at the prompt?
3) After some tentative attempts, I still came across the errors above.
If you have any suggestions, they would help me a lot.
@phi-dbq Thanks a lot for your response. I will try what you suggested and give feedback.
# This file should list any python package dependencies.
coverage>=4.4.1
h5py>=2.7.0
keras==2.0.4 # NOTE: this package has only been tested with keras 2.0.4 and may not work with other releases
nose>=1.3.7 # for testing
numpy>=1.11.2
pillow>=4.1.1,<4.2
pygments>=2.2.0
tensorflow==1.3.0
pandas>=0.19.1
six>=1.10.0
kafka-python>=1.3.5
tensorflowonspark>=1.0.5
tensorflow-tensorboard>=0.1.6
Or you can just run this command to install them all:
pip2 install -r python/requirements.txt
2. Just keep PYSPARK_PYTHON=python2 SCALA_VERSION=2.11.8 SPARK_VERSION=2.1.1 unchanged. As I have mentioned, these variables are only used to locate the assembly jar. The only one you need to set for your environment is SPARK_HOME. I suggest you do not configure them in .bashrc, which may have side effects on your other programs.
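The difference between the two approaches in question 2 can be seen directly in the shell: a variable assigned inline on a command line exists only for that one command, while an export in ~/.bash_profile affects every program in the session (which is the side effect warned about above). A minimal demonstration, with DEMO_VAR as a stand-in name:

```shell
# Inline assignment: the variable exists only for the one command it prefixes.
DEMO_VAR=hello sh -c 'echo "inside the command: $DEMO_VAR"'   # prints "inside the command: hello"

# Back in the current shell, the variable was never set.
echo "after the command: ${DEMO_VAR:-unset}"                  # prints "after the command: unset"
```

So running the long SPARK_HOME=... PYSPARK_PYTHON=... command directly at the prompt scopes those settings to a single test run, with no lingering effect on other programs.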
step 1:
build/sbt assembly
then you should find the spark-deep-learning-assembly-0.1.0-spark2.1.jar in your-project/target/scala-2.11.
step 2:
SCALA_VERSION=2.11.8 SPARK_VERSION=2.1.1 ./python/run-tests.sh
Also, you can specify a target file to run instead of running all the files, which takes almost 30 minutes. Like this:
SCALA_VERSION=2.11.8 SPARK_VERSION=2.1.1 ./python/run-tests.sh /Users/allwefantasy/CSDNWorkSpace/spark-deep-learning/python/tests/transformers/tf_image_test.py
The coverage summary shows: TOTAL 234 163 30%
But there still exist some errors, as follows:
ModuleNotFoundError: No module named 'tensorframes'
I guess tensorframes officially supports 64-bit Linux, but right now I am using macOS; could that be the issue?
Hello @RayTsui, I have no problem using OSX for development purposes. Can you first run:
build/sbt clean
followed by:
build/sbt assembly
You should see a line that writes:
[info] Including: tensorframes-0.2.9-s_2.11.jar
This indicates that tensorframes is properly included in the assembly jar, and that your problem is rather that the correct assembly jar cannot be found.
@thunterdb Thanks a lot for your suggestions. I ran the commands, and yes, I can see [info] Including: tensorframes-0.2.8-s_2.11.jar. As you said, my issue is "List of assembly jars found, the last one will be used: ls: $DIR/spark-deep-learning-master/python/../target/scala-2.11/spark-deep-learning-assembly*.jar: No such file or directory".
I suppose all the related jars are packaged into spark-deep-learning-assembly.jar, but my spark-deep-learning-master-assembly-0.1.0-spark2.1.jar is generated at "$DIR/spark-deep-learning-master/target/scala-2.11/spark-deep-learning-master-assembly-0.1.0-spark2.1.jar" instead of "$DIR/spark-deep-learning-master/python/../target/scala-2.11/spark-deep-learning-assembly.jar". I tried to modify that segment of run-tests.sh, but it did not work.
Do you know how to get the script to locate spark-deep-learning-master-assembly-0.1.0-spark2.1.jar?
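The mismatch described above can be reproduced in isolation. This sketch assumes the script globs for spark-deep-learning-assembly*.jar under target/scala-2.11 (as the pasted ls error suggests), and that the extra "-master" in the jar name comes from sbt deriving the artifact name from the checkout directory spark-deep-learning-master — that last part is an assumption, so the suggested fix (renaming the directory and rebuilding) is a guess, not a confirmed remedy:

```shell
# Reproduce the glob mismatch: the built jar carries a "-master" infix
# (assumed to come from the directory name), so the script's pattern
# never matches it.
mkdir -p demo/target/scala-2.11
touch demo/target/scala-2.11/spark-deep-learning-master-assembly-0.1.0-spark2.1.jar

# This is (assumed to be) the pattern run-tests.sh uses; it finds nothing.
ls demo/target/scala-2.11/spark-deep-learning-assembly*.jar 2>/dev/null \
  || echo "no match: try renaming the checkout to spark-deep-learning and rebuilding"
```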
I followed the instructions: I downloaded the project, ran build/sbt assembly, and then executed python/run-tests.sh, but it gives me the following info:
List of assembly jars found, the last one will be used: ls: /Users/lei.cui/Documents/Workspace/DeepLearninginApacheSpark/spark-deep-learning-master/python/../target/scala-2.12/spark-deep-learning-assembly*.jar: No such file or directory
============= Searching for tests in: /Users/lei.cui/Documents/Workspace/DeepLearninginApacheSpark/spark-deep-learning-master/python/tests =============
============= Running the tests in: /Users/lei.cui/Documents/Workspace/DeepLearninginApacheSpark/spark-deep-learning-master/python/tests/graph/test_builder.py =============
/usr/local/opt/python/bin/python2.7: No module named nose
Actually, after the sbt build it produces scala-2.11/spark-deep-learning-assembly.jar, not scala-2.12/spark-deep-learning-assembly.jar. In addition, I installed python2 at /usr/local/bin/python2, so why does it report /usr/local/opt/python/bin/python2.7: No module named nose?
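The scala-2.12 in that ls error is consistent with SCALA_VERSION not being set, so the script falls back to some default. A sketch of how an env variable like SCALA_VERSION=2.11.8 could map to the target/scala-2.11 directory — illustrative only; the real run-tests.sh may derive the path differently:

```shell
# Strip the patch component from the full Scala version to get the
# binary-version directory name that sbt uses under target/.
SCALA_VERSION=2.11.8
echo "target/scala-${SCALA_VERSION%.*}"   # prints "target/scala-2.11"
```

If that matches what the script does, passing SCALA_VERSION=2.11.8 on the command line (as in the earlier answers) should make it look in the directory the build actually populated.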