Closed: yashwanthmadaka24 closed this issue 3 years ago
I also have this problem.
It turns out we cannot use it in a plain Jupyter notebook. We need Azure Databricks to use that module from a notebook.
We can import sparkdl in a Jupyter notebook. Yes, if we launch with
pyspark --packages databricks:spark-deep-learning:1.5.0-spark2.4-s_2.11
then we don't need to worry about the necessary deep learning pipeline packages; they are pulled in automatically. We do need a network connection, though.
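If you want the same behavior from inside an already-running notebook kernel (rather than launching Jupyter through the pyspark script), a minimal sketch is to set PYSPARK_SUBMIT_ARGS before anything touches Spark. This assumes pyspark is importable in the kernel and that no SparkSession has been started yet:

import os
from pyspark.sql import SparkSession

# Must be set before the JVM gateway starts; the trailing
# 'pyspark-shell' token is required by PySpark's launcher.
os.environ['PYSPARK_SUBMIT_ARGS'] = (
    '--packages databricks:spark-deep-learning:1.5.0-spark2.4-s_2.11 '
    'pyspark-shell'
)

spark = SparkSession.builder.getOrCreate()
import sparkdl  # should resolve once the package is on the classpath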
However, we can also set everything up locally. Check this gist; hope it might help. Cheers.
@innat the gist results in a 404. Can you update?
I found a solution on Stack Overflow (https://stackoverflow.com/questions/55377712/not-able-to-import-sparkdl-in-jupyter-notebook).
Instead of using --packages, just use .config with the 'spark.jars.packages' key when you initiate a Spark session within Jupyter Notebook. This will download all the dependencies from Databricks. You can change the version to any release that suits your environment:
spark = (SparkSession
    .builder
    .config('spark.jars.packages', 'databricks:spark-deep-learning:1.5.0-spark2.4-s_2.11')
    .getOrCreate()
)
Note that this is very difficult to do with a local .jar, because neither the releases nor the current GitHub project have the parent dependencies built in. I guess Databricks wants you to use their package, preferably even with Azure, to run sparkdl.
Hope it helps others.
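For anyone who wants a complete cell to paste, here is a minimal end-to-end sketch of the approach above (the appName is arbitrary, and the package suffix assumes Spark 2.4 with Scala 2.11):

from pyspark.sql import SparkSession

# 'spark.jars.packages' makes Spark resolve the package (and its
# dependencies) from the Spark Packages / Maven repositories, so a
# network connection is still needed on the first run.
spark = (SparkSession
    .builder
    .appName('sparkdl-example')  # arbitrary name, for illustration only
    .config('spark.jars.packages', 'databricks:spark-deep-learning:1.5.0-spark2.4-s_2.11')
    .getOrCreate()
)

import sparkdl  # the module should now be importable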
I'm facing the same issue. The solution below didn't work for me.
spark = (SparkSession
.builder
.config('spark.jars.packages', 'databricks:spark-deep-learning:1.5.0-spark2.4-s_2.11')
.getOrCreate()
)
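One possible cause, just a guess since the error isn't shown: spark.jars.packages is only honored when Spark's JVM first starts. If a SparkSession already exists in the notebook kernel (created by an earlier cell or by the environment), getOrCreate() returns that session and silently ignores the config. A quick diagnostic:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
# If this prints 'NOT SET', the package config was ignored because the
# session already existed; restart the kernel and set the config in the
# very first cell, before anything else touches Spark.
print(spark.sparkContext.getConf().get('spark.jars.packages', 'NOT SET'))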
@skeller88 Sorry for the late reply; I didn't notice the email notification until now. :(
Updated link: Set up of Deep-Learning-Pipelines in Linux based OS.
Please note, I worked on this a few years ago. I also did a small project using it, Multi-Class Image Classification With Transfer Learning In PySpark. However, the framework has probably changed in many ways since then, so please take that into account.
Hi,
I am trying to use this library in a Jupyter notebook, but I am getting a "no module found" error.
When I run the below command,
pyspark --packages databricks:spark-deep-learning:1.5.0-spark2.4-s_2.11
I am able to import sparkdl in the Spark shell. How can I use it in a Jupyter notebook?