databricks / spark-deep-learning

Deep Learning Pipelines for Apache Spark
https://databricks.github.io/spark-deep-learning
Apache License 2.0
1.99k stars 494 forks source link

Necessary imports not included in setup.py #242

Open carlosfrutos opened 2 years ago

carlosfrutos commented 2 years ago

Hi,

I'm developing a neural network using Pytorch in a non-databricks cluster to ensure its functionality prior migrating to a databricks cluster.

Since I'm using Pytorch, I don't need Keras or TensorFlow. I installed successfully Horovod and Sparkdl, however, when I try to run the Spark process I found (for now) three consecutive exceptions related to missing dependencies:

    from sparkdl import HorovodRunner
  File "/opt/conda/default/lib/python3.8/site-packages/sparkdl/__init__.py", line 17, in <module>
    from sparkdl.transformers.keras_image import KerasImageFileTransformer
  File "/opt/conda/default/lib/python3.8/site-packages/sparkdl/transformers/keras_image.py", line 16, in <module>
    import keras.backend as K
  File "/opt/conda/default/lib/python3.8/site-packages/keras/__init__.py", line 21, in <module>
    from tensorflow.python import tf2
ModuleNotFoundError: No module named 'tensorflow'
    from sparkdl import HorovodRunner
  File "/opt/conda/default/lib/python3.8/site-packages/sparkdl/__init__.py", line 17, in <module>
    from sparkdl.transformers.keras_image import KerasImageFileTransformer
  File "/opt/conda/default/lib/python3.8/site-packages/sparkdl/transformers/keras_image.py", line 16, in <module>
    import keras.backend as K
ModuleNotFoundError: No module named 'keras'

This one is DEPRECATED!!:

    from sparkdl import HorovodRunner
  File "/opt/conda/default/lib/python3.8/site-packages/sparkdl/__init__.py", line 17, in <module>
    from sparkdl.transformers.keras_image import KerasImageFileTransformer
  File "/opt/conda/default/lib/python3.8/site-packages/sparkdl/transformers/keras_image.py", line 27, in <module>
    from sparkdl.transformers.tf_image import TFImageTransformer
  File "/opt/conda/default/lib/python3.8/site-packages/sparkdl/transformers/tf_image.py", line 18, in <module>
    import tensorframes as tfs
ModuleNotFoundError: No module named 'tensorframes'

On one hand, I don't understand why should I need these dependencies if I'm not going to use them... Shouldn't it be checked and disabled instead of forcing it to be installed?

On the other hand, if those dependencies are unavoidable, they should be included in the setup.py script to avoid having these errors and losing time, since installing Horovod packages in an ephemeral cluster takes a lot of time just to discover that you cannot run the program...

I'm sure I won't have a problem in a Databricks cluster, but I cannot use it yet and that shouldn't be a problem to test HorovodRunner functionality as stated in the warning message when running a program in a non-databricks cluster...

Kind regards