datarootsio / skeleton-pyspark

A best-practices first project template that allows you to get started on a new pyspark project
MIT License

Issue No module named 'typer' found while executing poetry spark-submit command #41

Open trp86 opened 2 years ago

trp86 commented 2 years ago

I followed the steps in the HOWTO.rst file and I am getting an error. Below are the steps I followed.

1) Cloned the GitHub repo: git clone https://github.com/datarootsio/skeleton-pyspark.git
2) Installed poetry.
3) Executed poetry --version:
   codebind@c029ac673028:~/codebase/python/skeleton-pyspark$ poetry --version
   Poetry version 1.1.12
4) Executed poetry install, which completed successfully.
5) Executed poetry run spark-submit run.py dev, which produced the error below.

21/12/24 14:06:04 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Traceback (most recent call last):
  File "/home/codebind/codebase/python/skeleton-pyspark/run.py", line 2, in <module>
    import typer
ModuleNotFoundError: No module named 'typer'
log4j:WARN No appenders could be found for logger (org.apache.spark.util.ShutdownHookManager).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

Can someone please help me here?

vikramaditya91 commented 2 years ago

Hey @trp86

Thanks for opening this issue.

This looks very much like the installation of the packages listed for poetry did not go well on your end. Can you post the output of the poetry install command?

Alternatively, you can activate the virtual environment with poetry shell and then run pip list. You should see typer in there.
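The same check can also be done from Python itself. The following is a minimal sketch, not part of the repository: the file name check_env.py and the helper module_available are made up for illustration. It prints which interpreter is running and whether typer is importable from it.

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` is importable here."""
    # find_spec returns None when no importable module of that name exists.
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Show which Python is running, so it can be compared with the
    # interpreter of the poetry virtual environment.
    print("interpreter:", sys.executable)
    # "typer" is the module from the traceback; add other names as needed.
    for name in ("typer",):
        status = "found" if module_available(name) else "MISSING"
        print(f"{name}: {status}")
```

Running it as poetry run python check_env.py should report typer as found if poetry install worked. If it does, but spark-submit still fails, the two commands are probably not using the same interpreter.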

Just FYI, I started a clean EC2 instance and followed these steps, and it did work.

  1. curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python -
  2. source $HOME/.poetry/env
  3. sudo yum install git
  4. git clone https://github.com/datarootsio/skeleton-pyspark.git
  5. cd skeleton-pyspark/
  6. sudo yum install python38
  7. poetry install
  8. sudo yum install java-1.8.0-openjdk.x86_64
  9. export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.312.b07-1.amzn2.0.2.x86_64/jre/
  10. poetry run spark-submit run.py dev
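Before repeating the steps above on another machine, it may help to confirm that the required tools are actually on the PATH. This is only a rough sketch under assumptions: the helper missing_tools and the tool list are illustrative and do not come from the repository.

```python
import os
import shutil
from typing import Iterable, List


def missing_tools(tools: Iterable[str]) -> List[str]:
    """Return the tools from `tools` that cannot be found on the PATH."""
    # shutil.which returns None when an executable is not found.
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # Tools used in the steps above; adjust the list for your setup.
    missing = missing_tools(["git", "poetry", "java"])
    if missing:
        print("not on PATH:", ", ".join(missing))
    else:
        print("git, poetry and java were all found")
    # JAVA_HOME is exported in step 9; report whether it is set at all.
    print("JAVA_HOME:", os.environ.get("JAVA_HOME", "<not set>"))
```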