dlt-hub / dlt

data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
https://dlthub.com/docs
Apache License 2.0

install `streamlit` dependency on `dlt pipeline show` #158

Open rudolfix opened 1 year ago

rudolfix commented 1 year ago

Background

Make it easy for the user to run the dlt .. show command by optionally installing the required dependencies.

Tasks

    • [ ] Add another extra so that streamlit can be installed with python-dlt[streamlit].
    • [ ] When running a dlt command, handle MissingImportException and optionally install the missing dependencies after asking the user, then rerun the command.
    • [ ] Do not offer installation if dlt runs outside of a virtual environment; display a warning instead.
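
The second and third tasks could look roughly like the sketch below. It is not dlt code: `run_with_optional_install`, `pip_install`, and `in_virtual_env` are hypothetical names, the built-in `ImportError` stands in for the `MissingImportException` named above, and the `sys.prefix != sys.base_prefix` check only recognizes venv/virtualenv environments (conda environments would need a different check).

```python
import importlib
import subprocess
import sys
from typing import Callable, Optional, Sequence


def in_virtual_env() -> bool:
    """True inside a venv/virtualenv (assumption: conda is not detected this way)."""
    return sys.prefix != sys.base_prefix


def pip_install(packages: Sequence[str]) -> None:
    """Install into the interpreter that runs this process, not whatever `pip` is on PATH."""
    subprocess.check_call([sys.executable, "-m", "pip", "install", *packages])


def run_with_optional_install(
    command: Callable[[], int],
    packages: Sequence[str],
    confirm: Callable[[str], bool],
    install: Callable[[Sequence[str]], None] = pip_install,
    in_venv: Optional[bool] = None,
) -> int:
    """Run `command`; on a missing import, offer to install `packages` and rerun once."""
    if in_venv is None:
        in_venv = in_virtual_env()
    try:
        return command()
    except ImportError as exc:  # stand-in for the exception named in the task list
        if not in_venv:
            # Task 3: outside a virtual environment, warn and do not offer to install.
            print(
                f"warning: {exc}; not running in a virtual environment, "
                f"install {', '.join(packages)} manually",
                file=sys.stderr,
            )
            return 1
        if not confirm(f"{exc}. Install {', '.join(packages)} now?"):
            return 1
        install(packages)
        # Make freshly installed packages importable in the current process.
        importlib.invalidate_caches()
        return command()
```

Installing through `sys.executable -m pip` is a deliberate choice here: it targets the environment dlt is actually running in, whichever `pip` happens to come first on `PATH`.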

TyDunn commented 1 year ago
Screenshot 2023-02-27 at 17 23 58

I'm stuck in a loop right now: it keeps telling me to pip install pandas, even after I have done that.

rudolfix commented 1 year ago

@TyDunn would it be hard for you to switch to Python 3.10 and see if this still happens? I can get back to this task at the end of this week.

TyDunn commented 1 year ago

Have the same problem with 3.10.10:

Screenshot 2023-03-01 at 11 17 12

TyDunn commented 1 year ago

Also tried 3.9 and had the same problem.

rudolfix commented 1 year ago

Where the problem is coming from:

  1. dlt was pip-installed into the main/global Python environment.
  2. Then a virtual environment was created and activated.
  3. Now the dlt command is run, but still from the global environment. It does not see pandas/streamlit, so it complains.
  4. The user runs pip install, but in the activated virtual environment, so pandas is installed there.
  5. dlt, run again from the global environment, still does not see pandas, so it complains again and we are in a loop.
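
A quick way to confirm this situation is to compare where the `dlt` command resolves with the environment that is currently activated. The helper below is a hypothetical diagnostic, not part of dlt; it assumes the activated environment exports `VIRTUAL_ENV` (as venv and virtualenv activation scripts do) and that the module names passed in are top-level packages.

```python
import importlib.util
import os
import shutil
import sys
from typing import Any, Dict, Sequence


def environment_report(module_names: Sequence[str], command: str = "dlt") -> Dict[str, Any]:
    """Collect the facts needed to spot a global-vs-virtual environment mismatch."""
    return {
        # Interpreter running this script and the environment it belongs to.
        "interpreter": sys.executable,
        "prefix": sys.prefix,
        # Path of the activated virtual environment, if any.
        "active_venv": os.environ.get("VIRTUAL_ENV"),
        # Where the shell would find the command; None if it is not on PATH.
        "command_path": shutil.which(command),
        # Whether each top-level module is importable by this interpreter.
        "modules": {
            name: importlib.util.find_spec(name) is not None for name in module_names
        },
    }


if __name__ == "__main__":
    for key, value in environment_report(["pandas", "streamlit"]).items():
        print(f"{key}: {value}")
```

If `command_path` points outside `active_venv`, the `dlt` executable belongs to a different environment than the one `pip install` writes to, which matches the loop described in the steps above.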

How to solve it:

  1. Uninstall dlt from the global environment, activate the virtual environment, and install dlt into it.

How to fix the library:

  1. Warn the user when dlt runs from the global environment.
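
One possible shape for such a warning is sketched below. The function names are hypothetical and this is not how dlt implements it; the sketch assumes that "global environment" means `sys.prefix == sys.base_prefix` and that an activated virtual environment is visible through the `VIRTUAL_ENV` variable.

```python
import os
import sys
import warnings
from typing import Mapping, Optional


def global_env_warning(
    prefix: str, base_prefix: str, environ: Mapping[str, str]
) -> Optional[str]:
    """Return a warning message when running from the global environment, else None."""
    if prefix != base_prefix:
        return None  # running inside a venv/virtualenv, nothing to report
    active = environ.get("VIRTUAL_ENV")
    if active:
        # The case from this thread: a virtual environment is activated,
        # but the running interpreter is the global one.
        return (
            f"dlt is running from the global Python environment ({prefix}) "
            f"although the virtual environment {active} is activated. "
            "Packages installed with pip in that environment will not be visible; "
            "install dlt into the virtual environment."
        )
    return (
        f"dlt is running from the global Python environment ({prefix}). "
        "Consider installing it into a virtual environment."
    )


def warn_if_global_env() -> None:
    """Emit the warning once at command start-up (hypothetical hook)."""
    message = global_env_warning(sys.prefix, sys.base_prefix, os.environ)
    if message:
        warnings.warn(message, RuntimeWarning, stacklevel=2)
```

Keeping the decision in a pure function that takes the prefixes and environment mapping as arguments makes the two cases easy to test without creating real environments.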