[!NOTE]
The text below is a composite of resources from various locations. We'll continue to evolve this and consider moving it into its own "docs" page if that is helpful to folks. You are invited to drop a comment below or "+1" if this works for your use case, and also if it doesn't. Thanks!
We have a common request for assistance in resolving Python dependency conflicts with other libraries and Python CLI apps. I'm creating this issue to document some of the considerations, workarounds, and best practices.
Preface: The distinction between "apps" and "libraries"
For this discussion, let's say an "app" is anything with a CLI, whereas a "library" must be imported and invoked directly from within Python.
PyAirbyte is a library, since the primary way you interact with it is via import airbyte in your Python code.
dbt is an app, because the primary way you interact with it is via the dbt CLI.
This is important, because while all libraries you are using must coexist in the same Python environment, the same is not true for CLI apps. The best practice for CLI apps is to install them in their own virtual environment. While this can often be cumbersome and manual, there are some helpful tools to streamline it.
Best Practice for installing Python CLI Apps
Whenever installing CLI apps like dbt, the best practice is to create a dedicated virtual environment and install the CLI app into it. This provides the most stable experience for the CLI app itself, and also completely decouples the version constraints of the CLI app from the version constraints of the libraries you are using in the same workspace or container.
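As a minimal sketch of the manual approach, the shell function below creates one virtual environment for one CLI app and installs the app into it. The function name `install_cli_app`, the `~/.venvs` location, and the `dbt-core` package in the usage line are illustrative assumptions, not part of any tool.

```shell
# Hypothetical helper: one virtual environment per CLI app.
# Usage: install_cli_app <venv-dir> <package-spec>
install_cli_app() {
    venv_dir="$1"
    package="$2"
    # Create the isolated environment that only this app will use.
    python3 -m venv "$venv_dir" || return 1
    # Call the venv's own pip directly, so no activation step is needed.
    "$venv_dir/bin/pip" install --no-cache-dir "$package" || return 1
    echo "Installed $package; its executables are in $venv_dir/bin"
}

# Example call (assumed path and package name):
# install_cli_app "$HOME/.venvs/dbt" dbt-core
```

The app's executables then live under the venv's `bin` directory, so you can call them by full path or add that directory to `PATH`, while your library environment stays untouched.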
Streamlining CLI App Installation
There are two very good tools that make CLI app installation just as easy (or almost as easy) as normal pip install methods. The options below apply to all Python CLI apps, which includes tools like dbt and harlequin, as well as (optionally) preinstalling Airbyte connectors like airbyte-source-hubspot.
Using pipx
pipx is the original (to my knowledge) and the most widely used. In most cases, you can simply run pip install pipx and then pipx install my-tool. The pipx syntax is intentionally as similar as possible to the syntax of pip, so that many tools can be installed into their own dedicated virtual environment simply by replacing the word pip with pipx. (pipx now also comes standard on many Python images, so you might not need to pre-install it.)
Using uv and uvx
A newer tool called uv has a similar uvx or uv tool command which can be used similarly to pipx. It is newer and faster than pipx, but also less tested because it is (for now) less used.
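To make the two options concrete, here is a small sketch that installs a CLI app into its own environment with whichever of the two tools is available. The helper name `install_isolated` is hypothetical, and `harlequin` in the usage line is only an example package; the underlying commands are `pipx install` and `uv tool install`.

```shell
# Hypothetical helper: install a CLI app in its own isolated environment,
# preferring pipx and falling back to uv.
install_isolated() {
    tool="$1"
    if command -v pipx >/dev/null 2>&1; then
        # pipx creates a dedicated venv and exposes the app's executables.
        pipx install "$tool"
    elif command -v uv >/dev/null 2>&1; then
        # uv's equivalent of "pipx install".
        uv tool install "$tool"
    else
        echo "Neither pipx nor uv was found; install one of them first." >&2
        return 1
    fi
}

# Example call (assumed package name):
# install_isolated harlequin
```

Either way, the app ends up in its own environment, separate from the one where your libraries are installed.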
Common Installation Patterns
Docker-Based Pre-Installs
Some sample Dockerfile code from this comment, specifically around pre-installing connectors onto Docker images. Reported to me by a user:
The trick that worked in Airflow was to use a Dockerfile that handles the isolation of installing the connectors into their own virtualenvs:
# Pre-install the connector(s) in their own virtualenv.
# Calling the venv's pip directly avoids `source`, which is not available
# when the image's default shell is a plain /bin/sh.
RUN python -m venv source_github && \
    source_github/bin/pip install --no-cache-dir airbyte-source-github
# ... repeat for other connectors ...
# Test that the executable works and we can find it
RUN source_github/bin/source-github spec
# Go ahead and install PyAirbyte as usual
RUN python -m venv pyairbyte_venv && \
    pyairbyte_venv/bin/pip install --no-cache-dir airbyte==0.10.4
If pipx is preinstalled on the image, this is slightly easier:
# pipx handles the virtual env and exposes the connector CLI in its bin
# directory (~/.local/bin by default), which must be on PATH:
RUN pipx install airbyte-source-github
RUN pipx install airbyte-source-faker
# Test that the executables work and we can find them on PATH
RUN source-github spec
RUN source-faker spec
# Go ahead and install PyAirbyte as usual
# (the venv's pip is called directly, so no `source ... activate` is needed)
RUN python -m venv pyairbyte_venv && \
    pyairbyte_venv/bin/pip install --no-cache-dir airbyte==0.10.4
Installing dbt
Per this discussion: https://github.com/airbytehq/PyAirbyte/issues/441
Slightly more difficult than a normal pipx install, because it requires more than one package installed into the same virtual environment:
Related Issues: