reata / sqllineage

SQL Lineage Analysis Tool powered by Python
MIT License
1.35k stars 240 forks source link

`SQLAlchemy` optional dependency #624

Open sbrugman opened 6 months ago

sbrugman commented 6 months ago

Is your feature request related to a problem? Please describe. v1.5.0 introduced a dependency on SQLAlchemy >= 2.0.0. Unfortunately, this version conflicts which what recent version of Airflow require (<2). I'm not using the SQLAlchemy-based metadata provider.

Describe the solution you'd like The SQLAlchemy could be "feature-gated". In setup.py we could move the dependency to extras_require. The user can then install this dependency using pip install sqllineage[sqlalchemy] or similar. The core dependency is even lighter!


setup(
    name=NAME,
    version=VERSION,
    author="Reata",
    author_email="reddevil.hjw@gmail.com",
    description="SQL Lineage Analysis Tool powered by Python",
    long_description=long_description,
    long_description_content_type="text/markdown",
    url="https://github.com/reata/sqllineage",
    packages=find_packages(exclude=("tests",)),
    package_data={"": [f"{STATIC_FOLDER}/*", f"{STATIC_FOLDER}/**/**/*", "data/**/*"]},
    include_package_data=True,
    classifiers=[
        "Development Status :: 5 - Production/Stable",
        "License :: OSI Approved :: MIT License",
        "Operating System :: OS Independent",
        "Programming Language :: Python :: 3",
        "Programming Language :: Python :: 3.8",
        "Programming Language :: Python :: 3.9",
        "Programming Language :: Python :: 3.10",
        "Programming Language :: Python :: 3.11",
        "Programming Language :: Python :: 3.12",
        "Programming Language :: Python :: Implementation :: CPython",
    ],
    python_requires=">=3.8",
    install_requires=[
        "sqlparse==0.5.0",
        "networkx>=2.4",
        "sqlfluff==3.0.5",
    ],
    entry_points={"console_scripts": ["sqllineage = sqllineage.cli:main"]},
    extras_require={
        "ci": [
            "bandit",
            "black",
            "flake8",
            "flake8-blind-except",
            "flake8-builtins",
            "flake8-import-order",
            "flake8-logging-format",
            "mypy",
            "pytest",
            "pytest-cov",
            "tox",
            "twine",
            "wheel",
        ],
        "sqlalchemy": [
            "sqlalchemy>=2.0.0",
        ],
        "docs": ["Sphinx>=3.2.0", "sphinx_rtd_theme>=0.5.0"],
    },
    cmdclass={"egg_info": EggInfoWithJS},
)

sqllineage/core/metadata/sqlalchemy.py should only be imported when used, so that we get no import errors unless the module is directly imported.

Describe alternatives you've considered

Additional context Upgrading Airflow is not an option since another team is responsible.

reata commented 5 months ago

This decision is made intentionally to not support sqlalchemy v1.x (and as a result, making people suffer a bit).

Airflow and Superset still pin to sqlalchemy v1.x due to the fact that many 3rd party sqlalchemy dialect implementation still uses v1.x. Considering the maintenance effort, I don't want to build a compatible layer regarding sqlalchemy v1.x and v2.x API, though that would be more user-friendly. (this looks like what Airflow will do, see https://github.com/apache/airflow/issues/28723)

At the same time, even though sqlalchemy v2.x is marked as dependency, we're coding in an OOP way that if you don't use sqlalchemy metadata provider, your code will work even with sqlalchemy v1.x installed. (pip might remind you of incompatible dependency, but that's fine).

Let me know if this works before we discuss on the 'optional' approach.

sbrugman commented 5 months ago

I agree with the decision not to support sqlalchemy v1.x, no problem there.

When the build pipeline includes a pip compile (or similar) step like we have, there is no resolution in the current setting. I still think the 'optional' approach is closest to the desired behaviour: if the user would like the sqlalchemy metadata provider, then the version should be >= v2.x

cheunhong commented 4 months ago

Agree on this