fugue-project / fugue

A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rewrites.
https://fugue-tutorials.readthedocs.io/
Apache License 2.0
1.92k stars 94 forks

[FEATURE] Create Fugue pytest fixtures and plugins #504

Closed: goodwanghan closed this issue 6 months ago

goodwanghan commented 10 months ago

Projects that use Fugue, including Fugue itself, need to test against different backends. The current approach is to manually create Spark, Dask, Ray and DuckDB sessions, run the tests, and stop the sessions. Setting up those backends so that unit tests run fast is actually tricky. Another issue is that without a session-level pytest fixture, some sessions are started and stopped in every single test, adding significant overhead.

So we will create fixtures for Fugue and related projects. In the first release, we will have these hard-coded fixtures (configured for the best possible unit-test performance)

To use them, pip install fugue with the relevant extra. Here is an example:

import fugue.api as fa

def test_my_app(fugue_dask_client):
    with fa.engine_context(fugue_dask_client):
        """test"""

For the Spark backend, please keep using pytest-spark and the spark_session fixture it provides for now.