malexer / pytest-spark

pytest plugin to run tests with pyspark support
MIT License

alphabetical order of test files #7

Closed: hfwittmann closed this issue 6 years ago

hfwittmann commented 6 years ago

Good package, thank you!

In my tests, the alphabetical order of the test files has an impact on whether they pass.

For example, renaming the test files in this package to

test_1_spark_context_fixture.py test_2_spark_session_fixture.py

still works, but

test_2_spark_context_fixture.py test_1_spark_session_fixture.py

fails.
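A minimal sketch of that failing ordering, using the plugin's spark_context and spark_session fixtures (the file names and test bodies here are illustrative, not from the plugin's own test suite):

```python
# test_1_spark_session_fixture.py -- collected first alphabetically
def test_session(spark_session):
    # The session fixture starts a SparkSession, which implicitly
    # launches a SparkContext under the hood.
    assert spark_session is not None


# test_2_spark_context_fixture.py -- collected second; its fixture
# setup then fails because a SparkContext is already running
def test_context(spark_context):
    assert spark_context is not None
```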

Details:

The system is macOS.

uname -a yields: Darwin 127.0.0.1 17.5.0 Darwin Kernel Version 17.5.0: Mon Mar 5 22:24:32 PST 2018; root:xnu-4570.51.1~1/RELEASE_X86_64 x86_64

The Python version is 3.6.

The Spark version is 2.3, with spark_home=/usr/local/Cellar/apache-spark/2.3.0/libexec.

pytest --version

This is pytest version 3.5.1, setuptools registered plugins: pytest-spark-0.4.4

The error is:

==================================== ERRORS ====================================
____________ ERROR at setup of test_spark_context_fixture ____________

@pytest.fixture(scope='session')
def spark_context():
    """Return a SparkContext instance with reduced logging
    (session scope).
    """

    from pyspark import SparkContext
>   sc = SparkContext()

/usr/local/Cellar/apache-spark/2.3.0/libexec/python/pyspark/context.py:115: in __init__
    SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)


cls = <class 'pyspark.context.SparkContext'> instance = <[AttributeError("'SparkContext' object has no attribute 'master'") raised in repr()] SparkContext object at 0x113cad160> gateway = None, conf = None

@classmethod
def _ensure_initialized(cls, instance=None, gateway=None, conf=None):
    """
        Checks whether a SparkContext is initialized or not.
        Throws error if a SparkContext is already running.
        """
    with SparkContext._lock:
        if not SparkContext._gateway:
            SparkContext._gateway = gateway or launch_gateway(conf)
            SparkContext._jvm = SparkContext._gateway.jvm

        if instance:
            if (SparkContext._active_spark_context and
                    SparkContext._active_spark_context != instance):
                currentMaster = SparkContext._active_spark_context.master
                currentAppName = SparkContext._active_spark_context.appName
                callsite = SparkContext._active_spark_context._callsite

                # Raise error if there is already a running Spark context
                raise ValueError(
                    "Cannot run multiple SparkContexts at once; "
                    "existing SparkContext(app=%s, master=%s)"
                    " created by %s at %s:%s "
                    % (currentAppName, currentMaster,
                      callsite.function, callsite.file, callsite.linenum))

E ValueError: Cannot run multiple SparkContexts at once; existing SparkContext(app=pyspark-shell, master=local[*]) created by getOrCreate
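What the traceback suggests: the spark_session fixture has already started a SparkSession (its builder calls getOrCreate(), which launches a SparkContext), so the spark_context fixture's bare SparkContext() constructor then collides with it. The same collision can be reproduced outside pytest; a standalone sketch (master and app name are illustrative):

```python
from pyspark import SparkContext
from pyspark.sql import SparkSession

# Creating a session first implicitly starts a SparkContext via getOrCreate().
spark = SparkSession.builder.master("local[*]").appName("demo").getOrCreate()

try:
    sc = SparkContext()  # bare constructor: only one context per JVM is allowed
except ValueError as exc:
    print(exc)  # "Cannot run multiple SparkContexts at once; ..."

# Reusing the already-running context instead of constructing a new one works:
sc = SparkContext.getOrCreate()
assert sc is spark.sparkContext
```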

hfwittmann commented 6 years ago

I forgot to say that (in my tests) this generally happens when the session tests sort alphabetically before the context tests.

malexer commented 6 years ago

Fixed, please use the latest version of pytest-spark (0.4.5): pip install -U pytest-spark
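For anyone pinned to an older release, the usual way to make such fixtures order-independent is SparkContext.getOrCreate(). A minimal fixture sketch of that pattern (an assumption for illustration, not necessarily the plugin's actual implementation):

```python
import pytest


@pytest.fixture(scope='session')
def spark_context():
    """Return a SparkContext, reusing any context that is already running."""
    from pyspark import SparkContext

    # getOrCreate() sidesteps "Cannot run multiple SparkContexts at once"
    # when a session fixture has already started a context.
    sc = SparkContext.getOrCreate()
    yield sc
```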