Open gerashegalov opened 2 years ago
@gerashegalov Does this issue request to run all the integration test cases with the config PYSP_TEST_spark_rapids_force_caller_classloader=false?
Sorry, I am confused. Is this a bug report or a feature request? If the latter, can you elaborate on the ENVs, e.g. pytest-xdist + local cluster or vanilla IT against a standalone cluster? Can you share the expected commands to test in the different scenarios? And do we need to run all IT cases, or just some specific cases for multiple Spark shims?
Also we need some detailed ENV combinations to do the resource planning, thanks
@gerashegalov Does this issue request to run all the integration test cases with the config PYSP_TEST_spark_rapids_force_caller_classloader=false?
Yes, it should be parametrized at the Jenkins level because we cannot do it at the pytest level. We should try to make this support as generic as possible because there can be more settings like this.
Sorry, I am confused. Is this a bug report or a feature request?
It is a bug in the sense that we left this feature without continuous testing and it was broken by later PRs.
If the latter, can you elaborate on the ENVs, e.g. pytest-xdist + local cluster or vanilla IT against a standalone cluster? Can you share the expected commands to test in the different scenarios? And do we need to run all IT cases, or just some specific cases for multiple Spark shims?
Also we need some detailed ENV combinations to do the resource planning, thanks
Ideally, I would like another run of all the tests we have, with the ENV PYSP_TEST_spark_rapids_force_caller_classloader=false injected, at whatever cadence the capacity allows but at least weekly.
Describe the bug On a handful of occasions we needed to resort to setting spark.rapids.force.caller.classloader to the non-default value false as a workaround for bugs. However, we only have a smoke test enabled for this configuration. Without a full test pipeline run against this config, we ended up incurring a few regressions over time.
Steps/Code to reproduce bug Manually run pytest-xdist with pseudo-distributed standalone local-cluster via:
and observe failures like:
Expected behavior We should not have exceptions with supported options. Unfortunately, this is one of the options that can't be covered via test parametrization because it needs to be applied before the pytest Spark app is launched. More generally, it may be an epic to identify more pre-pytest-launch settings like this.
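A minimal sketch of what "pre-pytest-launch" injection could look like in a generic form, assuming the env-variable naming seen in this thread (the prefix PYSP_TEST_ followed by the Spark conf key with dots replaced by underscores). The helper names `env_for_conf` and `build_env` and the `CONFIG_MATRIX` list are hypothetical and used only for illustration; this is not the project's actual CI code.

```python
import os

# Prefix taken from the env variable quoted in this issue.
PREFIX = "PYSP_TEST_"


def env_for_conf(conf_name):
    """Map a Spark conf key to the env-variable form used in this thread."""
    return PREFIX + conf_name.replace(".", "_")


def build_env(pre_launch_confs, base_env=None):
    """Return a copy of base_env with every conf injected as an env variable.

    Booleans are rendered as lowercase 'true'/'false'.
    """
    env = dict(os.environ if base_env is None else base_env)
    for name, value in pre_launch_confs.items():
        if isinstance(value, bool):
            value = str(value).lower()
        env[env_for_conf(name)] = str(value)
    return env


# Hypothetical matrix: one full test run per entry.
CONFIG_MATRIX = [
    {},  # default settings
    {"spark.rapids.force.caller.classloader": False},
]

if __name__ == "__main__":
    for confs in CONFIG_MATRIX:
        # Show only the injected variables for each matrix entry.
        print(build_env(confs, base_env={}))
```

A CI job could then pass each resulting environment to whatever command launches the integration tests (for example through the `env` argument of `subprocess.run`), so the setting is already in place before the Spark app starts. Adding another pre-launch setting would only mean adding another entry to the matrix.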
Environment details (please complete the following information) local-cluster, Standalone cluster anywhere
Additional context
5646