Open j-carson opened 3 years ago
Nice catch @j-carson !
Unfortunately, I was not able to reproduce this with the above code. Could you please let me know how to reproduce it? (Note: I tried replacing it with warnings.simplefilter("error"), but that does not seem to trigger it either.)
My pandas version is 1.3.2 and my Python version is 3.8.
Alternatively, can you be more precise about the file/line where the warning happens? Thanks!
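For readers following along, here is a minimal sketch of what promoting a warning to an error looks like with the standard `warnings` module. The functions `noisy` and `run_strict` are hypothetical stand-ins, not part of pytest-steps or pandas; the warning text is copied from the message reported in this thread.

```python
import warnings


def noisy():
    """Hypothetical stand-in for library code that emits a FutureWarning."""
    warnings.warn("merging between different levels is deprecated", FutureWarning)
    return "done"


def run_strict(func):
    """Call func with FutureWarning promoted to an exception."""
    with warnings.catch_warnings():
        # Inside this block, any FutureWarning is raised instead of printed.
        warnings.simplefilter("error", FutureWarning)
        try:
            return func()
        except FutureWarning as exc:
            return f"raised: {exc}"


result = run_strict(noisy)
print(result)
```

When running under pytest, the command-line form `pytest -W error::FutureWarning` may be a simpler way to get the same effect, since the resulting traceback then points at the file and line that emitted the warning.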
Are you sure you can't reproduce? I just created a new environment with miniconda and followed the instructions on the readme. Running "nox" I definitely see two warnings...
=============================== warnings summary ===============================
pytest_steps/tests/test_docs_example_with_harvest.py::test_synthesis_df
pytest_steps/tests/test_steps_harvest.py::test_synthesis
/Users/jlc/steps/python-pytest-steps/.nox/tests-3-9-env-pytest-latest/lib/python3.9/site-packages/pandas/core/frame.py:9126: FutureWarning: merging between different levels is deprecated and will be removed in a future version. (1 levels on the left,2 on the right)
return merge(
I tried on two existing environments with the latest version of pandas and could not see this :( I'll try again tomorrow.
I tried to paste all my nox output in here, but it was too big. Edit to add: it’s in the nox output of the currently open PR.
No worries. Note that I hacked nox for my projects so that you get a nice log for each job under .nox/_runlogs, so you can access the file corresponding to that specific session there if needed.
Also, I finally managed to reproduce it :D As you suggested, reusing an existing env was not sufficient, but creating a new one worked. This probably relates to a package version difference somewhere.
I'll flatten as you suggest, hoping that this will not have any other side effects.
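To narrow down a version difference between two environments, one option is to print the installed versions in each and compare them. This is only a sketch: the helper name `installed_versions` and the list of packages are assumptions, chosen because they appear in this thread.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Run this in both the old and the freshly created environment, then compare.
report = installed_versions(["pandas", "pytest", "pytest-harvest", "pytest-steps"])
for name, version in report.items():
    print(f"{name}: {version}")
```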
This is failing with current versions of Pandas:
_____________________ ERROR at setup of test_synthesis_df ______________________
request = <SubRequest 'module_results_df_steps_pivoted' for <Function test_synthesis_df>>
module_results_df = pytest_obj ... accuracy
test_id s...05
score <function test_my_app_bench at 0x7f8eb490c5e0> ... NaN
[12 rows x 7 columns]
@pytest.fixture(scope='function')
def module_results_df_steps_pivoted(request, module_results_df):
"""
A pivoted version of fixture `module_results_df` from pytest_harvest.
In this version, there is one row per test with the results from all steps in columns.
"""
# Handle the steps
module_results_df = handle_steps_in_results_df(module_results_df, keep_orig_id=False)
# Pivot
> return pivot_steps_on_df(module_results_df, pytest_session=request.session)
pytest_steps/plugin.py:32:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pytest_steps/steps_harvest_df_utils.py:86: in pivot_steps_on_df
return remaining_df.join(one_per_step_df)
/usr/lib64/python3.12/site-packages/pandas/core/frame.py:10730: in join
return merge(
/usr/lib64/python3.12/site-packages/pandas/core/reshape/merge.py:170: in merge
op = _MergeOperation(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <pandas.core.reshape.merge._MergeOperation object at 0x7f8e9c342bd0>
left = pytest_obj ... dataset_param
test_id ... C
test_my_app_bench[C-2] <function test_my_app_bench at 0x7f8eb490c5e0> ... C
[6 rows x 3 columns]
right = step_id train ... score
status duration_ms ....116760 my dataset #C
test_my_app_bench[C-2] passed 0.132810 ... 0.108233 my dataset #C
[6 rows x 7 columns]
how = 'left', on = None, left_on = None, right_on = None, left_index = True
right_index = True, sort = False, suffixes = ('', ''), indicator = False
validate = None
def __init__(
self,
left: DataFrame | Series,
right: DataFrame | Series,
how: JoinHow | Literal["asof"] = "inner",
on: IndexLabel | AnyArrayLike | None = None,
left_on: IndexLabel | AnyArrayLike | None = None,
right_on: IndexLabel | AnyArrayLike | None = None,
left_index: bool = False,
right_index: bool = False,
sort: bool = True,
suffixes: Suffixes = ("_x", "_y"),
indicator: str | bool = False,
validate: str | None = None,
) -> None:
_left = _validate_operand(left)
_right = _validate_operand(right)
self.left = self.orig_left = _left
self.right = self.orig_right = _right
self.how = how
self.on = com.maybe_make_list(on)
self.suffixes = suffixes
self.sort = sort or how == "outer"
self.left_index = left_index
self.right_index = right_index
self.indicator = indicator
if not is_bool(left_index):
raise ValueError(
f"left_index parameter must be of type bool, not {type(left_index)}"
)
if not is_bool(right_index):
raise ValueError(
f"right_index parameter must be of type bool, not {type(right_index)}"
)
# GH 40993: raise when merging between different levels; enforced in 2.0
if _left.columns.nlevels != _right.columns.nlevels:
msg = (
"Not allowed to merge between different levels. "
f"({_left.columns.nlevels} levels on the left, "
f"{_right.columns.nlevels} on the right)"
)
> raise MergeError(msg)
E pandas.errors.MergeError: Not allowed to merge between different levels. (1 levels on the left, 2 on the right)
/usr/lib64/python3.12/site-packages/pandas/core/reshape/merge.py:784: MergeError
=================================== FAILURES ===================================
________________________________ test_synthesis ________________________________
request = <FixtureRequest for <Function test_synthesis>>
fixture_store = OrderedDict({'dataset': OrderedDict({'pytest_steps/tests/test_docs_example_with_harvest.py::test_my_app_bench[A-1-trai...cy': 0.46894857698850767}, 'pytest_steps/tests/test_steps_harvest.py::test_my_app_bench[C-2-score]': ResultsBag:
{}})})
def test_synthesis(request, fixture_store):
"""
Tests that users can create a pivoted syntesis table manually by combining pytest-harvest and pytest-steps.
Note: we could do this at many other places (hook, teardown of a session-scope fixture...)
"""
# Get session synthesis
# - filtered on the test function of interest
# - combined with default fixture store and results bag
results_dct = get_session_synthesis_dct(request, filter=test_synthesis.__module__,
durations_in_ms=True, test_id_format='function', status_details=False,
fixture_store=fixture_store, flatten=True, flatten_more='results_bag')
# We could use this function to perform the test id split here, but we will do it directly on the df
# results_dct = handle_steps_in_results_dct(results_dct, is_flat=True, keep_orig_id=False)
# convert to a pandas dataframe
results_df = pd.DataFrame.from_dict(results_dct, orient='index')
results_df = results_df.loc[list(results_dct.keys()), :] # fix rows order
results_df.index.name = 'test_id'
# results_df.index.names = ['test_id', 'step_id'] # set multiindex names
results_df.drop(['pytest_obj'], axis=1, inplace=True) # drop pytest object column
# extract the step id and replace the index by a multiindex
results_df = handle_steps_in_results_df(results_df, keep_orig_id=False)
# Pivot but do not raise an error if one of the above columns is not present - just in case.
> pivoted_df = pivot_steps_on_df(results_df, pytest_session=request.session)
pytest_steps/tests/test_steps_harvest.py:86:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pytest_steps/steps_harvest_df_utils.py:86: in pivot_steps_on_df
return remaining_df.join(one_per_step_df)
/usr/lib64/python3.12/site-packages/pandas/core/frame.py:10730: in join
return merge(
/usr/lib64/python3.12/site-packages/pandas/core/reshape/merge.py:170: in merge
op = _MergeOperation(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <pandas.core.reshape.merge._MergeOperation object at 0x7f8e9ce94bc0>
left = algo_param dataset_param
test_id
test_my_app_bench[A-... 1.0 C
test_my_app_bench[C-2] 2.0 C
test_basic NaN NaN
right = step_id train ... -
status duration_ms ... s...8083 ... NaN NaN
test_my_app_bench[C-2] passed 0.110367 ... NaN NaN
[7 rows x 9 columns]
how = 'left', on = None, left_on = None, right_on = None, left_index = True
right_index = True, sort = False, suffixes = ('', ''), indicator = False
validate = None
def __init__(
self,
left: DataFrame | Series,
right: DataFrame | Series,
how: JoinHow | Literal["asof"] = "inner",
on: IndexLabel | AnyArrayLike | None = None,
left_on: IndexLabel | AnyArrayLike | None = None,
right_on: IndexLabel | AnyArrayLike | None = None,
left_index: bool = False,
right_index: bool = False,
sort: bool = True,
suffixes: Suffixes = ("_x", "_y"),
indicator: str | bool = False,
validate: str | None = None,
) -> None:
_left = _validate_operand(left)
_right = _validate_operand(right)
self.left = self.orig_left = _left
self.right = self.orig_right = _right
self.how = how
self.on = com.maybe_make_list(on)
self.suffixes = suffixes
self.sort = sort or how == "outer"
self.left_index = left_index
self.right_index = right_index
self.indicator = indicator
if not is_bool(left_index):
raise ValueError(
f"left_index parameter must be of type bool, not {type(left_index)}"
)
if not is_bool(right_index):
raise ValueError(
f"right_index parameter must be of type bool, not {type(right_index)}"
)
# GH 40993: raise when merging between different levels; enforced in 2.0
if _left.columns.nlevels != _right.columns.nlevels:
msg = (
"Not allowed to merge between different levels. "
f"({_left.columns.nlevels} levels on the left, "
f"{_right.columns.nlevels} on the right)"
)
> raise MergeError(msg)
E pandas.errors.MergeError: Not allowed to merge between different levels. (1 levels on the left, 2 on the right)
/usr/lib64/python3.12/site-packages/pandas/core/reshape/merge.py:784: MergeError
At line 86 of pytest_steps/steps_harvest_df_utils.py, the columns have a single-level index on the left and a two-level index on the right. This is causing a pandas deprecation warning (a FutureWarning).
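The situation described above can be sketched in isolation with two small frames. The data and the helper name `try_join` are made up for illustration; only the shape matters (one column level on the left, two on the right). Depending on the pandas version, the join is expected to either emit the FutureWarning or raise `MergeError`, so the sketch handles both.

```python
import warnings

import pandas as pd

# Left frame: ordinary single-level columns.
left = pd.DataFrame({"dataset_param": ["A", "C"]}, index=["test_a", "test_c"])

# Right frame: two-level columns, similar to what a per-step pivot produces.
right = pd.DataFrame(
    [["passed", 0.13], ["passed", 0.11]],
    index=["test_a", "test_c"],
    columns=pd.MultiIndex.from_tuples(
        [("train", "status"), ("train", "duration_ms")]
    ),
)


def try_join(left_df, right_df):
    """Describe how the installed pandas reacts to this join."""
    try:
        with warnings.catch_warnings():
            # Promote the deprecation warning to an exception so it is visible.
            warnings.simplefilter("error", FutureWarning)
            left_df.join(right_df)
    except pd.errors.MergeError as exc:
        return f"MergeError: {exc}"
    except FutureWarning as exc:
        return f"FutureWarning: {exc}"
    return "joined without complaint"


outcome = try_join(left, right)
print(left.columns.nlevels, right.columns.nlevels)
print(outcome)
```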
Test case: insert the following into tests/test_steps_harvest.py at line 64 and run the library test suite.
You could perhaps fix the warning with the flatten_multilevel_columns function, but the column name change might affect existing tests.
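A rough sketch of the flattening idea, using plain pandas rather than the library's own flatten_multilevel_columns (whose exact behavior is not shown in this thread). The helper `flatten_columns`, the `/` separator, and the sample data are all assumptions for illustration.

```python
import pandas as pd


def flatten_columns(df, sep="/"):
    """Hypothetical helper: collapse multi-level columns into single-level labels."""
    if df.columns.nlevels == 1:
        return df
    flat = df.copy()
    # Each multi-level column label is a tuple such as ("train", "status").
    flat.columns = [sep.join(str(part) for part in col) for col in df.columns]
    return flat


left = pd.DataFrame({"dataset_param": ["A", "C"]}, index=["test_a", "test_c"])
right = pd.DataFrame(
    [["passed", 0.13], ["passed", 0.11]],
    index=["test_a", "test_c"],
    columns=pd.MultiIndex.from_tuples(
        [("train", "status"), ("train", "duration_ms")]
    ),
)

# Both sides now have one column level, so the join no longer mixes levels.
joined = left.join(flatten_columns(right))
print(list(joined.columns))
```

As noted, the resulting column names change from tuples to joined strings, so any test that selects columns by tuple would need updating.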