Open znicholls opened 1 year ago
There is a commit in #214 that may or may not be related.
@mikapfl suggested that we also expose a ScmRun object with some known data. It can then be used in the docs etc
@znicholls I'd also propose we flip the logic to be similar to pandas etc. Instead of "allow_unordered" the option would be "check_order" (defaults to False instead of True?) or perhaps "check_like" which is the name pandas uses.
The `check_ts_names` parameter isn't really used. It could be replaced with a more general "check_exact" option which checks that everything is exactly the same: metadata, data ordering, values and data indices are all identical. Otherwise it could be dropped.
Also I'd propose adding a `check_exact_units` option that defaults to False. With the default, the check is whether the units are pint equivalent; when it is True, the units are compared as strings (the current behaviour).
How about an initial signature of:

    def assert_scmrun_almost_equal(
        left: BaseScmRun,
        right: BaseScmRun,
        check_order: bool = False,
        check_exact: bool = False,
        check_exact_units: bool = False,
        rtol: float = 1.0e-5,
        atol: float = 1.0e-8,
    ) -> None:
        ...

We could also leave old function as is with an additional deprecation warning and add the new implementation separately. That way, it isn't a breaking change
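For illustration, a minimal sketch of that non-breaking route. The `deprecated` decorator is a hypothetical helper (not something scmdata is known to provide), and the body of `assert_scmdf_almost_equal` here is only a stand-in for the existing implementation, which would stay untouched:

```python
import functools
import warnings


def deprecated(replacement: str):
    """Wrap a function so each call emits a DeprecationWarning first.

    Hypothetical helper, written only for this sketch.
    """

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated, use {replacement} instead",
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller, not the wrapper
            )
            return func(*args, **kwargs)

        return wrapper

    return decorator


@deprecated("assert_scmrun_almost_equal")
def assert_scmdf_almost_equal(left, right, **kwargs):
    # Stand-in for the existing body; the real comparison logic is unchanged.
    if left != right:
        raise AssertionError("inputs differ")
```

Existing callers keep working and only see a warning, while the new function can take whatever signature we settle on.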
Copying pandas' names as much as possible will reduce cognitive overload, I think, so I would go with that.
I don't really understand why we have any idea of ordering. Our data model doesn't make any guarantees about ordering, does it? So why would our tests imply that it is something we support?
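To make the point concrete, here is a rough sketch of an order-free comparison. It assumes a toy representation (a list of `(metadata, values)` pairs) rather than the real `BaseScmRun` internals, and `as_lookup` is a hypothetical helper:

```python
def as_lookup(rows):
    """Index each timeseries by its metadata so row order no longer matters."""
    return {frozenset(meta.items()): tuple(values) for meta, values in rows}


# Two toy "runs" holding the same timeseries in a different row order.
left = [
    ({"variable": "Emissions|CO2", "unit": "GtC / yr"}, [1.0, 2.0]),
    ({"variable": "Surface Temperature", "unit": "K"}, [0.1, 0.2]),
]
right = list(reversed(left))
```

Here `left != right` as lists, but `as_lookup(left) == as_lookup(right)`, which is the behaviour an ordering-agnostic data model would suggest.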
Given the above, I'd pull out `check_order` and go with something like the following (which also includes the learnings from #214):

    from typing import Any

    def assert_scmrun_almost_equal(
        left: BaseScmRun,
        right: BaseScmRun,
        rtol: float = 1.0e-5,
        atol: float = 1.0e-8,
        equal_nan: bool = False,
        check_exact_units: bool = False,
        time_axis: str | None = None,
        # this gives the user full control to use whatever pandas functionality they want
        # in the timeseries comparison step
        assert_frame_equal_kwargs: dict[str, Any] | None = None,
    ) -> None:
        ...

> We could also leave old function as is with an additional deprecation warning and add the new implementation separately. That way, it isn't a breaking change
Very smart, let's do that.
Is your feature request related to a problem? Please describe.
The `assert_scmdf_almost_equal` function is a bit messier and harder to use than is necessary.

Describe the solution you'd like
We clean up the function. Immediate things to do:
- `assert_scmrun_almost_equal`
- `allow_unordered` and `check_ts_names` arguments are still relevant. If they are, try to better capture what they do
- `BaseScmRun`
- `pdt.assert_frame_equal` and we can remove `npt.assert_allclose` (in theory, the pandas function also uses assert all close so this feels like it should be possible, we may have issues with the time point comparisons of course)

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.
Additional context
This would be a breaking change, so would need appropriate deprecation warnings (probably over at least two minor releases)