Open andrewphilipsmith opened 1 year ago
Expected behaviour

Each call to `source.get_aggregation` should return a new, independent dataframe. The order of the tests should not matter.
How to reproduce

As a minimal example, this loop passes:

    for thread_id in [211, 97, 41]:
        by_thread_all = source.get_aggregation(
            entity_group_by="thread_id",
            time_grouper=day_grouper,
            entity_permitted_values=[thread_id],
        )
However, simply reordering the list makes the same loop fail:

    for thread_id in [97, 41, 211]:
        by_thread_all = source.get_aggregation(
            entity_group_by="thread_id",
            time_grouper=day_grouper,
            entity_permitted_values=[thread_id],
        )
Actual result

The second example above results in `ValueError: all keys need to be the same shape`, which is raised by the underlying `grouped.count()` method.
Currently `test_prep_gptchat_by_user_source` is marked with `@pytest.mark.xfail`.
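The reordering in the reproduction can be generalised into a check over every ordering of the thread IDs. The following is only a sketch under assumptions: `find_failing_orders` is a hypothetical helper, and `make_toy_aggregate` is a toy stand-in that deliberately leaks state between calls. It is not the real `source.get_aggregation` and does not diagnose the actual cause. In the project, the factory would presumably build a fresh source and return a function that calls `source.get_aggregation(..., entity_permitted_values=[thread_id])`.

```python
from itertools import permutations


def find_failing_orders(make_aggregate, values):
    """Run one aggregation per value, for every ordering of `values`.

    `make_aggregate` is called once per ordering so that each ordering
    starts from fresh state. Returns (ordering, exception) pairs for the
    orderings that raised ValueError.
    """
    failures = []
    for ordering in permutations(values):
        aggregate = make_aggregate()
        try:
            for value in ordering:
                aggregate(value)
        except ValueError as exc:
            failures.append((ordering, exc))
    return failures


def make_toy_aggregate():
    """Toy stand-in (hypothetical): state from earlier calls leaks into later ones."""
    seen = []  # shared mutable state that survives between calls

    def aggregate(thread_id):
        # Invented rule, purely to imitate an order-dependent failure.
        if seen and thread_id == 211:
            raise ValueError("all keys need to be the same shape")
        seen.append(thread_id)
        return [thread_id]

    return aggregate


failures = find_failing_orders(make_toy_aggregate, [211, 97, 41])
failing_orders = [ordering for ordering, _ in failures]
```

If calls were truly independent, `failures` would be empty for every ordering, which is the property the expected behaviour asks for.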