Maria-Liakata-NLP-Group / long

"LoNG" is a web user interface for performing longitudinal NLP analysis.

Bug: prep_gptchat_by_user_source fails for unknown reason #64

Open andrewphilipsmith opened 1 year ago

andrewphilipsmith commented 1 year ago

Expected behaviour

Each call to source.get_aggregation should return a new independent dataframe. The order of the tests should not matter.
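
The expected behaviour can be phrased as a property check: the result for a given thread_id should be the same no matter which calls preceded it. The following is a minimal sketch under stated assumptions, not the project's code. LeakySource, IndependentSource, find_order_dependence and the row data are hypothetical stand-ins, and get_aggregation here takes only entity_permitted_values and returns a row count instead of a dataframe. The cause of the real failure is unknown; the leaky stand-in only illustrates one way in which calls could fail to be independent.

```python
from itertools import permutations


class LeakySource:
    """Hypothetical stand-in: filters its rows in place, so state leaks between calls."""

    def __init__(self):
        self.rows = [(211, "a"), (97, "b"), (41, "c"), (97, "d")]

    def get_aggregation(self, entity_permitted_values):
        # Bug for illustration: overwrites the shared rows with the filtered subset.
        self.rows = [r for r in self.rows if r[0] in entity_permitted_values]
        return len(self.rows)


class IndependentSource(LeakySource):
    """Hypothetical stand-in: filters a local copy and leaves shared rows untouched."""

    def get_aggregation(self, entity_permitted_values):
        rows = [r for r in self.rows if r[0] in entity_permitted_values]
        return len(rows)


def find_order_dependence(make_source, ids, call, same=lambda a, b: a == b):
    """Return (ordering, id) for the first order-dependent result, or None."""
    # Baseline: each id queried on a fresh source with no earlier calls.
    baseline = {i: call(make_source(), i) for i in ids}
    for ordering in permutations(ids):
        source = make_source()  # fresh source per ordering
        for i in ordering:
            if not same(call(source, i), baseline[i]):
                return ordering, i
    return None


def call(source, thread_id):
    return source.get_aggregation(entity_permitted_values=[thread_id])


ids = [211, 97, 41]
leaky_result = find_order_dependence(LeakySource, ids, call)
independent_result = find_order_dependence(IndependentSource, ids, call)
print(leaky_result)
print(independent_result)
```

Against the real source, call would presumably wrap source.get_aggregation with the entity_group_by and time_grouper arguments shown in this report, and same would compare dataframes, for example with pandas' DataFrame.equals, since == on dataframes does not yield a single boolean.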

How to reproduce

As a minimal example, the following loop passes:

for thread_id in [211, 97, 41]:
    by_thread_all = source.get_aggregation(
        entity_group_by="thread_id",
        time_grouper=day_grouper,
        entity_permitted_values=[thread_id],
    )

However, simply reordering the list makes it fail:

for thread_id in [97, 41, 211]:
    by_thread_all = source.get_aggregation(
        entity_group_by="thread_id",
        time_grouper=day_grouper,
        entity_permitted_values=[thread_id],
    )

Actual Result

The second example above raises "ValueError: all keys need to be the same shape", which comes from the underlying grouped.count() call.

andrewphilipsmith commented 1 year ago

test_prep_gptchat_by_user_source is currently marked with @pytest.mark.xfail.