ihmeuw-msca / pyDisagg

pydisagg is a Python package for disaggregating estimated count observations across groups under generalized proportionality assumptions.
BSD 2-Clause "Simplified" License

Check if elements in cat_group are unique #81

Closed saalUW closed 1 month ago

saalUW commented 1 month ago

When parsing data, check whether the elements in cat_group are unique.

saalUW commented 1 month ago

This could be a solution:

import pandas as pd

# Sample DataFrame
data = {"lists": [[1, 2, 3], [7, 43], [1, 2, 1], [10, 11, 12], [8, 8]]}
df = pd.DataFrame(data)

# Function to check if all elements in the list are unique
def is_unique(lst):
    return len(lst) == len(set(lst))

# Apply the function to the DataFrame
df["is_unique"] = df["lists"].apply(is_unique)

# Filter rows where the list has duplicates (non-unique)
non_unique_rows = df[~df["is_unique"]]

# Display non-unique rows
print(non_unique_rows)
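
If the check should fail loudly at parse time rather than just print the offending rows, the same idea could be wrapped in a small validator. This is only a sketch: the function names `find_duplicate_groups` and `validate_unique_groups` are hypothetical and not part of pydisagg, and it assumes the column is named `cat_group` and holds a list of hashable elements in each row.

```python
import pandas as pd


def find_duplicate_groups(df: pd.DataFrame, col: str = "cat_group") -> pd.DataFrame:
    """Return the rows whose list in `col` contains repeated elements.

    Hypothetical helper; assumes each cell is a list of hashable items.
    """
    has_duplicates = df[col].apply(lambda lst: len(lst) != len(set(lst)))
    return df[has_duplicates]


def validate_unique_groups(df: pd.DataFrame, col: str = "cat_group") -> None:
    """Raise ValueError if any list in `col` has repeated elements."""
    bad = find_duplicate_groups(df, col)
    if not bad.empty:
        raise ValueError(
            f"Found duplicate elements in '{col}' at rows: {bad.index.tolist()}"
        )
```

Calling `validate_unique_groups(df)` at the start of parsing would stop on the first bad input and report the row labels, which may be easier to act on than a filtered DataFrame.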