Open rsundqvist opened 8 months ago
The function below works but is limited.

    import pandas as pd
    from id_translation.offline import TranslationMap


    def translate_as_categories(df: pd.DataFrame, tmap: TranslationMap) -> pd.DataFrame:
        from id_translation.dio import resolve_io

        dtypes = {
            # sort_index() to ensure ordering by ID
            column: pd.CategoricalDtype(pd.Series(tmap[column]).sort_index(), ordered=True)
            for column in df
        }
        return resolve_io(df).insert(df, names=list(df), tmap=tmap, copy=False).astype(dtypes)

This is not very convenient, though, and it requires some knowledge of internal id_translation types.
Setup

    >>> data = {1999: "Sofia", 1991: "Richard"}
    >>> from id_translation import Translator
    >>> translator = Translator({"people": data})
    >>> translator
    Translator(online=False: cache=TranslationMap('people': 2 IDs))

Create data

    >>> df = pd.Series(list(data)).to_frame("people")
    >>> df = df.sample(4, replace=True).reset_index(drop=True)
    >>> df.T
    people  1999  1999  1991  1999

Apply

    >>> df = translate_as_categories(df, translator.cache)

Result

    >>> df.T
    people  1999:Sofia  1999:Sofia  1991:Richard  1999:Sofia
    >>> df["people"].dtype
    CategoricalDtype(categories=['1991:Richard', '1999:Sofia'], ordered=True, categories_dtype=object)

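For readers without id_translation at hand, the dtype part of the result can be approximated with pandas alone. This is only a rough sketch: the `translations` dict is a hypothetical stand-in for `tmap["people"]`, and its `id:name` format is copied from the output above rather than taken from the library.

```python
import pandas as pd

data = {1999: "Sofia", 1991: "Richard"}

# Hypothetical stand-in for tmap["people"]; the "id:name" format is an
# assumption based on the output shown in the Result section.
translations = {idx: f"{idx}:{name}" for idx, name in data.items()}

# Sort by ID before building the dtype, so that the category order
# follows the IDs rather than the translated labels.
categories = pd.Series(translations).sort_index()
dtype = pd.CategoricalDtype(categories, ordered=True)

ids = pd.Series([1999, 1999, 1991, 1999], name="people")
translated = ids.map(translations).astype(dtype)

print(translated.dtype)
print(translated.sort_values().tolist())
```

Because the categories are ordered by ID, `sort_values()`, `min()` and `max()` follow the ID order. With IDs such as 9 and 10, a plain string sort would place the label for 10 before the label for 9, which is presumably why the function sorts the index first.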
Maybe it's enough to add this to the documentation/examples.
Issues with naïve solution