`reduce_memory_usage()`: `np.float16` Precision and Decimal Rounding

business-science / pytimetk

Time series easier, faster, more fun. Pytimetk.

https://business-science.github.io/pytimetk/

MIT License


`reduce_memory_usage()`: `np.float16` Precision and Decimal Rounding #274

Closed mdancho84 closed 1 year ago

mdancho84 commented 1 year ago

Problem:

When `np.float16` is applied inside the `tk.reduce_memory_usage()` function, it has odd precision effects on float columns.

Example:

```python
import pandas as pd
import pytimetk as tk
import numpy as np

data = {'Group': ['A', 'A', 'B', 'B', 'C', 'C'],
        'Value': [10.8, 15.2, 10.3, 13.9, 5.2, 7.1]}
df = pd.DataFrame(data)

# Downcast column dtypes to reduce memory
df_result = tk.reduce_memory_usage(df)

# Original data
df.glimpse()

# After downcasting
df_result.glimpse()
```

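The output shows that the downcast to `np.float16` alters the decimal digits of every numeric value, because `float16` cannot represent values such as 10.8 exactly (the nearest representable value is 10.796875).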
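The effect can be reproduced with NumPy alone, without going through `tk.reduce_memory_usage()`. The snippet below is a minimal sketch that casts the same `Value` data to `np.float16` and `np.float32` and compares the rounding error; the variable names are illustrative and not taken from pytimetk.

```python
import numpy as np

# Same numbers as the 'Value' column in the example above.
values = np.array([10.8, 15.2, 10.3, 13.9, 5.2, 7.1], dtype=np.float64)

as_f16 = values.astype(np.float16)
as_f32 = values.astype(np.float32)

# Absolute error introduced by each downcast, measured in float64.
err_f16 = np.abs(as_f16.astype(np.float64) - values)
err_f32 = np.abs(as_f32.astype(np.float64) - values)

for v, h, e16, e32 in zip(values, as_f16, err_f16, err_f32):
    print(f"{float(v)} -> float16 {float(h)}  err16={e16:.2e}  err32={e32:.2e}")
```

`float16` keeps only about three significant decimal digits, so the error shows up in the second or third decimal place, while the `float32` error stays below one part in a million for values of this size.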

mdancho84 commented 1 year ago

I've changed the minimum float type to `float32`.
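For readers who want the same behaviour outside the library, here is a rough sketch of a float downcast with a `float32` floor. `downcast_floats` and its `min_float` parameter are hypothetical names used for illustration; this is not pytimetk's actual implementation, and it assumes the values fit within the `float32` range.

```python
import numpy as np
import pandas as pd

def downcast_floats(df: pd.DataFrame, min_float=np.float32) -> pd.DataFrame:
    """Hypothetical helper: shrink float columns, but never below min_float."""
    out = df.copy()
    target = np.dtype(min_float)
    for col in out.columns:
        if pd.api.types.is_float_dtype(out[col]):
            # Only cast columns that are wider than the floor dtype.
            if out[col].dtype.itemsize > target.itemsize:
                out[col] = out[col].astype(target)
    return out

df = pd.DataFrame({'Group': ['A', 'A', 'B', 'B', 'C', 'C'],
                   'Value': [10.8, 15.2, 10.3, 13.9, 5.2, 7.1]})
df_small = downcast_floats(df)
print(df_small.dtypes)
```

With `float32` as the floor, the `Value` column still uses half the memory of `float64`, and the rounding error is far too small to show at the precision of the example data.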