gusutabopb / aioinflux

Asynchronous Python client for InfluxDB
MIT License
159 stars 31 forks source link

Serialization of pd.NA #37

Open goncas23 opened 3 years ago

goncas23 commented 3 years ago

When trying to write an integer64 field, I was getting an error due to the presence of missing values. The missing values were in the form of pd.NA, rather than np.nan and they were not being excluded in the serialization.

I made an attempt to fix this and it worked, though might not be the most elegant solution. In the _replace function, I added a new replacement tuple to the list of replacements, very similar to the one that handles the nans:

def _replace(df):
    obj_cols = {k for k, v in dict(df.dtypes).items() if v is np.dtype('O')}
    other_cols = set(df.columns) - obj_cols
    obj_nans = (f'{k}="nan"' for k in obj_cols)
    other_nans = (f'{k}=nani?' for k in other_cols)
    obj_nas = (f'{k}="<NA>"' for k in obj_cols)
    other_nas = (f'{k}=<NA>i?' for k in other_cols)
    replacements = [
        ('|'.join(chain(obj_nans, other_nans)), ''),
        ('|'.join(chain(obj_nas, other_nas)), ''),
        (',{2,}', ','),
        ('|'.join([', ,', ', ', ' ,']), ' '),
    ]
    return replacements

Hope this ends up helping someone