When trying to write an integer64 field, I was getting an error due to the presence of missing values. The missing values were in the form of pd.NA rather than np.nan, and they were not being excluded during serialization.
I made an attempt to fix this and it worked, though it might not be the most elegant solution. In the _replace function, I added a new replacement tuple to the list of replacements, very similar to the one that handles the nans:
from itertools import chain

import numpy as np


def _replace(df):
    # Object (string) columns get quoted patterns; all other columns do not.
    obj_cols = {k for k, v in dict(df.dtypes).items() if v == np.dtype('O')}
    other_cols = set(df.columns) - obj_cols
    # Patterns for np.nan values.
    obj_nans = (f'{k}="nan"' for k in obj_cols)
    other_nans = (f'{k}=nani?' for k in other_cols)
    # New: patterns for pd.NA values, which render as <NA>.
    obj_nas = (f'{k}="<NA>"' for k in obj_cols)
    other_nas = (f'{k}=<NA>i?' for k in other_cols)
    replacements = [
        ('|'.join(chain(obj_nans, other_nans)), ''),
        ('|'.join(chain(obj_nas, other_nas)), ''),  # new tuple for pd.NA
        (',{2,}', ','),
        ('|'.join([', ,', ', ', ' ,']), ' '),
    ]
    return replacements
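To see what the replacement list above does to a serialized line, here is a minimal sketch. It assumes the patterns are applied in order as regular expressions; I use `re.sub` for that, which may not be how the library applies them internally. The helpers `build_replacements` and `clean_line`, the column names, and the sample lines are all made up for illustration.

```python
import re
from itertools import chain


def build_replacements(obj_cols, other_cols):
    # Hypothetical helper: same patterns as _replace, but it takes the
    # column names directly so no DataFrame is needed.
    obj_nans = (f'{k}="nan"' for k in obj_cols)
    other_nans = (f'{k}=nani?' for k in other_cols)
    obj_nas = (f'{k}="<NA>"' for k in obj_cols)
    other_nas = (f'{k}=<NA>i?' for k in other_cols)
    return [
        ('|'.join(chain(obj_nans, other_nans)), ''),
        ('|'.join(chain(obj_nas, other_nas)), ''),
        (',{2,}', ','),
        ('|'.join([', ,', ', ', ' ,']), ' '),
    ]


def clean_line(line, replacements):
    # Assumption: each (pattern, replacement) pair is applied in order.
    for pattern, repl in replacements:
        line = re.sub(pattern, repl, line)
    return line


replacements = build_replacements(obj_cols=['label'], other_cols=['count', 'value'])

# Missing integer field at the start of the field set.
first = clean_line('m,host=a count=<NA>i,label="x",value=1.5 1600000000', replacements)

# Missing string field in the middle of the field set.
middle = clean_line('m,host=a count=1i,label="<NA>",value=2.0 1600000000', replacements)

print(first)
print(middle)
```

The last two tuples matter as much as the new one: once a `<NA>` field is removed, they clean up the doubled or dangling comma left behind.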
Hope this ends up helping someone.