Closed skalish closed 4 years ago
I tested casting to object after creation, and it failed on the upsert to Tamr.
This works for me:
df = pd.read_csv("my_file.csv", dtype=object)
dataset.upsert_from_dataframe(df, "my_pk")
This doesn't work:
df = pd.read_csv("my_file.csv")
df = df.astype(object)
dataset.upsert_from_dataframe(df, "my_pk")
I'm not sure what the best solution is though for an existing dataframe.
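One visible difference between the two snippets above is the type of the cell values: `read_csv(..., dtype=object)` leaves every non-missing cell as a string, while `astype(object)` after type inference keeps the inferred numbers. Whether that is what breaks the upsert is an assumption, not something confirmed in this thread. Under that assumption, a minimal sketch for an existing dataframe might look like the following. `stringify_non_null` is a hypothetical helper, not part of the Tamr client, and it has not been tested against Tamr.

```python
import io

import pandas as pd

csv_text = "my_pk,amount\n1,10.5\n2,\n"

# Approach 1: parse every column as object at read time (cells stay strings).
read_as_object = pd.read_csv(io.StringIO(csv_text), dtype=object)

# Approach 2: let pandas infer dtypes, then cast (cells keep numeric types).
cast_after = pd.read_csv(io.StringIO(csv_text)).astype(object)


def stringify_non_null(df):
    """Hypothetical helper: str() every non-null cell, turn nulls into None."""

    def convert(value):
        return None if pd.isna(value) else str(value)

    return pd.DataFrame(
        {name: [convert(v) for v in df[name]] for name in df.columns},
        index=df.index,
        dtype=object,
    )


cleaned = stringify_non_null(cast_after)
print(type(read_as_object.loc[0, "my_pk"]), type(cast_after.loc[0, "my_pk"]))
print(cleaned)
```

One caveat: an integer column that contains missing values is inferred as float, so its values would be stringified as e.g. "2.0" rather than "2". Reading with `dtype=object` up front avoids that.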
The pandas method astype(str) casts all values in a DataFrame to strings, which allows them to be successfully uploaded to Tamr. However, special values like NaN and the Python None will be converted into strings (e.g. "nan" or "None"), introducing non-standard nulls to your Tamr dataset.
An alternative is casting with astype(object), which will preserve these special values. I'm unsure if there is a downside to this, but I think it is probably a better practice overall.
Related to #323, maybe #373
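To see the difference in null handling, here is a small sketch on a made-up frame (the column names and values are illustrative only). It counts how many values pandas still recognizes as null after each cast. The exact result of the string cast may vary across pandas versions, so no expected output is shown.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3], "score": [0.5, np.nan, 2.0]})

as_str = df.astype(str)     # every value cast to a string
as_obj = df.astype(object)  # values keep their type, nulls stay null

# Count the values pandas still treats as missing after each cast.
nulls_after_str = int(as_str["score"].isna().sum())
nulls_after_object = int(as_obj["score"].isna().sum())

print(as_str["score"].tolist())
print(as_obj["score"].tolist())
print(nulls_after_str, nulls_after_object)
```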