...
File "/opt/venv/lib64/python3.6/site-packages/pandas/io/pytables.py", line 1042, in put
errors=errors,
File "/opt/venv/lib64/python3.6/site-packages/pandas/io/pytables.py", line 1709, in _write_to_group
data_columns=data_columns,
File "/opt/venv/lib64/python3.6/site-packages/pandas/io/pytables.py", line 4143, in write
data_columns=data_columns,
File "/opt/venv/lib64/python3.6/site-packages/pandas/io/pytables.py", line 3813, in _create_axes
errors=self.errors,
File "/opt/venv/lib64/python3.6/site-packages/pandas/io/pytables.py", line 4800, in _maybe_convert_for_string_atom
for i in range(len(block.shape[0])):
TypeError: object of type 'int' has no len()
Problem description
After the DataFrame is first created, column "a" has object dtype. After putting a float into column "a", I would expect its dtype to change to float64, but it remains object. The problem is that df.loc[0, "a"] is a float when the DataFrame is saved, which causes the error pasted above.
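The dtype behaviour described here can be reproduced without HDF at all. The snippet below is a minimal sketch: the frame and its values are hypothetical stand-ins, not the original code sample. It shows that assigning floats into an object column leaves the dtype as object, and that an explicit infer_objects() call is one way to get float64.

```python
import pandas as pd

# Hypothetical frame: column "a" is created with object dtype.
df = pd.DataFrame({"a": [None, None]}, dtype=object)

# Assign plain Python floats into the object column.
df.loc[0, "a"] = 1.5
df.loc[1, "a"] = 2.5

# The column keeps its object dtype; only the stored values are floats.
print(df["a"].dtype)
print(type(df.loc[0, "a"]))

# An explicit conversion is needed to end up with float64.
converted = df.infer_objects()
print(converted["a"].dtype)
```

If the target dtype is known in advance, df["a"].astype("float64") is an alternative to infer_objects().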
Expected Output
I would expect one of the following:
Implicit conversion of the column to float dtype
Conversion during hdf.put()
A proper exception saying that I am saving a mixed-type column
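As an illustration of the third option, a check along these lines could run before saving. This is only a sketch under assumptions: object_column_kinds and ensure_storable are hypothetical helpers, not part of pandas, and the rule "object columns must hold only strings" is an assumed policy. The only pandas API used for the check is pandas.api.types.infer_dtype.

```python
import pandas as pd
from pandas.api.types import infer_dtype


def object_column_kinds(df):
    """Map each object-dtype column to the kind of values it actually holds."""
    return {
        name: infer_dtype(col, skipna=True)
        for name, col in df.items()
        if col.dtype == object
    }


def ensure_storable(df):
    """Hypothetical pre-save check: reject object columns that are not pure strings."""
    bad = {k: v for k, v in object_column_kinds(df).items() if v != "string"}
    if bad:
        raise TypeError(f"object columns with non-string values: {bad}")
    return df


# Hypothetical example: "a" mixes a float and a string, "b" holds only strings.
mixed = pd.DataFrame({"a": [1.5, "x"], "b": ["p", "q"]}, dtype=object)
print(object_column_kinds(mixed))
```

Calling ensure_storable(mixed) raises a TypeError that names column "a", which is the kind of message I would have expected instead of the len() failure.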
There's a pretty big chance that I am wrong and this is expected behaviour. If that's the case, could you please explain why, or point me to somewhere I can read about it?
[x] I have checked that this issue has not already been reported.
[x] I have confirmed this bug exists on the latest version of pandas.
[ ] (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
This causes the error shown in the traceback above.
Maybe it's linked with this issue #34274
Output of pd.show_versions()