databricks / koalas

Koalas: pandas API on Apache Spark
Apache License 2.0

AttributeError: type object 'InternalFrame' has no attribute 'restore_index' #2157

Closed: Sbargaoui closed this issue 3 years ago

Sbargaoui commented 3 years ago

Describe the bug

Hello, I'm encountering the same issue as in #2156.

Versions

The error appeared after upgrading to koalas 1.8.0.
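
For a fuller version report, something like the following could be run in the affected environment. It is only a minimal sketch: the helper name `installed_versions` and the list of packages are my own choices for illustration, not part of Koalas.

```python
# Sketch: report the installed versions of the packages relevant to this issue.
# The default package list is an assumption; adjust it as needed.
try:
    from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
except ImportError:
    # Python 3.7 needs the importlib_metadata backport instead.
    from importlib_metadata import PackageNotFoundError, version


def installed_versions(packages=("koalas", "pyspark", "pandas", "pyarrow")):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

This only describes the Python environment it runs in. Since the trace below reports an exception thrown from the Python worker, the worker environment may be worth checking separately.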

To Reproduce

I'm applying this simple transformation to a Koalas DataFrame:

import databricks.koalas as ks

# df is created with ks.read_csv
df['col'] = ks.to_datetime(df['col'], format='%Y-%m-%d', errors='coerce')

Because the transformation is lazily evaluated, the error is only raised once an action runs, for example when writing or printing the DataFrame.

Expected behaviour

The code above works after rolling back to koalas 1.5.0.

Trace

An exception was thrown from the Python worker. Please see the stack trace below.
Traceback (most recent call last):
  File "/spark/python/lib/pyspark.zip/pyspark/worker.py", line 605, in main
    process()
  File "/spark/python/lib/pyspark.zip/pyspark/worker.py", line 597, in process
    serializer.dump_stream(out_iter, outfile)
  File "/spark/python/lib/pyspark.zip/pyspark/sql/pandas/serializers.py", line 255, in dump_stream
    return ArrowStreamSerializer.dump_stream(self, init_stream_yield_batches(), stream)
  File "/spark/python/lib/pyspark.zip/pyspark/sql/pandas/serializers.py", line 88, in dump_stream
    for batch in iterator:
  File "/spark/python/lib/pyspark.zip/pyspark/sql/pandas/serializers.py", line 248, in init_stream_yield_batches
    for series in iterator:
  File "/spark/python/lib/pyspark.zip/pyspark/worker.py", line 450, in mapper
    result = tuple(f(*[a[o] for o in arg_offsets]) for (arg_offsets, f) in udfs)
  File "/spark/python/lib/pyspark.zip/pyspark/worker.py", line 450, in <genexpr>
    result = tuple(f(*[a[o] for o in arg_offsets]) for (arg_offsets, f) in udfs)
  File "/spark/python/lib/pyspark.zip/pyspark/worker.py", line 110, in <lambda>
    verify_result_type(f(*a)), len(a[0])), arrow_return_type)
  File "/spark/python/lib/pyspark.zip/pyspark/util.py", line 107, in wrapper
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/databricks/koalas/accessors.py", line 919, in <lambda>
  File "/usr/local/lib/python3.7/site-packages/databricks/koalas/groupby.py", line 1375, in rename_output
AttributeError: type object 'InternalFrame' has no attribute 'restore_index'