In weather-mv, we divide the Dataset into small chunks, which adds appropriate parallelism to the pipeline. As a next step, converting each small chunk into a pandas DataFrame would reduce the cost of generating the flat rows, since extracting rows from a DataFrame is very fast.
df = ds.to_dataframe().reset_index()
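Here, ds is a small chunk of the dataset. to_dataframe() flattens the chunk into a DataFrame indexed by its dimension coordinates, and reset_index() turns those index levels into ordinary columns, so each row is a flat, normalized record.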
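A minimal sketch of the idea, assuming a toy in-memory dataset: the variable name temperature, the coordinate values, and the helper chunk_to_rows are hypothetical and are not taken from weather-mv. It only illustrates how one small chunk could be turned into flat row dictionaries through a DataFrame.

```python
import numpy as np
import xarray as xr

# Hypothetical toy dataset standing in for a real weather file.
ds_full = xr.Dataset(
    {
        "temperature": (
            ("time", "latitude", "longitude"),
            np.arange(24, dtype="float64").reshape(2, 3, 4),
        )
    },
    coords={
        "time": np.array(["2021-01-01", "2021-01-02"], dtype="datetime64[ns]"),
        "latitude": [10.0, 20.0, 30.0],
        "longitude": [0.0, 90.0, 180.0, 270.0],
    },
)


def chunk_to_rows(ds: xr.Dataset) -> list:
    """Flatten one small dataset chunk into a list of row dicts (hypothetical helper)."""
    # to_dataframe() yields one row per coordinate combination;
    # reset_index() moves the coordinates from the index into columns.
    df = ds.to_dataframe().reset_index()
    return df.to_dict(orient="records")


# One small chunk: the first time step and the first two latitudes.
chunk = ds_full.isel(time=slice(0, 1), latitude=slice(0, 2))
rows = chunk_to_rows(chunk)
```

Each element of rows is a dictionary holding the coordinate values and the data variable for one grid point. Whether to_dict is the fastest way to pull rows out of the DataFrame is an open choice; alternatives such as itertuples may be worth measuring.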