Describe the bug
When using axis=1 to concatenate two cudf dataframes, the output does not match the pandas output when the indices do not match. More specifically, if both dataframes have N rows and none of the indices match, the output of pandas.concat has length 2N, while that of cudf.concat has length N. This leads to inconsistent behavior between pandas- and cudf-based Dask dataframes.
Steps/Code to reproduce bug
The problem will occur when random float values are used to define the indices of both dataframes:
In [1]: import pandas as pd
...: import cudf
...:
...: d1 = cudf.datasets.randomdata(3, dtypes={"a":float, "ind":float}).set_index("ind")
...: d2 = cudf.datasets.randomdata(3, dtypes={"b":float, "ind":float}).set_index("ind")
...: pd1 = d1.to_pandas()
...: pd2 = d2.to_pandas()
In [2]: cudf.concat([d1, d2], axis=1)
Out[2]:
a b
ind
0.756803 -0.785512 -0.706212
-0.171331 -0.265047 0.458980
0.548449 0.125229 0.871514
In [3]: pd.concat([pd1, pd2], axis=1)
Out[3]:
a b
ind
-0.935737 NaN 0.458980
-0.612283 NaN 0.871514
-0.203626 NaN -0.706212
-0.171331 -0.265047 NaN
0.548449 0.125229 NaN
0.756803 -0.785512 NaN
Expected behavior
The output of pandas.concat is correct in this case (the output dataframe must have more rows than the input dataframes if the indices do not match).
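For reference, here is a small pandas-only sketch of the expected row counts. It uses hand-picked index values instead of random ones so the result is deterministic; the frames `left`, `right`, and `right_overlap` are made up for illustration and are not taken from the session above. It only demonstrates the pandas semantics and does not exercise cudf.

```python
import pandas as pd

# Hypothetical, hand-picked index values (assumption: not from the original report).
left = pd.DataFrame(
    {"a": [1.0, 2.0, 3.0]}, index=pd.Index([0.1, 0.2, 0.3], name="ind")
)
right = pd.DataFrame(
    {"b": [4.0, 5.0, 6.0]}, index=pd.Index([0.4, 0.5, 0.6], name="ind")
)
# Same column, but one index label (0.3) is shared with `left`.
right_overlap = pd.DataFrame(
    {"b": [4.0, 5.0, 6.0]}, index=pd.Index([0.3, 0.4, 0.5], name="ind")
)

# With axis=1, pandas aligns on the index using an outer join by default,
# so labels missing from one frame are filled with NaN in its columns.
disjoint = pd.concat([left, right], axis=1)
partial = pd.concat([left, right_overlap], axis=1)

print(len(disjoint))  # N + N rows when no labels match
print(len(partial))   # one shared label, so one row fewer
print(disjoint)
```

With N = 3 and no shared labels, the result has 6 rows; with one shared label it has 5. This is the behavior cudf.concat would need to match.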
Environment overview (please complete the following information)
Environment details
Additional context
Note that this comment from #5643 is related.