dwjang closed this issue 6 years ago
We would need an example that uses only pandas. FYI, this is revamped in the forthcoming 0.23.
Is it achievable in 0.20.3?
You can do this in 0.23.0. In prior versions, don't use a pd.Series wrapper.
In [13]: df = pd.DataFrame({'A': [1,2,3,4],
    ...:                    'B': [1,2,3,4],
    ...:                    'C': [1,2,3,4],
    ...:                    'D': [1,2,3,4]},
    ...:                   index=[0, 1, 2, 3])
    ...: df.apply(lambda x: [x["A"], x["B"]], axis=1, result_type='reduce')
Out[13]:
0    [1, 1]
1    [2, 2]
2    [3, 3]
3    [4, 4]
dtype: object
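For comparison, here is a small sketch of how the `result_type` argument (available from pandas 0.23.0) changes what `apply` returns when the function yields a list. The two-column frame and the variable names `reduced` and `expanded` are made up for illustration.

```python
import pandas as pd

# Illustrative frame; only columns A and B are needed here.
df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [1, 2, 3, 4]})

# 'reduce' keeps each returned list as a single object in a Series.
reduced = df.apply(lambda x: [x["A"], x["B"]], axis=1, result_type='reduce')

# 'expand' spreads each returned list across the columns of a DataFrame.
expanded = df.apply(lambda x: [x["A"], x["B"]], axis=1, result_type='expand')

print(type(reduced).__name__, reduced.dtype)
print(type(expanded).__name__, expanded.shape)
```

`result_type='reduce'` is the option that matches the Series-of-lists output shown above.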
I need to preserve the "DenseVector" object, which is the required input type for MLlib. Your suggestion won't work.
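One possible way to keep arbitrary objects intact, independent of how `apply` treats list-like return values, is to fill an object-dtype NumPy array element by element and wrap it in a Series. This is only a sketch: `FakeVector` is a hypothetical stand-in for `DenseVector` so the example runs without PySpark, `rows_to_objects` is an illustrative helper rather than a pandas API, and whether this fits a real MLlib pipeline has not been checked here.

```python
import numpy as np
import pandas as pd


class FakeVector:
    """Hypothetical stand-in for pyspark.ml.linalg.DenseVector."""

    def __init__(self, values):
        self.values = list(values)

    def __len__(self):
        return len(self.values)

    def __getitem__(self, i):
        return self.values[i]


def rows_to_objects(df, func):
    """Apply func to each row and store the results as opaque objects."""
    out = np.empty(len(df), dtype=object)
    for i, (_, row) in enumerate(df.iterrows()):
        # Scalar-index assignment stores the object itself, even if it is list-like.
        out[i] = func(row)
    return pd.Series(out, index=df.index)


df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [1, 2, 3, 4]})
vectors = rows_to_objects(df, lambda r: FakeVector([r["A"], r["B"]]))
print(vectors.map(type).unique())
```

Because the objects are assigned one at a time, pandas never gets a chance to unpack them into columns.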
Problem description
This produces the following in pandas 0.20.0:
but it is different in pandas 0.20.3:
How can I achieve the first behavior in 0.20.3? I am not saying the current behavior is worse than the old one, but I don't see any description of this change in "What's New". I need the old behavior to work with DenseVector in PySpark MLlib. Thanks,