CODAIT / text-extensions-for-pandas

Natural language processing support for Pandas dataframes.
Apache License 2.0

Fix TensorArray aggregations that produce ndarray of objects #138

BryanCutler closed 3 years ago

BryanCutler commented 3 years ago

TensorArray aggregations were producing an ndarray of TensorArrays, one per group. This change makes each aggregation produce a TensorElement as its scalar result, and allows a new TensorArray to be constructed from an ndarray of the resulting TensorElement objects.

Fixes #124
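
For readers unfamiliar with the failure mode described above, here is a rough NumPy-only sketch of the shape of the problem. It does not use TensorArray or TensorElement; the `groups` dictionary and the variable names are hypothetical stand-ins for per-group tensor data, and the stacking step only approximates the idea of building one array from per-group results.

```python
import numpy as np

# Hypothetical stand-in for grouped tensor data: one 2-D block of rows per group key.
groups = {
    "bar": np.array([[3, 4]]),
    "foo": np.array([[1, 2]]),
}

# Aggregate each group to a single 1-D result (the analogue of one scalar element).
sums = [rows.sum(axis=0) for rows in groups.values()]

# Problem shape: a 1-D object ndarray whose elements are themselves arrays.
as_objects = np.empty(len(sums), dtype=object)
for i, s in enumerate(sums):
    as_objects[i] = s

# Desired shape: a single 2-D numeric array built from the per-group results.
stacked = np.stack(list(as_objects))

print(as_objects.dtype, as_objects.shape)
print(stacked.dtype.kind, stacked.shape)
```

The object array has shape `(2,)` and hides the inner dimensions from NumPy, while the stacked array has shape `(2, 2)` and a numeric dtype, which is the kind of result a tensor-valued column needs after a groupby.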

BryanCutler commented 3 years ago

Also checked that the printed output is correct:

In [3]: df = pd.DataFrame({
   ...:     "a": ["foo", "bar"],
   ...:     "b": tp.TensorArray(np.array([[1, 2], [3, 4]]))
   ...: })
   ...: result = df.groupby("a").aggregate({"b": "sum"})
   ...: print(repr(result["b"].array))
   ...:
array([[3, 4],
       [1, 2]])

Will go ahead and merge since the tests are green. cc @frreiss